Hadoop HDFS Operations
Starting HDFS
When HDFS is used for the first time, the configured file system must be formatted. Log on to the name node (HDFS server) and format it with the following command -
$ hadoop namenode -format
After formatting, start the distributed file system. The following command starts the name node as well as the data nodes -
$ start-dfs.sh
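As a quick sanity check (a sketch, not part of the standard procedure), the daemons launched by the script can be listed with the JDK's jps tool; the exact set of process names may vary with your Hadoop version and configuration:

```shell
# List the running Java processes started by start-dfs.sh.
# On a healthy single-node setup you would typically see
# NameNode, DataNode, and SecondaryNameNode among them.
$ jps
```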
Listing Files in HDFS
To list the files in a directory, use the ls command. The following syntax lists all files in a directory, or shows the details of a single file when a filename is passed as the argument -
$HADOOP_HOME/bin/hadoop fs -ls <args>
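For example (the directory name here is illustrative), ls can also walk a directory tree recursively with the -R flag:

```shell
# List everything under /user/alice, including subdirectories.
# /user/alice is an assumed example path.
$HADOOP_HOME/bin/hadoop fs -ls -R /user/alice
```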
Creating a Directory
The mkdir command takes a list of URI paths as arguments and creates the specified directory or directories -
$HADOOP_HOME/bin/hadoop fs -mkdir <paths>
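A minimal sketch with an illustrative path: as with the Unix mkdir, the -p flag creates any missing parent directories along the way -

```shell
# Create a nested directory in one call; -p creates missing parents.
# The path is an assumed example.
$HADOOP_HOME/bin/hadoop fs -mkdir -p /user/alice/projects/input
```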
Space Utilization in an HDFS Directory
hadoop fs -du: Displays the sizes of the files and directories contained in the given directory, or the size of a file when a single file path is specified -
$HADOOP_HOME/bin/hadoop fs -du URI
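For readability (the path below is illustrative), du can be combined with -s to print a single summary line and -h to report sizes in human-readable units:

```shell
# Summarize the total space used by a directory.
# -s prints one summary line; -h uses K/M/G suffixes.
$HADOOP_HOME/bin/hadoop fs -du -s -h /user/alice
```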
Uploading Files
Copy one or more source files from the local file system to the Hadoop Distributed File System -
$HADOOP_HOME/bin/hadoop fs -put <localsrc> ... <HDFS_dest_Path>
- <localsrc> -- Local source path
- <HDFS_dest_Path> -- HDFS destination path
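Because put accepts several sources, multiple local files can be uploaded in one call (file and directory names here are assumed examples):

```shell
# Upload two local files into one HDFS directory in a single command.
# All paths are illustrative.
$HADOOP_HOME/bin/hadoop fs -put /home/alice/a.txt /home/alice/b.txt /user/alice/input
```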
Downloading Files
Copies (downloads) files from HDFS to the local file system -
$HADOOP_HOME/bin/hadoop fs -get <hdfs_src> <localdst>
- <hdfs_src> -- HDFS source path
- <localdst> -- Local destination path
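The get command also works on directories; copyToLocal is an equivalent command whose destination is restricted to the local file system. A sketch with illustrative paths:

```shell
# Download an entire HDFS directory to a local directory.
# copyToLocal behaves like -get but only targets local paths.
$HADOOP_HOME/bin/hadoop fs -copyToLocal /user/alice/input /home/alice/backup
```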
Getting Help
The help command lists the commands supported by the Hadoop Distributed File System (HDFS) -
$HADOOP_HOME/bin/hadoop fs -help
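The same command can also show usage for a single operation rather than the full list, for example:

```shell
# Show detailed usage for the ls command only.
$HADOOP_HOME/bin/hadoop fs -help ls
```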
Inserting Data into HDFS
Suppose a file named filename.txt on the local system has to be uploaded to HDFS. Follow the steps below to place the file in the Hadoop file system -
Step 1: Create an input directory.
$HADOOP_HOME/bin/hadoop fs -mkdir /usr/input
Step 2: Transfer the data file from the local system to the Hadoop file system using the put command -
$HADOOP_HOME/bin/hadoop fs -put /home/filename.txt /usr/input
Step 3: Verify the file using the ls command -
$HADOOP_HOME/bin/hadoop fs -ls /usr/input
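The three steps above can be collected into a small shell script (a sketch; the paths match the example, and error handling is minimal):

```shell
#!/bin/sh
# Upload filename.txt into HDFS and verify that it arrived.
set -e                                              # stop at the first failure
$HADOOP_HOME/bin/hadoop fs -mkdir -p /usr/input     # -p tolerates an existing dir
$HADOOP_HOME/bin/hadoop fs -put /home/filename.txt /usr/input
$HADOOP_HOME/bin/hadoop fs -ls /usr/input           # confirm the upload
```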
Retrieving Data from HDFS
A file named outfile in HDFS has to be downloaded to the local file system. The steps below show how -
Step 1: First, view the data in HDFS using the cat command -
$HADOOP_HOME/bin/hadoop fs -cat /usr/output/outfile
Step 2: Get the file from HDFS to the local file system using the get command -
$HADOOP_HOME/bin/hadoop fs -get /usr/output/ /home/download/
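Likewise, the retrieval steps can be sketched as one short script (paths follow the example above; the local directory is created first so -get has a valid target):

```shell
#!/bin/sh
# Inspect and then download /usr/output/outfile from HDFS.
set -e
mkdir -p /home/download                             # ensure the local target exists
$HADOOP_HOME/bin/hadoop fs -cat /usr/output/outfile # preview the contents
$HADOOP_HOME/bin/hadoop fs -get /usr/output/outfile /home/download/
```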
Shutting Down HDFS
Shut down HDFS with the following command -
$ stop-dfs.sh