Hadoop HDFS Operations

Starting HDFS

During the initial setup of the configured HDFS file system, log in to the name node (HDFS server) and execute the command below to format it.

$ hadoop namenode -format 

After formatting, you need to start the distributed file system. Use the following command to start the name node as well as the data nodes -

$ start-dfs.sh 
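Once the daemons are up, you can confirm they are running with the JDK's jps tool. The exact process list depends on your configuration; the one below assumes a single-node setup:

```shell
$ jps
# On a typical single-node cluster you would expect to see
# (PIDs will differ):
#   NameNode
#   DataNode
#   SecondaryNameNode
```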

Listing Files in HDFS

To list files in a directory, use the 'ls' command. It accepts a directory path, which lists that directory's contents, or a filename, which reports the status of that single file -

$HADOOP_HOME/bin/hadoop fs -ls <args>
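For example (the paths below are assumptions for illustration; substitute your own):

```shell
# List the contents of an HDFS directory:
$HADOOP_HOME/bin/hadoop fs -ls /user/hadoop

# Check a single file; -ls prints its permissions, replication,
# owner, size, and modification time:
$HADOOP_HOME/bin/hadoop fs -ls /user/hadoop/sample.txt
```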

Create Directory

The mkdir command takes a list of URI paths as arguments and creates the specified directory or directories -

$HADOOP_HOME/bin/hadoop fs -mkdir <paths>
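A sketch of typical usage (example paths are assumptions):

```shell
# Create several directories in one call:
$HADOOP_HOME/bin/hadoop fs -mkdir /user/hadoop/dir1 /user/hadoop/dir2

# The -p option creates missing parent directories along the path,
# like the Unix mkdir -p:
$HADOOP_HOME/bin/hadoop fs -mkdir -p /user/hadoop/a/b/c
```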

Space Utilization in an HDFS Directory

hadoop fs -du: Displays the sizes of the files and directories contained in the given directory, or the size of a file if only a single file is specified -

$HADOOP_HOME/bin/hadoop fs -du URI
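Two useful variants, shown here with an example path (an assumption for illustration):

```shell
# Per-entry sizes under a directory:
$HADOOP_HOME/bin/hadoop fs -du /user/hadoop

# -s sums the whole subtree into one total;
# -h prints sizes in human-readable units (K, M, G):
$HADOOP_HOME/bin/hadoop fs -du -s -h /user/hadoop
```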

Upload files

Copy one or multiple source files from the local file system to the Hadoop Distributed File System -

$HADOOP_HOME/bin/hadoop fs -put <localsrc> ... <HDFS_dest_Path>
  • <localsrc> -- Local source path(s)
  • <HDFS_dest_Path> -- HDFS destination path
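A sketch of typical usage; all file and directory names below are example assumptions:

```shell
# Upload a single file to an HDFS directory:
$HADOOP_HOME/bin/hadoop fs -put /home/user/data.txt /user/hadoop/

# Upload several local files into one HDFS destination directory:
$HADOOP_HOME/bin/hadoop fs -put a.txt b.txt /user/hadoop/input/
```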

Download files

Copies (downloads) files from HDFS to the local file system.

$HADOOP_HOME/bin/hadoop fs -get <hdfs_src> <localdst> 
  • <hdfs_src> -- HDFS source path
  • <localdst> -- Local destination path
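For example (paths are assumptions for illustration):

```shell
# Download a single file from HDFS:
$HADOOP_HOME/bin/hadoop fs -get /user/hadoop/results.txt /home/user/

# Download a whole HDFS directory; its contents are copied recursively:
$HADOOP_HOME/bin/hadoop fs -get /user/hadoop/output/ /home/user/downloads/
```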

Getting Help

Use the help command to get the list of commands supported by the Hadoop Distributed File System (HDFS).

$HADOOP_HOME/bin/hadoop fs -help
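The help command also accepts a command name to show usage for just that command, e.g. -

```shell
# Show usage and options for the ls command only:
$HADOOP_HOME/bin/hadoop fs -help ls
```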

Inserting Data into HDFS

Suppose a file named filename.txt on the local system needs to be uploaded to the HDFS file system. Follow the steps below to upload it to the Hadoop file system -

Step 1: Create an input directory.

$HADOOP_HOME/bin/hadoop fs -mkdir /usr/input

Step 2: Transfer the data file from the local system to the Hadoop file system using the put command -

$HADOOP_HOME/bin/hadoop fs -put /home/filename.txt /usr/input 

Step 3: Verify the file using the ls command -

$HADOOP_HOME/bin/hadoop fs -ls /usr/input

Retrieving Data from HDFS

Suppose a file named "outfile" stored in HDFS needs to be downloaded to the local file system. Below are the steps to do so -

Step 1: First, view the data in HDFS using the cat command.

$HADOOP_HOME/bin/hadoop fs -cat /usr/output/outfile 

Step 2: Get the file from HDFS to the local file system using the get command.

$HADOOP_HOME/bin/hadoop fs -get /usr/output/ /home/download/ 

Shutting Down HDFS

Shut down HDFS by using the following command -

$ stop-dfs.sh