Commands useful for users of a Hadoop cluster.
1. appendToFile
Usage: hdfs dfs -appendToFile <localsrc> … <dst>
Appends a single src, or multiple srcs, from the local file system to the destination file system. Also reads input from stdin and appends to the destination file system.
hdfs dfs -appendToFile localfile /user/hadoop/hadoopfile
hdfs dfs -appendToFile localfile1 localfile2 /user/hadoop/hadoopfile
hdfs dfs -appendToFile localfile hdfs://nn.example.com/hadoop/hadoopfile
hdfs dfs -appendToFile - hdfs://nn.example.com/hadoop/hadoopfile (reads the input from stdin)
Exit Code:
Returns 0 on success and 1 on error.
2. cat
Usage: hdfs dfs -cat URI [URI …]
Copies source paths to stdout.
Example:
hdfs dfs -cat hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2
hdfs dfs -cat file:///file3 /user/hadoop/file4
Exit Code:
Returns 0 on success and -1 on error.
3. chmod
Usage: hdfs dfs -chmod [-R] <MODE[,MODE]… | OCTALMODE> URI [URI …]
Change the permissions of files. With -R, make the change recursively through the directory structure. The user must be the owner of the file, or else a super-user. Additional information is in the Permissions Guide.
Options
The -R option will make the change recursively through the directory structure.
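Example (the modes and paths below are illustrative):
hdfs dfs -chmod 644 /user/hadoop/file1
hdfs dfs -chmod -R 755 /user/hadoop/dir1
hdfs dfs -chmod u+x,g-w /user/hadoop/file1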
4. chown
Usage: hdfs dfs -chown [-R] [OWNER][:[GROUP]] URI [URI …]
Change the owner of files. The user must be a super-user. Additional information is in the Permissions Guide.
Options
The -R option will make the change recursively through the directory structure.
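Example (the owner, group, and paths below are illustrative):
hdfs dfs -chown hadoop /user/hadoop/file1
hdfs dfs -chown -R hadoop:hadoopgroup /user/hadoop/dir1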
5. copyFromLocal
Usage: hdfs dfs -copyFromLocal <localsrc> URI
Similar to the put command, except that the source is restricted to a local file reference.
Options:
The -f option will overwrite the destination if it already exists.
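Example (the paths below are illustrative):
hdfs dfs -copyFromLocal localfile /user/hadoop/hadoopfile
hdfs dfs -copyFromLocal -f localfile /user/hadoop/hadoopfile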
6. copyToLocal
Usage: hdfs dfs -copyToLocal [-ignorecrc] [-crc] URI <localdst>
Similar to the get command, except that the destination is restricted to a local file reference.
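Example (the paths below are illustrative):
hdfs dfs -copyToLocal /user/hadoop/file localfile
hdfs dfs -copyToLocal -ignorecrc /user/hadoop/file localfile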
7. count
Usage: hdfs dfs -count [-q] [-h] <paths>
Count the number of directories, files and bytes under the paths that match the specified file pattern. The output columns with -count are: DIR_COUNT, FILE_COUNT, CONTENT_SIZE, FILE_NAME
The output columns with -count -q are: QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, FILE_NAME
The -h option shows sizes in human readable format.
Example:
hdfs dfs -count hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2
hdfs dfs -count -q hdfs://nn1.example.com/file1
hdfs dfs -count -q -h hdfs://nn1.example.com/file1
Exit Code:
Returns 0 on success and -1 on error.
8. cp
Usage: hdfs dfs -cp [-f] [-p | -p[topax]] URI [URI …] <dest>
Copy files from source to destination. This command allows multiple sources as well in which case the destination must be a directory.
‘raw.*’ namespace extended attributes are preserved if (1) the source and destination filesystems support them (HDFS only), and (2) all source and destination pathnames are in the /.reserved/raw hierarchy. Determination of whether raw.* namespace xattrs are preserved is independent of the -p (preserve) flag.
Options:
The -f option will overwrite the destination if it already exists.
The -p option will preserve file attributes [topax] (timestamps, ownership, permission, ACL, XAttr). If -p is specified with no arg, then it preserves timestamps, ownership, and permission. If -pa is specified, then it preserves permission also, because ACL is a super-set of permission. Determination of whether raw namespace extended attributes are preserved is independent of the -p flag.
Example:
hdfs dfs -cp /user/hadoop/file1 /user/hadoop/file2
hdfs dfs -cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir
Exit Code:
Returns 0 on success and -1 on error.
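The -f and -p options can be combined with the forms above; for instance (the paths below are illustrative):
hdfs dfs -cp -f /user/hadoop/file1 /user/hadoop/file2
hdfs dfs -cp -p /user/hadoop/file1 /user/hadoop/dir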
9. du
Usage: hdfs dfs -du [-s] [-h] URI [URI …]
Displays the sizes of files and directories contained in the given directory, or the length of a file in case it is just a file.
Options:
The -s option will result in an aggregate summary of file lengths being displayed, rather than the individual files.
The -h option will format file sizes in a “human-readable” fashion (e.g., 64.0m instead of 67108864).
Example:
hdfs dfs -du /user/hadoop/dir1 /user/hadoop/file1 hdfs://nn.example.com/user/hadoop/dir1
Exit Code: Returns 0 on success and -1 on error.
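To illustrate the -s and -h options together (the path is illustrative):
hdfs dfs -du -s -h /user/hadoop/dir1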
10. get
Usage: hdfs dfs -get [-ignorecrc] [-crc] <src> <localdst>
Copy files to the local file system. Files that fail the CRC check may be copied with the -ignorecrc option. Files and CRCs may be copied using the -crc option.
Example:
hdfs dfs -get /user/hadoop/file localfile
hdfs dfs -get hdfs://nn.example.com/user/hadoop/file localfile
Exit Code:
Returns 0 on success and -1 on error.
11. ls
Usage: hdfs dfs -ls [-R] <args>
Options:
The -R option will return stat recursively through the directory structure.
For a file returns stat on the file with the following format:
permissions number_of_replicas userid groupid filesize modification_date modification_time filename
For a directory it returns the list of its direct children, as in Unix. A directory is listed as:
permissions userid groupid modification_date modification_time dirname
Example:
hdfs dfs -ls /user/hadoop/file1
Exit Code:
Returns 0 on success and -1 on error.
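To illustrate the -R option (the path is illustrative):
hdfs dfs -ls -R /user/hadoop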
12. lsr
Usage: hdfs dfs -lsr <args>
Recursive version of ls.
Note: This command is deprecated. Instead use hdfs dfs -ls -R
13. mkdir
Usage: hdfs dfs -mkdir [-p] <paths>
Takes path URIs as arguments and creates directories.
Options:
The -p option behavior is much like Unix mkdir -p, creating parent directories along the path.
Example:
hdfs dfs -mkdir /user/hadoop/dir1 /user/hadoop/dir2
hdfs dfs -mkdir hdfs://nn1.example.com/user/hadoop/dir hdfs://nn2.example.com/user/hadoop/dir
Exit Code:
Returns 0 on success and -1 on error.
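To illustrate the -p option, which creates missing parent directories along the path (the path is illustrative):
hdfs dfs -mkdir -p /user/hadoop/dir1/dir2/dir3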
14. moveFromLocal
Usage: hdfs dfs -moveFromLocal <localsrc> <dst>
Similar to the put command, except that the source localsrc is deleted after it is copied.
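Example (the paths below are illustrative):
hdfs dfs -moveFromLocal localfile /user/hadoop/hadoopfile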
15. moveToLocal
Usage: hdfs dfs -moveToLocal [-crc] <src> <dst>
Displays a “Not implemented yet” message.
16. mv
Usage: hdfs dfs -mv URI [URI …] <dest>
Moves files from source to destination. This command allows multiple sources as well in which case the destination needs to be a directory. Moving files across file systems is not permitted.
Example:
hdfs dfs -mv /user/hadoop/file1 /user/hadoop/file2
hdfs dfs -mv hdfs://nn.example.com/file1 hdfs://nn.example.com/file2 hdfs://nn.example.com/file3 hdfs://nn.example.com/dir1
Exit Code:
Returns 0 on success and -1 on error.
17. put
Usage: hdfs dfs -put <localsrc> … <dst>
Copies a single src, or multiple srcs, from the local file system to the destination file system. Also reads input from stdin and writes to the destination file system.
hdfs dfs -put localfile /user/hadoop/hadoopfile
hdfs dfs -put localfile1 localfile2 /user/hadoop/hadoopdir
hdfs dfs -put localfile hdfs://nn.example.com/hadoop/hadoopfile
hdfs dfs -put - hdfs://nn.example.com/hadoop/hadoopfile (reads the input from stdin)
Exit Code:
Returns 0 on success and -1 on error.
18. rm
Usage: hdfs dfs -rm [-f] [-r|-R] [-skipTrash] URI [URI …]
Delete files specified as args.
Options:
The -f option will not display a diagnostic message or modify the exit status to reflect an error if the file does not exist.
The -R option deletes the directory and any content under it recursively.
The -r option is equivalent to -R.
The -skipTrash option will bypass trash, if enabled, and delete the specified file(s) immediately. This can be useful when it is necessary to delete files from an over-quota directory.
Example:
hdfs dfs -rm hdfs://nn.example.com/file /user/hadoop/emptydir
Exit Code:
Returns 0 on success and -1 on error.
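To illustrate the -r and -skipTrash options (the paths below are illustrative):
hdfs dfs -rm -r /user/hadoop/dir1
hdfs dfs -rm -r -skipTrash /user/hadoop/dir2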
19. rmr
Usage: hdfs dfs -rmr [-skipTrash] URI [URI …]
Recursive version of delete.
Note: This command is deprecated. Instead use hdfs dfs -rm -r
20. text
Usage: hdfs dfs -text <src>
Takes a source file and outputs the file in text format. The allowed formats are zip and TextRecordInputStream.
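Example (the path is illustrative; the file is assumed to be in one of the allowed formats):
hdfs dfs -text /user/hadoop/archive.zip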
21. touchz
Usage: hdfs dfs -touchz URI [URI …]
Create a file of zero length.
Example:
hdfs dfs -touchz pathname
Exit Code: Returns 0 on success and -1 on error.
Thanks 🙂