Thursday, 17 October 2013

Hadoop Commands

hadoop command [genericOptions] [commandOptions]

hadoop fs
Usage: java FsShell
 [-ls <path>]
 [-lsr <path>]
 [-df [<path>]]
 [-du <path>]
 [-dus <path>]
 [-count[-q] <path>]
 [-mv <src> <dst>]
 [-cp <src> <dst>]
 [-rm [-skipTrash] <path>]
 [-rmr [-skipTrash] <path>]
 [-put <localsrc> ... <dst>]
 [-copyFromLocal <localsrc> ... <dst>]
 [-moveFromLocal <localsrc> ... <dst>]
 [-get [-ignoreCrc] [-crc] <src> <localdst>]
 [-getmerge <src> <localdst> [addnl]]
 [-cat <src>]
 [-text <src>]
 [-copyToLocal [-ignoreCrc] [-crc] <src> <localdst>]
 [-moveToLocal [-crc] <src> <localdst>]
 [-mkdir <path>]
 [-setrep [-R] [-w] <rep> <path/file>]
 [-touchz <path>]
 [-test -[ezd] <path>]
 [-stat [format] <path>]
 [-tail [-f] <file>]
 [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
 [-chown [-R] [OWNER][:[GROUP]] PATH...]
 [-chgrp [-R] GROUP PATH...]
 [-help [cmd]]

hadoop fs -ls /
hadoop fs -ls /test1/
hadoop fs -cat /test1/foo.txt
hadoop fs -rm /test1/foo.txt
hadoop fs -mkdir /test6/
hadoop fs -put foo.txt /test6/
hadoop fs -put /etc/note.txt /test2/note_fs.txt
hadoop fs -get /user/satya/passwd ./
hadoop fs -setrep -R 5 /user/satya/tmp/
hadoop fsck /user/satya/tmp -files -blocks -locations
hadoop fs -chmod 1777 /tmp
hadoop fs -touchz /user/satya/test/foo
hadoop fs -rmr /user/satya/test/foo
hadoop fs -touchz /user/satya/test/bar
hadoop fs -count -q /user/satya
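
The fs commands above can be chained in a small script. The following is a minimal sketch, assuming a POSIX shell: the function name put_if_missing and the DRY_RUN variable are hypothetical and not part of Hadoop. With DRY_RUN=1 (the default here) the commands are only printed, so nothing touches the cluster.

```shell
#!/bin/sh
# Hypothetical helper: upload a local file into an HDFS directory unless the
# destination file already exists. Relies on "hadoop fs -test -e" (path exists)
# and "hadoop fs -test -d" (path is a directory) returning exit status 0 on success.
put_if_missing() {
    local_src=$1
    dst_dir=$2
    dst="${dst_dir%/}/$(basename "$local_src")"

    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run: only show what would be executed.
        echo "hadoop fs -mkdir $dst_dir"
        echo "hadoop fs -put $local_src $dst_dir"
        return 0
    fi

    if hadoop fs -test -e "$dst"; then
        echo "skip: $dst already exists"
        return 0
    fi
    # Create the directory only when it is not there yet.
    hadoop fs -test -d "$dst_dir" || hadoop fs -mkdir "$dst_dir"
    hadoop fs -put "$local_src" "$dst_dir"
}
```

Calling put_if_missing foo.txt /test6/ in dry-run mode prints the same -mkdir and -put commands listed above; setting DRY_RUN=0 runs them against the cluster.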

hadoop -execute

hadoop job -list
hadoop job -kill jobID
hadoop job -list-attempt-ids jobID taskType taskState
hadoop job -kill-task taskAttemptId

hadoop namenode -format

hadoop jar <jar_file> wordcount <input_path> <output_path>
hadoop jar /opt/hadoop/hadoop-examples-1.0.4.jar wordcount <input_path> /out/wc_output

hadoop dfsadmin -report
hadoop dfsadmin -setSpaceQuota 10737418240 /user/esammer
hadoop dfsadmin -refreshNodes
hadoop dfsadmin -upgradeProgress status
hadoop dfsadmin -finalizeUpgrade
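
-setSpaceQuota takes the quota as a number of bytes; the 10737418240 used above is 10 * 1024^3. A small sketch for computing such values in the shell follows. The helper name gb_to_bytes is hypothetical, and it assumes a shell with 64-bit arithmetic.

```shell
# Hypothetical helper: convert whole gigabytes (binary, 1024^3 bytes each) to bytes.
gb_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

gb_to_bytes 10    # prints 10737418240

# Usage with the quota command above (not executed here):
#   hadoop dfsadmin -setSpaceQuota "$(gb_to_bytes 10)" /user/esammer
```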

hadoop fsck
Usage: DFSck <path> [-move | -delete | -openforwrite] [-files [-blocks [-locations | -racks]]]
 <path>  -- start checking from this path
 -move  -- move corrupted files to /lost+found
 -delete  -- delete corrupted files
 -files   -- print out files being checked
 -openforwrite -- print out files opened for write
 -blocks  -- print out block report
 -locations  -- print out locations for every block
 -racks  -- print out network topology for data-node locations
By default fsck ignores files opened for write, use -openforwrite to report such files. They are usually tagged CORRUPT or HEALTHY depending on their block allocation status.
hadoop fsck / -files -blocks -locations
hadoop fsck /user/satya -files -blocks -locations
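
For scripting, the overall fsck result can be pulled out of the report. This is a rough sketch, assuming the report ends with a summary line of the form "The filesystem under path '/' is HEALTHY" (or CORRUPT); the helper name fsck_status is hypothetical.

```shell
# Hypothetical helper: read fsck output on stdin and print HEALTHY, CORRUPT,
# or UNKNOWN, based on the (assumed) summary line
#   The filesystem under path '<path>' is HEALTHY
fsck_status() {
    summary=$(grep "The filesystem under path" | tail -n 1)
    case "$summary" in
        *HEALTHY*) echo HEALTHY ;;
        *CORRUPT*) echo CORRUPT ;;
        *)         echo UNKNOWN ;;
    esac
}

# Usage (not executed here):
#   hadoop fsck /user/satya -files -blocks -locations | fsck_status
```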

hadoop distcp  -- Distributed Copy (distcp)
distcp [OPTIONS] <srcurl>* <desturl>
 -p[rbugp] Preserve status
     r: replication number
     b: block size
     u: user
     g: group
     p: permission
     -p alone is equivalent to -prbugp
 -i Ignore failures
 -log <logdir> Write logs to <logdir>
 -m <num_maps> Maximum number of simultaneous copies
 -overwrite Overwrite destination
 -update Overwrite if src size different from dst size
 -skipcrccheck Do not use CRC check to determine if src is different from dest. Relevant only if -update is specified
 -f <urilist_uri> Use list at <urilist_uri> as src list
 -filelimit <n> Limit the total number of files to be <= n
 -sizelimit <n> Limit the total size to be <= n bytes
 -delete Delete the files existing in the dst but not in src
 -mapredSslConf <f> Filename of SSL configuration for mapper task

NOTE 1: If -overwrite or -update is set, each source URI is interpreted as an isomorphic update to an existing directory.
For example:
hadoop distcp -p -update "hdfs://A:8020/user/foo/bar" "hdfs://B:8020/user/foo/baz"
would update all descendants of 'baz' also in 'bar'; it would *not* update /user/foo/baz/bar
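
The path mapping described in NOTE 1 can be sketched as plain string handling. This is only an illustration of where a file ends up under -update, not how distcp is implemented; the helper name update_target and the file a/part-0 are hypothetical.

```shell
# Hypothetical helper: given the source root, the destination root and one
# file below the source root, print the destination path used with -update.
# The source directory name itself ("bar") is not repeated under the target.
update_target() {
    src_root=${1%/}
    dst_root=${2%/}
    file=$3
    rel=${file#"$src_root"/}    # path of the file relative to the source root
    echo "$dst_root/$rel"
}

update_target hdfs://A:8020/user/foo/bar hdfs://B:8020/user/foo/baz \
    hdfs://A:8020/user/foo/bar/a/part-0
# prints hdfs://B:8020/user/foo/baz/a/part-0
```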

NOTE 2: The parameter <n> in -filelimit and -sizelimit can be given in symbolic form. For example:
1230k = 1230 * 1024 = 1259520
891g = 891 * 1024^3 = 956703965184
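
The same arithmetic can be checked in the shell. This is a minimal sketch with a hypothetical helper name, size_to_bytes; only the k and g suffixes come from the note above, and the m suffix is assumed by analogy. It needs a shell with 64-bit arithmetic.

```shell
# Hypothetical helper: expand a size such as 1230k or 891g into bytes,
# using binary multiples (k = 1024, m = 1024^2, g = 1024^3).
size_to_bytes() {
    num=${1%[kmg]}    # strip one trailing suffix letter, if present
    case "$1" in
        *k) echo $(( num * 1024 )) ;;
        *m) echo $(( num * 1024 * 1024 )) ;;
        *g) echo $(( num * 1024 * 1024 * 1024 )) ;;
        *)  echo "$1" ;;    # no suffix: already a byte count
    esac
}

size_to_bytes 1230k   # prints 1259520
size_to_bytes 891g    # prints 956703965184
```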

hadoop distcp hdfs://A:8020/path/one hdfs://B:8020/path/two
hadoop distcp /path/one /path/two
