Hadoop Series, Part 2: HDFS File System Shell Commands Explained


This article is a translation of the Apache official documentation for the HDFS file system shell commands, targeting hadoop-2.7.1. My ability is limited, so if anything is wrong, corrections are welcome.


Official reference: the Apache Hadoop FileSystem Shell documentation.


Overview

The FileSystem (FS) shell includes various shell-like commands that act directly on the Hadoop Distributed File System (HDFS) as well as other file systems that Hadoop supports, such as Local FS, HFTP FS, S3 FS, and others. The FS shell is invoked by:

bin/hadoop fs <args>

All FS shell commands take path URIs as arguments. The URI format is scheme://authority/path. For HDFS the scheme is hdfs, and for the Local FS the scheme is file. The scheme and authority are optional; if not specified, the default scheme from the configuration is used. An HDFS file or directory such as /parent/child can be specified as hdfs://namenodehost/parent/child or simply as /parent/child (given that the configuration points to hdfs://namenodehost).
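For instance, assuming the configured default file system (fs.defaultFS in core-site.xml) points to hdfs://namenodehost, the following two invocations are equivalent (the path is illustrative):

  • hadoop fs -ls hdfs://namenodehost/parent/child
  • hadoop fs -ls /parent/child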

Most of the commands in the FS shell behave like their corresponding Unix commands; the differences are described with each command. Error information is sent to stderr and normal output is sent to stdout.

If HDFS is being used, hdfs dfs can be used interchangeably with hadoop fs.

 

Commands


appendToFile

Usage: hadoop fs -appendToFile <localsrc> ... <dst>

Appends one or more files from the local file system to the destination file system. Also reads input from stdin and appends it to the destination file system.

Example:

  • hadoop fs -appendToFile localfile /user/hadoop/hadoopfile
  • hadoop fs -appendToFile localfile1 localfile2 /user/hadoop/hadoopfile
  • hadoop fs -appendToFile localfile hdfs://nn.example.com/hadoop/hadoopfile
  • hadoop fs -appendToFile - hdfs://nn.example.com/hadoop/hadoopfile (reads the input from stdin)

Exit code: Returns 0 on success and 1 on error.

 

cat

Usage: hadoop fs -cat URI [URI ...]

Copies source paths to stdout.

Example:

  • hadoop fs -cat hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2
  • hadoop fs -cat file:///file3 /user/hadoop/file4

Exit code: Returns 0 on success and 1 on error.


checksum

Usage: hadoop fs -checksum URI

Returns the checksum information of the file.

Example:

  • hadoop fs -checksum hdfs://nn1.example.com/file1
  • hadoop fs -checksum file:///etc/hosts

chgrp

Usage: hadoop fs -chgrp [-R] GROUP URI [URI ...]

Changes the group association of files. The user must be the owner of the file, or else a super-user.

The -R option makes the change recursively through the directory structure.
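For instance (the group name and paths are hypothetical):

  • hadoop fs -chgrp hadoop /user/hadoop/file1
  • hadoop fs -chgrp -R hadoop /user/hadoop/dir1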

 

chmod

Usage: hadoop fs -chmod [-R] <MODE[,MODE]... | OCTALMODE> URI [URI ...]

Changes the permissions of files. The -R option makes the change recursively through the directory structure.

The user must be the owner of the file, or else a super-user.
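For instance (mode values and paths are hypothetical):

  • hadoop fs -chmod 755 /user/hadoop/file1
  • hadoop fs -chmod -R u+rwx,g+rx,o-rx /user/hadoop/dir1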


chown

Usage: hadoop fs -chown [-R] [OWNER][:[GROUP]] URI [URI ...]

Changes the owner of files. The -R option makes the change recursively through the directory structure. The user must be a super-user.
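For instance (owner, group, and path are hypothetical):

  • hadoop fs -chown -R hadoop:hadoop /user/hadoop/dir1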

 

copyFromLocal

Usage: hadoop fs -copyFromLocal <localsrc> URI

Similar to the put command, except that the source is restricted to a local file reference.

The -f option overwrites the destination if it already exists.
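For instance (paths are illustrative):

  • hadoop fs -copyFromLocal localfile /user/hadoop/hadoopfile
  • hadoop fs -copyFromLocal -f localfile /user/hadoop/hadoopfile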


count

Usage: hadoop fs -count [-q] [-h] [-v] <paths>

Counts the number of directories, files, and bytes under the paths that match the specified file pattern. The output columns are: DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME.

The output columns with -count -q are: QUOTA, REMAINING_QUOTA, SPACE_QUOTA, REMAINING_SPACE_QUOTA, DIR_COUNT, FILE_COUNT, CONTENT_SIZE, PATHNAME.

The -h option shows sizes in a human-readable format.

The -v option displays a header line.

Examples:

  • hadoop fs -count hdfs://nn1.example.com/file1 hdfs://nn2.example.com/file2
  • hadoop fs -count -q hdfs://nn1.example.com/file1
  • hadoop fs -count -q -h hdfs://nn1.example.com/file1
  • hdfs dfs -count -q -h -v hdfs://nn1.example.com/file1

Exit code: Returns 0 on success and 1 on error.


cp

Usage: hadoop fs -cp [-f] [-p | -p[topax]] URI [URI ...] <dest>

Copies files from source to destination. This command allows multiple sources, in which case the destination must be a directory.

'raw.*' namespace extended attributes are preserved if (1) the source and destination file systems support them (HDFS only), and (2) all source and destination pathnames are in the /.reserved/raw hierarchy. Whether 'raw.*' namespace xattrs are preserved is independent of the -p (preserve) flag.

Options:

  • -f: Overwrites the destination if it already exists.
  • -p: Preserves file attributes (timestamps, ownership, permission, ACL, XAttr). If -p is specified with no argument, it preserves timestamps, ownership, and permission. If -pa is specified, permission is also preserved, because ACL is a super-set of permission. Whether raw namespace extended attributes are preserved is independent of the -p flag.

Example:

  • hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2
  • hadoop fs -cp /user/hadoop/file1 /user/hadoop/file2 /user/hadoop/dir

Exit code: Returns 0 on success and 1 on error.

 

 

createSnapshot

See HDFS Snapshots Guide.

 

deleteSnapshot

See HDFS Snapshots Guide.

 

df

Usage: hadoop fs -df [-h] URI [URI ...]

Displays free space.

The -h option formats sizes in a human-readable fashion.

Example:

  • hadoop dfs -df /user/hadoop/dir1

 

du

Usage: hadoop fs -du [-s] [-h] URI [URI ...]

Displays the sizes of the files and directories contained in the given directory, or the length of a file in case it is just a file.

Options:

  • The -s option results in an aggregate summary of file lengths being displayed, rather than the individual files.
  • The -h option formats file sizes in a "human-readable" fashion (e.g. 64.0m instead of 67108864).

Example:

  • hadoop fs -du /user/hadoop/dir1 /user/hadoop/file1 hdfs://nn.example.com/user/hadoop/dir1

Exit code: Returns 0 on success and 1 on error.

 

dus

Usage: hadoop fs -dus <args>

Displays a summary of file lengths.

Note: This command is deprecated. Instead use hadoop fs -du -s.

 

expunge

Usage: hadoop fs -expunge

Empties the Trash. Refer to the HDFS Architecture Guide for more information on the Trash feature.
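The command takes no arguments:

  • hadoop fs -expunge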

 

find

Usage: hadoop fs -find <path> ... <expression> ...

Finds all files that match the specified expression and applies selected actions to them. If no path is specified, it defaults to the current working directory. If no expression is specified, it defaults to -print.

The following primary expressions are recognised:

  • -name pattern
    -iname pattern

    Evaluates as true if the basename of the file matches the pattern using standard file system globbing. If -iname is used, the match is case insensitive.

  • -print
    -print0

    Always evaluates to true. Causes the current pathname to be written to standard output. If the -print0 expression is used, an ASCII NULL character is appended.

The following operators are recognised:

  • expression -a expression
    expression -and expression
    expression expression

    Logical AND operator for joining two expressions. Returns true if both child expressions return true. It is implied by the juxtaposition of two expressions and so does not need to be explicitly specified. The second expression is not applied if the first fails.

Example:

hadoop fs -find / -name test -print

Exit code: Returns 0 on success and 1 on error.

 

get

Usage: hadoop fs -get [-ignorecrc] [-crc] <src> <localdst>

Copies files to the local file system. Files that fail the CRC check may be copied with the -ignorecrc option. Files and CRCs may be copied using the -crc option.

Example:

  • hadoop fs -get /user/hadoop/file localfile
  • hadoop fs -get hdfs://nn.example.com/user/hadoop/file localfile

Exit code: Returns 0 on success and 1 on error.


getfattr

Usage: hadoop fs -getfattr [-R] -n name | -d [-e en] <path>

Displays the extended attribute names and values (if any) for a file or directory.

Options:

  • -R: Recursively list the attributes for all files and directories.
  • -n name: Dump the named extended attribute value.
  • -d: Dump all extended attribute values associated with pathname.
  • -e encoding: Encode values after retrieving them. Valid encodings are “text”, “hex”, and “base64”. Values encoded as text strings are enclosed in double quotes ("), and values encoded as hexadecimal and base64 are prefixed with 0x and 0s, respectively.
  • path: The file or directory.


Examples:

  • hadoop fs -getfattr -d /file
  • hadoop fs -getfattr -R -n user.myAttr /dir

Exit code: Returns 0 on success and 1 on error.

 

getmerge

Usage: hadoop fs -getmerge <src> <localdst> [addnl]

Takes a source directory and a destination file as input and concatenates the files in src into the destination local file. Optionally, addnl can be set to enable adding a newline character at the end of each file.
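An illustrative invocation (paths are hypothetical; the literal addnl argument enables the trailing-newline behavior):

  • hadoop fs -getmerge /user/hadoop/dir1 ./merged.txt addnl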

 

help

Usage: hadoop fs -help

Returns usage output.

 

ls

Usage: hadoop fs -ls [-d] [-h] [-R] [-t] [-S] [-r] [-u] <args>

Options:

  • -d: Directories are listed as plain files.
  • -h: Format file sizes in a human-readable fashion (eg 64.0m instead of 67108864).
  • -R: Recursively list subdirectories encountered.
  • -t: Sort output by modification time (most recent first).
  • -S: Sort output by file size.
  • -r: Reverse the sort order.
  • -u: Use access time rather than modification time for display and sorting.

For a file, ls returns stat on the file with the following format:

permissions number_of_replicas userid groupid filesize modification_date modification_time filename

For a directory, it returns a list of its direct children, as in Unix. A directory is listed as:

permissions userid groupid modification_date modification_time dirname

Files within a directory are ordered by filename by default.

Example:

  • hadoop fs -ls /user/hadoop/file1

Exit code: Returns 0 on success and 1 on error.

 

lsr

Usage: hadoop fs -lsr <args>

Recursive version of ls.

Note: This command is deprecated. Instead use hadoop fs -ls -R.

 

mkdir

Usage: hadoop fs -mkdir [-p] <paths>

Takes path URIs as arguments and creates directories.

Options:

  • The -p option behaves much like Unix mkdir -p, creating parent directories along the path.

Example:

  • hadoop fs -mkdir /user/hadoop/dir1 /user/hadoop/dir2
  • hadoop fs -mkdir hdfs://nn1.example.com/user/hadoop/dir hdfs://nn2.example.com/user/hadoop/dir

Exit code: Returns 0 on success and 1 on error.

 

 

moveFromLocal

Usage: hadoop fs -moveFromLocal <localsrc> <dst>

Similar to the put command, except that the source localsrc is deleted after it is copied.
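An illustrative invocation (paths are hypothetical):

  • hadoop fs -moveFromLocal localfile /user/hadoop/hadoopfile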

 

moveToLocal

Usage: hadoop fs -moveToLocal [-crc] <src> <dst>

Displays a "Not implemented yet" message.

 

 

mv

Usage: hadoop fs -mv URI [URI ...] <dest>

Moves files from source to destination. This command allows multiple sources, in which case the destination needs to be a directory. Moving files across file systems is not permitted.

Example:

  • hadoop fs -mv /user/hadoop/file1 /user/hadoop/file2
  • hadoop fs -mv hdfs://nn.example.com/file1 hdfs://nn.example.com/file2 hdfs://nn.example.com/file3 hdfs://nn.example.com/dir1

Exit code: Returns 0 on success and 1 on error.

 

put

Usage: hadoop fs -put <localsrc> ... <dst>

Copies single src, or multiple srcs, from the local file system to the destination file system. Also reads input from stdin and writes to the destination file system.

  • hadoop fs -put localfile /user/hadoop/hadoopfile
  • hadoop fs -put localfile1 localfile2 /user/hadoop/hadoopdir
  • hadoop fs -put localfile hdfs://nn.example.com/hadoop/hadoopfile
  • hadoop fs -put - hdfs://nn.example.com/hadoop/hadoopfile (reads the input from stdin)

Exit code: Returns 0 on success and 1 on error.

 

 

renameSnapshot

See HDFS Snapshots Guide.

 

rm

Usage: hadoop fs -rm [-f] [-r |-R] [-skipTrash] URI [URI ...]

Deletes the files specified as args.

Options:

  • The -f option will not display a diagnostic message or modify the exit status to reflect an error if the file does not exist.
  • The -R option deletes the directory and any content under it recursively.
  • The -r option is equivalent to -R.
  • The -skipTrash option will bypass trash, if enabled, and delete the specified file(s) immediately. This can be useful when it is necessary to delete files from an over-quota directory.

Example:

  • hadoop fs -rm hdfs://nn.example.com/file /user/hadoop/emptydir

Exit code: Returns 0 on success and 1 on error.

 

 

rmdir

Usage: hadoop fs -rmdir [--ignore-fail-on-non-empty] URI [URI ...]

Deletes a directory.

Options:

  • --ignore-fail-on-non-empty: When using wildcards, do not fail if a directory still contains files.

Example:

  • hadoop fs -rmdir /user/hadoop/emptydir

 

rmr

Usage: hadoop fs -rmr [-skipTrash] URI [URI ...]

Recursive version of delete.

Note: This command is deprecated. Instead use hadoop fs -rm -r.

 

setfacl

Usage: hadoop fs -setfacl [-R] [-b | -k | -m | -x <acl_spec> <path>] | [--set <acl_spec> <path>]

Sets Access Control Lists (ACLs) of files and directories.

Options:

  • -b: Remove all but the base ACL entries. The entries for user, group and others are retained for compatibility with permission bits.
  • -k: Remove the default ACL.
  • -R: Apply operations to all files and directories recursively.
  • -m: Modify ACL. New entries are added to the ACL, and existing entries are retained.
  • -x: Remove specified ACL entries. Other ACL entries are retained.
  • --set: Fully replace the ACL, discarding all existing entries. The acl_spec must include entries for user, group, and others for compatibility with permission bits.
  • acl_spec: Comma separated list of ACL entries.
  • path: File or directory to modify.


Examples:

  • hadoop fs -setfacl -m user:hadoop:rw- /file
  • hadoop fs -setfacl -x user:hadoop /file
  • hadoop fs -setfacl -b /file
  • hadoop fs -setfacl -k /dir
  • hadoop fs -setfacl --set user::rw-,user:hadoop:rw-,group::r--,other::r-- /file
  • hadoop fs -setfacl -R -m user:hadoop:r-x /dir
  • hadoop fs -setfacl -m default:user:hadoop:r-x /dir

Exit code: Returns 0 on success and 1 on error.

 

setfattr

Usage: hadoop fs -setfattr -n name [-v value] | -x name <path>

Sets an extended attribute name and value for a file or directory.

Options:

  • -n name: The extended attribute name.
  • -v value: The extended attribute value. There are three different encoding methods for the value. If the argument is enclosed in double quotes, then the value is the string inside the quotes. If the argument is prefixed with 0x or 0X, then it is taken as a hexadecimal number. If the argument begins with 0s or 0S, then it is taken as a base64 encoding.
  • -x name: Remove the extended attribute.
  • path: The file or directory.

Examples:

  • hadoop fs -setfattr -n user.myAttr -v myValue /file
  • hadoop fs -setfattr -n user.noValue /file
  • hadoop fs -setfattr -x user.myAttr /file

Exit code: Returns 0 on success and 1 on error.

 

setrep

Usage: hadoop fs -setrep [-R] [-w] <numReplicas> <path>

Changes the replication factor of a file. If path is a directory, the command recursively changes the replication factor of all files under the directory tree rooted at path.

Options:

  • The -w flag requests that the command wait for the replication to complete. This can potentially take a very long time.
  • The -R flag is accepted for backwards compatibility. It has no effect.

Example:

  • hadoop fs -setrep -w 3 /user/hadoop/dir1

Exit code: Returns 0 on success and 1 on error.

 

stat

Usage: hadoop fs -stat [format] <path> ...

Prints statistics about the file/directory at <path> in the specified format. Format accepts filesize in blocks (%b), type (%F), group name of owner (%g), name (%n), block size (%o), replication (%r), user name of owner (%u), and modification date (%y, %Y). %y shows UTC date as "yyyy-MM-dd HH:mm:ss" and %Y shows milliseconds since January 1, 1970 UTC. If the format is not specified, %y is used by default.

Example:

  • hadoop fs -stat "%F %u:%g %b %y %n" /file

Exit code: Returns 0 on success and 1 on error.

tail

Usage: hadoop fs -tail [-f] URI

Displays the last kilobyte of the file to stdout.

Options:

  • The -f option will output appended data as the file grows, as in Unix.

Example:

  • hadoop fs -tail pathname

Exit code: Returns 0 on success and 1 on error.

 

 

test

Usage: hadoop fs -test -[defsz] URI

Options:

  • -d: if the path is a directory, return 0.
  • -e: if the path exists, return 0.
  • -f: if the path is a file, return 0.
  • -s: if the path is not empty, return 0.
  • -z: if the file is zero length, return 0.

Example:

  • hadoop fs -test -e filename

 

text

Usage: hadoop fs -text <src>

Takes a source file and outputs the file in text format. The allowed formats are zip and TextRecordInputStream.
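An illustrative invocation (the path is hypothetical, e.g. a SequenceFile that can be rendered via TextRecordInputStream):

  • hadoop fs -text /user/hadoop/file.seq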

 

 

touchz

Usage: hadoop fs -touchz URI [URI ...]

Creates a file of zero length.

Example:

  • hadoop fs -touchz pathname

Exit code: Returns 0 on success and 1 on error.

 

truncate

Usage: hadoop fs -truncate [-w] <length> <paths>

Truncates all files that match the specified file pattern to the specified length.

Options:

  • The -w flag requests that the command waits for block recovery to complete, if necessary. Without -w flag the file may remain unclosed for some time while the recovery is in progress. During this time file cannot be reopened for append.

Example:

  • hadoop fs -truncate 55 /user/hadoop/file1 /user/hadoop/file2
  • hadoop fs -truncate -w 127 hdfs://nn1.example.com/user/hadoop/file1

 

 

usage

Usage: hadoop fs -usage command

Returns the help for an individual command.
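For instance, to print the help for the ls command:

  • hadoop fs -usage ls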
