The difference between -put and -copyFromLocal in Hadoop
Source: Internet. Editor: 程序博客网. Time: 2024/05/17 18:46
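See the Stack Overflow link below.
In short, -put is the more permissive command: it can copy a file from either the local filesystem or HDFS into HDFS. -copyFromLocal is stricter and only copies local files into HDFS.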
PS: "put would prefer the HDFS scheme instead of the local file system". In other words, if the same path exists both locally and on HDFS, -put tends to take the HDFS source first.
But I tested it:
hadoop fs -put hdfs:///tmp/hive-XXX/test.txt /user/XXX/test.txt.hdfs
hadoop fs -put /tmp/hive-XXX/test.txt /user/XXX/test.txt.local
hadoop fs -cat /user/XXX/test.txt.*
local path:/tmp/hive-XXX
local path:/tmp/hive-XXX
So...
Link: http://stackoverflow.com/questions/7811284/difference-between-hadoop-fs-put-and-hadoop-fs-copyfromlocal
——————————————————————————————————————————————
Difference between hadoop fs -put and hadoop fs -copyFromLocal
-put and -copyFromLocal are documented as identical, while most examples use the verbose variant -copyFromLocal. Why?
Same thing for -get and -copyToLocal.
2 Answers
- copyFromLocal is similar to the put command, except that the source is restricted to a local file reference.
So, basically, you can do with put everything that you do with copyFromLocal, but not vice versa.
Similarly,
- copyToLocal is similar to the get command, except that the destination is restricted to a local file reference.
Hence, you can use get instead of copyToLocal, but not the other way round.
Reference: Hadoop's documentation.
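To make "restricted to a local file reference" concrete, here is a minimal shell sketch. The helper is_local_ref is hypothetical and only imitates the documented restriction by looking at the URI scheme; it is not how Hadoop itself performs the check.

```shell
# Hypothetical helper: succeed only when the argument is a local file
# reference, i.e. it has no URI scheme or uses the file: scheme.
# This imitates the documented restriction; it is not Hadoop's own check.
is_local_ref() {
  case "$1" in
    file:*) return 0 ;;   # explicit local scheme
    *://*)  return 1 ;;   # any other scheme, for example hdfs://
    *)      return 0 ;;   # no scheme at all
  esac
}

for src in /tmp/dir/abc.txt file:///tmp/dir/abc.txt hdfs:///tmp/dir/abc.txt; do
  if is_local_ref "$src"; then
    echo "$src: local reference"
  else
    echo "$src: not a local reference"
  fi
done
```

Under this reading, the first two sources would be acceptable to both commands, while the hdfs:// source is the kind of argument only put is described as accepting.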
Let's take an example: if your HDFS contains the path /tmp/dir/abc.txt and your local disk also contains this path, then the HDFS API won't know which one you mean unless you specify a scheme like file:// or hdfs://. It may pick the path you did not want to copy.
That is why you have -copyFromLocal, which prevents you from accidentally copying the wrong file by limiting the parameter you give to the local filesystem.
Put is for more advanced users who know which scheme to put in front.
It is always a bit confusing to new Hadoop users which filesystem they are currently in and where their files actually are.
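One way to avoid the ambiguity described in this answer is to always spell out the scheme on both sides. The sketch below is only an illustration: to_file_uri is a hypothetical helper, it does no percent-encoding of special characters, and the hdfs:/// destination assumes the cluster's default filesystem is HDFS. The command is printed rather than executed so the snippet runs without a Hadoop client.

```shell
# Hypothetical helper: turn a local path into an explicit file:// URI so the
# source filesystem is stated rather than left to path resolution.
to_file_uri() {
  case "$1" in
    /*) printf 'file://%s\n' "$1" ;;            # already absolute
    *)  printf 'file://%s/%s\n' "$PWD" "$1" ;;  # make relative paths absolute
  esac
}

src=$(to_file_uri /tmp/dir/abc.txt)

# Dry run: print the command instead of running it, since a Hadoop client
# may not be installed. Remove "echo" to run it against a real cluster.
echo hadoop fs -put "$src" hdfs:///tmp/dir/abc.txt
```

With both schemes written out, it no longer matters whether /tmp/dir/abc.txt happens to exist in both filesystems.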
With
bin/hadoop fs -put /tmp/somepath /user/hadoop/somepath
the command actually does not know whether /tmp/somepath exists in both filesystems, or just in the local filesystem. Same thing with the destination path. – Thomas Jungblut Oct 18 '11 at 17:58
You can put from one HDFS to another if you'd like. -copyFromLocal will ensure that it just picks from the local disk and uploads to HDFS. – Thomas Jungblut Oct 18 '11 at 17:58