Managing Large Numbers of Small Files with Hadoop Archives

1. Usage
[hadoop@hadoop1 ~]$ hadoop archive -archiveName NAME -p <parent path> <src>* <dest>
2. Use case: create an archive
[hadoop@hadoop1 ~]$ hadoop archive -archiveName io_data.har -p /benchmarks/TestDFSIO/io_data/*  /
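Creating the archive launches a MapReduce job, so the cluster must be able to run one. For comparison, a sketch of the same command in the documented -p <parent> <src> <dest> form (the parent path and destination shown here are only illustrative) would be:
[hadoop@hadoop1 ~]$ hadoop archive -archiveName io_data.har -p /benchmarks/TestDFSIO io_data /benchmarks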
3. Use case: view the archive
[hadoop@hadoop1 ~]$ hadoop fs -ls -R /benchmarks/io_data2.har
-rw-r--r--   3 hadoop supergroup          0 2014-07-14 15:52 /benchmarks/io_data2.har/_SUCCESS
-rw-r--r--   5 hadoop supergroup      91739 2014-07-14 15:52 /benchmarks/io_data2.har/_index
-rw-r--r--   5 hadoop supergroup         60 2014-07-14 15:52 /benchmarks/io_data2.har/_masterindex
-rw-r--r--   3 hadoop supergroup  102400000 2014-07-14 15:52 /benchmarks/io_data2.har/part-0
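The archive itself is just a set of ordinary HDFS files: _index and _masterindex hold the lookup data that maps the original file names to offsets, the part-* files hold the concatenated contents of the archived files, and _SUCCESS is the marker left by the MapReduce job that built the archive.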
4. Use case: access a file (the har:// scheme is required!)
[hadoop@hadoop1 ~]$ hadoop fs -ls -R har:///benchmarks/io_data2.har/test_io_998
-rw-r--r--   3 hadoop supergroup     102400 2014-07-14 11:58 har:///benchmarks/io_data2.har/test_io_998
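Other fs shell commands accept the same scheme; for example (a sketch, assuming the archive and file above exist), the file's contents can be streamed directly:
[hadoop@hadoop1 ~]$ hadoop fs -cat har:///benchmarks/io_data2.har/test_io_998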
5. Use case: access the HAR file system through the Java API

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

// Open a file inside the archive via the har:// scheme, backed by HDFS on hadoop1:9000
String uri = "har://hdfs-hadoop1:9000/benchmarks/io_data2.har/test_io_998";
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(URI.create(uri), conf);
FSDataInputStream in = fs.open(new Path(uri));
IOUtils.copyBytes(in, System.out, 4096, false); // 4 KB buffer; do not close System.out
IOUtils.closeStream(in);
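The fully qualified URI has the form har://<underlying-scheme>-<namenode-host>:<port>/<path to .har>/<path inside archive>; when the archive sits on the default file system, the shorter har:///... form used in the shell examples above is sufficient.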
** I referred to an article when writing this but have forgotten the source; if it was yours, please contact me and I will add the link.