Uploading and Downloading Files with the Hadoop File System
Source: 程序博客网, 2024/06/05
This post shows how to upload a local file to Hadoop and how to download a file from Hadoop to the local file system. While an HDFS path can be read through an InputStream obtained via java.net.URL, the usual approach on a Hadoop cluster is to go through HDFS's own FileSystem abstraction, which requires a Configuration and can be obtained in any of the following ways:
```java
public static FileSystem get(Configuration conf) throws IOException
public static FileSystem get(URI uri, Configuration conf) throws IOException
public static FileSystem get(URI uri, Configuration conf, String user) throws IOException
```

A Configuration instance is built from etc/hadoop/core-site.xml under the Hadoop installation directory.
The following code implements the equivalent of the Linux cat command for HDFS:
```java
import java.io.InputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class FileSystemCat {
    public static void main(String[] args) throws Exception {
        String uri = args[0];
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        InputStream in = null;
        try {
            in = fs.open(new Path(uri));
            // Copy the file's bytes to stdout in 4 KB chunks; do not close the streams here
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }
}
```

Printing the contents of a directory on HDFS:
```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;

public class ListStatus {
    public static void main(String[] args) throws Exception {
        String uri = args[0];
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        Path[] paths = new Path[args.length];
        for (int i = 0; i < paths.length; i++) {
            paths[i] = new Path(args[i]);
        }
        // listStatus accepts an array of paths and returns the combined listing
        FileStatus[] status = fs.listStatus(paths);
        Path[] listedPaths = FileUtil.stat2Paths(status);
        for (Path p : listedPaths) {
            System.out.println(p);
        }
    }
}
```

HDFS supports wildcard (glob) matching through these methods:
```java
public FileStatus[] globStatus(Path pathPattern) throws IOException
public FileStatus[] globStatus(Path pathPattern, PathFilter filter) throws IOException
```

Hadoop's glob syntax is close to shell globbing: * matches zero or more characters, ? matches a single character, [ab] and [a-b] match character sets and ranges, [^ab] negates a set, {a,b} matches alternatives, and \c escapes a metacharacter.
(The original post illustrated this with an example directory tree on the cluster and a table mapping each glob pattern to the paths it matches; both were images and could not be recovered.)
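As a sketch of how globStatus is used (the cluster address and directory layout below are hypothetical, not from the original post):

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class GlobDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical NameNode address; adjust to match your core-site.xml
        FileSystem fs = FileSystem.get(URI.create("hdfs://localhost:9000/"), conf);
        // [5-6] is a character range: this would match e.g.
        // /user/hadoop/2024/05 and /user/hadoop/2024/06 if they exist
        FileStatus[] status = fs.globStatus(new Path("/user/hadoop/2024/0[5-6]"));
        for (FileStatus s : status) {
            System.out.println(s.getPath());
        }
    }
}
```

Note that globStatus returns the matched paths themselves, not their contents; to list what is inside the matched directories you would pass the results on to listStatus.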
The method for deleting data from HDFS:
```java
public boolean delete(Path f, boolean recursive) throws IOException
```

When f is a file or an empty directory, recursive may be false; to delete a non-empty directory, recursive must be true.
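A minimal sketch of both cases (the cluster address and paths are hypothetical):

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class DeleteDemo {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create("hdfs://localhost:9000/"), conf);
        // recursive = false suffices for a single file or an empty directory
        boolean deletedFile = fs.delete(new Path("/user/hadoop/old.txt"), false);
        // recursive = true is required for a non-empty directory;
        // with false, delete throws an IOException instead
        boolean deletedDir = fs.delete(new Path("/user/hadoop/olddir"), true);
        System.out.println(deletedFile + " " + deletedDir);
    }
}
```

The boolean return value reports whether the delete succeeded, so it is worth checking rather than discarding.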
Finally, here is the code for uploading to HDFS and downloading from HDFS:
```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.util.Progressable;

public class UploadAndDown {

    public static void main(String[] args) {
        UploadAndDown uploadAndDown = new UploadAndDown();
        try {
            // Upload the local file local.txt to HDFS as cloud.txt
            uploadAndDown.upLoadToCloud("local.txt", "cloud.txt");
            // Download the HDFS file cloud.txt to the local file cloudTolocal.txt
            uploadAndDown.downFromCloud("cloudTolocal.txt", "cloud.txt");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private void upLoadToCloud(String srcFileName, String cloudFileName)
            throws IOException {
        String LOCAL_SRC = "/home/sina/hbase2/bin/" + srcFileName;
        String CLOUD_DEST = "hdfs://localhost:9000/user/hadoop/" + cloudFileName;
        InputStream in = new BufferedInputStream(new FileInputStream(LOCAL_SRC));
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(CLOUD_DEST), conf);
        // The Progressable callback is invoked periodically as data is written
        OutputStream out = fs.create(new Path(CLOUD_DEST), new Progressable() {
            @Override
            public void progress() {
                System.out.println("upload a file to HDFS");
            }
        });
        // The final argument true closes both streams when the copy finishes
        IOUtils.copyBytes(in, out, 1024, true);
    }

    private void downFromCloud(String dstFileName, String cloudFileName)
            throws IOException {
        String CLOUD_SRC = "hdfs://localhost:9000/user/hadoop/" + cloudFileName;
        String LOCAL_DEST = "/home/hadoop/datasrc/" + dstFileName;
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(CLOUD_SRC), conf);
        FSDataInputStream in = fs.open(new Path(CLOUD_SRC));
        OutputStream out = new FileOutputStream(LOCAL_DEST);
        IOUtils.copyBytes(in, out, 1024, true);
    }
}
```