HFileInputFormat Implementation
Source: Internet  Published: 数据库分布式  Editor: 程序博客网  Time: 2024/06/05 00:41
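HBase stores its data on disk in the HFile file format. HFiles can be used directly as MapReduce input, so an MR job can run over the HFiles themselves. The code is as follows: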
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.io.hfile.CacheConfig;
import org.apache.hadoop.hbase.io.hfile.HFile;
import org.apache.hadoop.hbase.io.hfile.HFileScanner;
import org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

/**
 * This is direct port (hopefully) of the Scala version of this class available
 * on https://gist.github.com/1120311
 *
 * @author yuankang
 */
public class HFileInputFormat extends FileInputFormat<ImmutableBytesWritable, KeyValue> {

    private class HFileRecordReader extends RecordReader<ImmutableBytesWritable, KeyValue> {

        private HFile.Reader reader;
        private final HFileScanner scanner;
        // Result of the initial seekTo(): false means the file has no entries.
        private final boolean hasFirstEntry;
        private boolean started = false;
        private int entryNumber = 0;

        public HFileRecordReader(FileSplit split, Configuration conf) throws IOException {
            SchemaMetrics.configureGlobally(conf);
            final Path path = split.getPath();
            reader = HFile.createReader(FileSystem.get(conf), path, new CacheConfig(conf));
            scanner = reader.getScanner(false, false);
            reader.loadFileInfo(); // This is required or else seekTo throws a NPE
            hasFirstEntry = scanner.seekTo(); // This is required or else scanner.next throws an error
        }

        @Override
        public void close() throws IOException {
            if (reader != null) {
                reader.close();
            }
        }

        /*
         * @Override public boolean next(ImmutableBytesWritable key, KeyValue
         * value) throws IOException { entryNumber++; return scanner.next(); }
         */

        @Override
        public ImmutableBytesWritable getCurrentKey() throws IOException, InterruptedException {
            // The key is the row of the KeyValue the scanner is positioned on.
            return new ImmutableBytesWritable(scanner.getKeyValue().getRow());
        }

        @Override
        public KeyValue getCurrentValue() throws IOException, InterruptedException {
            return scanner.getKeyValue();
        }

        @Override
        public boolean nextKeyValue() throws IOException, InterruptedException {
            boolean hasNext;
            if (!started) {
                // seekTo() already positioned the scanner on the first entry,
                // so the first call must not advance past it.
                started = true;
                hasNext = hasFirstEntry;
            } else {
                hasNext = scanner.next();
            }
            if (hasNext) {
                entryNumber++;
            }
            return hasNext;
        }

        @Override
        public float getProgress() throws IOException, InterruptedException {
            if (reader != null && reader.getEntries() > 0) {
                // Cast to float: int / long would be integer division and stay at 0.
                return (float) entryNumber / reader.getEntries();
            }
            return 1;
        }

        @Override
        public void initialize(InputSplit arg0, TaskAttemptContext arg1)
                throws IOException, InterruptedException {
            // Nothing to do: all setup happens in the constructor.
        }
    }

    @Override
    protected boolean isSplitable(JobContext context, Path filename) {
        // Each HFile is read as a whole by a single mapper.
        return false;
    }

    @Override
    public RecordReader<ImmutableBytesWritable, KeyValue> createRecordReader(InputSplit split,
            TaskAttemptContext context) throws IOException, InterruptedException {
        return new HFileRecordReader((FileSplit) split, context.getConfiguration());
    }
}
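The record reader adapts a positioned scanner (`seekTo()`, then `next()`) to the `nextKeyValue()` / `getCurrentValue()` contract that MapReduce expects. The following is a minimal, dependency-free sketch of that adaptation. It assumes that `seekTo()` leaves the cursor on the first entry and that `next()` advances by one entry. `ListCursor`, `Reader` and `readAll` are hypothetical names made up for this illustration. They are not part of HBase or Hadoop, and an in-memory list of strings stands in for the HFile.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustration only: an in-memory stand-in, not the HBase or Hadoop API. */
class CursorReaderSketch {

    /** Hypothetical cursor imitating the two scanner calls used in the post. */
    static class ListCursor {
        private final List<String> entries;
        private int pos = -1;

        ListCursor(List<String> entries) { this.entries = entries; }

        /** Positions on the first entry; returns false if there is none. */
        boolean seekTo() {
            if (entries.isEmpty()) return false;
            pos = 0;
            return true;
        }

        /** Advances by one entry; returns false when already on the last one. */
        boolean next() {
            if (pos + 1 >= entries.size()) return false;
            pos++;
            return true;
        }

        String current() { return entries.get(pos); }

        long size() { return entries.size(); }
    }

    /** Hypothetical adapter exposing RecordReader-style iteration. */
    static class Reader {
        private final ListCursor cursor;
        private final boolean hasFirst;
        private boolean started = false;
        private int entryNumber = 0;

        Reader(ListCursor cursor) {
            this.cursor = cursor;
            this.hasFirst = cursor.seekTo();
        }

        boolean nextKeyValue() {
            boolean hasNext;
            if (!started) {
                started = true;
                hasNext = hasFirst;      // already on the first entry; do not advance
            } else {
                hasNext = cursor.next();
            }
            if (hasNext) entryNumber++;
            return hasNext;
        }

        String getCurrentValue() { return cursor.current(); }

        float getProgress() {
            long total = cursor.size();
            // Cast before dividing; int / long would truncate to 0 until the end.
            return total == 0 ? 1.0f : (float) entryNumber / total;
        }
    }

    /** Drains a reader the way the MapReduce framework would. */
    static List<String> readAll(List<String> entries) {
        Reader reader = new Reader(new ListCursor(entries));
        List<String> out = new ArrayList<>();
        while (reader.nextKeyValue()) {
            out.add(reader.getCurrentValue());
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [row1, row2, row3]
        System.out.println(readAll(Arrays.asList("row1", "row2", "row3")));
    }
}
```

The point of the sketch is the first call to `nextKeyValue()`: because the cursor is already on an entry after `seekTo()`, advancing it again before the first read would silently drop that entry. Whether the real `HFileScanner` behaves exactly like this stand-in should be checked against the HBase version in use.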