(Repost) Accessing HBase with MapReduce
Source: Internet  Published: 理解信息与数据  Editor: 程序博客网  Time: 2024/06/05 15:43
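Overview:
HBase extends the MapReduce API so that MapReduce jobs can easily read from and write to HBase tables (HTable).
A simple example:
Description: from the access-log table, count the total number of visits each IP made to each website directory.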
    package man.ludq.hbase;

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
    import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
    import org.apache.hadoop.hbase.mapreduce.TableMapper;
    import org.apache.hadoop.hbase.mapreduce.TableReducer;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;

    public class ExampleTotalMapReduce {

        public static void main(String[] args) throws Exception {
            Configuration config = HBaseConfiguration.create();
            // On Hadoop 1.x, use "new Job(config, "ExampleSummary")" instead.
            Job job = Job.getInstance(config, "ExampleSummary");
            job.setJarByClass(ExampleTotalMapReduce.class); // class that contains mapper and reducer

            Scan scan = new Scan();
            scan.setCaching(500);       // rows fetched per RPC; a value of 1 (the old default) is bad for MapReduce jobs
            scan.setCacheBlocks(false); // don't set to true for MR jobs
            // set other scan attributes as needed, e.g.:
            // scan.addColumn(family, qualifier);

            TableMapReduceUtil.initTableMapperJob(
                    "access-log",        // input table
                    scan,                // Scan instance to control CF and attribute selection
                    MyMapper.class,      // mapper class
                    Text.class,          // mapper output key
                    IntWritable.class,   // mapper output value
                    job);
            TableMapReduceUtil.initTableReducerJob(
                    "total-access",        // output table
                    MyTableReducer.class,  // reducer class
                    job);
            job.setNumReduceTasks(1); // at least one, adjust as required

            // Let failures propagate so the process does not exit silently with success.
            if (!job.waitForCompletion(true)) {
                throw new IOException("error with job!");
            }
        }

        /** Emits ("ip&url", 1) for every row of the input table. */
        public static class MyMapper extends TableMapper<Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text text = new Text();

            @Override
            public void map(ImmutableBytesWritable row, Result value, Context context)
                    throws IOException, InterruptedException {
                // The part of the row key before the first "-" is taken as the IP.
                String ip = Bytes.toString(row.get()).split("-")[0];
                byte[] urlBytes = value.getValue(Bytes.toBytes("info"), Bytes.toBytes("url"));
                if (urlBytes == null) {
                    return; // skip rows that have no info:url cell
                }
                // Bytes.toString decodes as UTF-8 instead of the platform default charset.
                String url = Bytes.toString(urlBytes);
                text.set(ip + "&" + url);
                context.write(text, ONE);
            }
        }

        /** Sums the counts for each "ip&url" key and writes the total to info:count. */
        public static class MyTableReducer extends TableReducer<Text, IntWritable, ImmutableBytesWritable> {
            @Override
            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                // Text.getBytes() returns the backing buffer, which may be longer than the
                // valid data (getLength()), so the row key is built from the string value.
                Put put = new Put(Bytes.toBytes(key.toString()));
                // On HBase versions before 1.0, use put.add(family, qualifier, value) instead.
                put.addColumn(Bytes.toBytes("info"), Bytes.toBytes("count"),
                        Bytes.toBytes(String.valueOf(sum)));
                context.write(null, put);
            }
        }
    }
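The counting logic of the job above can be tried out without a cluster. The following is only a minimal sketch in plain Java, not part of the original example: the class name AccessCountSketch, the helper methods, and the sample rows are all hypothetical, and the row-key layout "ip-suffix" is an assumption inferred from the split("-") call in the mapper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the HBase job: no Hadoop or HBase classes involved.
public class AccessCountSketch {

    /** Map step: build the composite key "ip&url" from a row key and its info:url value. */
    public static String mapKey(String rowKey, String url) {
        // Assumption: the row key starts with the IP, followed by "-" and a suffix.
        String ip = rowKey.split("-")[0];
        return ip + "&" + url;
    }

    /** Shuffle and reduce steps: add 1 per emitted key and sum per key. */
    public static Map<String, Integer> countAccesses(List<String[]> rows) {
        Map<String, Integer> totals = new TreeMap<>();
        for (String[] row : rows) {
            totals.merge(mapKey(row[0], row[1]), 1, Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Each entry is {rowKey, url}; the values are made up for illustration.
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"10.0.0.1-1001", "/index"});
        rows.add(new String[] {"10.0.0.1-1002", "/index"});
        rows.add(new String[] {"10.0.0.1-1003", "/docs"});
        rows.add(new String[] {"10.0.0.2-1004", "/index"});

        // prints {10.0.0.1&/docs=1, 10.0.0.1&/index=2, 10.0.0.2&/index=1}
        System.out.println(countAccesses(rows));
    }
}
```

Each key of the resulting map corresponds to a row key in the total-access table, and each value to what the reducer stores (as a string) in info:count.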
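Reference:
1. Operating on HBase with MapReduce (official documentation; covers read, read/write, multi-table output, output to file, output to an RDBMS, and accessing other HBase tables from within a job)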
http://abloz.com/hbase/book.html#mapreduce.example