Accessing HBase via MapReduce
Overview:
HBase extends the MapReduce API so that MapReduce jobs can conveniently read and write HTable data.
A simple example:
Task: from a log table, count the total number of times each IP accessed each site path.
package man.ludq.hbase;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class ExampleTotalMapReduce {

    public static void main(String[] args) {
        try {
            Configuration config = HBaseConfiguration.create();
            Job job = new Job(config, "ExampleSummary");
            job.setJarByClass(ExampleTotalMapReduce.class); // class that contains mapper and reducer

            Scan scan = new Scan();
            scan.setCaching(500);       // 1 is the default in Scan, which is bad for MapReduce jobs
            scan.setCacheBlocks(false); // don't set to true for MR jobs
            // set other scan attributes here if needed:
            // scan.addColumn(family, qualifier);

            TableMapReduceUtil.initTableMapperJob(
                    "access-log",      // input table
                    scan,              // Scan instance to control CF and attribute selection
                    MyMapper.class,    // mapper class
                    Text.class,        // mapper output key
                    IntWritable.class, // mapper output value
                    job);
            TableMapReduceUtil.initTableReducerJob(
                    "total-access",       // output table
                    MyTableReducer.class, // reducer class
                    job);
            job.setNumReduceTasks(1); // at least one, adjust as required

            boolean b = job.waitForCompletion(true);
            if (!b) {
                throw new IOException("error with job!");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static class MyMapper extends TableMapper<Text, IntWritable> {

        private final IntWritable ONE = new IntWritable(1);
        private Text text = new Text();

        @Override
        public void map(ImmutableBytesWritable row, Result value, Context context)
                throws IOException, InterruptedException {
            // Row keys look like "<ip>-<timestamp>"; keep the IP part.
            // Use offset/length: the backing array may hold more than the key.
            String ip = Bytes.toString(row.get(), row.getOffset(), row.getLength()).split("-")[0];
            String url = Bytes.toString(value.getValue(Bytes.toBytes("info"), Bytes.toBytes("url")));
            text.set(ip + "&" + url);
            context.write(text, ONE);
        }
    }

    public static class MyTableReducer extends TableReducer<Text, IntWritable, ImmutableBytesWritable> {

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            // Text.getBytes() returns a padded backing array; convert via toString()
            // so the Put row key contains only the valid bytes.
            Put put = new Put(Bytes.toBytes(key.toString()));
            put.add(Bytes.toBytes("info"), Bytes.toBytes("count"), Bytes.toBytes(String.valueOf(sum)));
            context.write(null, put);
        }
    }
}
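The core logic of the job can be tried without a cluster. The sketch below (plain Java, no HBase or Hadoop dependencies; the class name, sample row keys, and URLs are made up for illustration) mimics what the mapper and reducer do: build the composite key "ip&url" from a row key of the form "<ip>-<timestamp>", then sum the 1s emitted per key.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TotalAccessDemo {

    // Mirrors MyMapper: row key "<ip>-<timestamp>" plus the info:url column
    // value become the composite key "<ip>&<url>".
    static String compositeKey(String rowKey, String url) {
        String ip = rowKey.split("-")[0];
        return ip + "&" + url;
    }

    public static void main(String[] args) {
        // Hypothetical log rows: { rowKey, url }
        String[][] rows = {
            {"10.0.0.1-1001", "/index.html"},
            {"10.0.0.1-1002", "/index.html"},
            {"10.0.0.2-1003", "/index.html"},
            {"10.0.0.1-1004", "/about.html"},
        };

        // Mirrors MyTableReducer: sum the ONE values emitted per composite key.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String[] row : rows) {
            counts.merge(compositeKey(row[0], row[1]), 1, Integer::sum);
        }
        System.out.println(counts);
        // prints {10.0.0.1&/index.html=2, 10.0.0.2&/index.html=1, 10.0.0.1&/about.html=1}
    }
}
```

In the real job the shuffle phase groups the mapper output by key, so the reducer sees each composite key with all of its 1s; the in-memory map plays that role here.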