The built-in WordCount implementation in MapReduce
Source: Internet · Editor: 程序博客网 · Date: 2024/06/10 13:46
package com.bruce.mapreduce;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // step 1: Map class
    /**
     * Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>
     */
    public static class WordCountMapper extends
            Mapper<LongWritable, Text, Text, IntWritable> {

        private Text mapOutputKey = new Text();
        private final static IntWritable mapOutputValue = new IntWritable(1);

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // the current input line
            String lineValue = value.toString();
            // split the line on whitespace
            StringTokenizer stringTokenizer = new StringTokenizer(lineValue);
            // emit (word, 1) for every token
            while (stringTokenizer.hasMoreTokens()) {
                String wordValue = stringTokenizer.nextToken();
                mapOutputKey.set(wordValue);
                context.write(mapOutputKey, mapOutputValue);
            }
        }
    }

    // step 2: Reduce class
    /**
     * Reducer<KEYIN, VALUEIN, KEYOUT, VALUEOUT>
     */
    public static class WordCountReducer extends
            Reducer<Text, IntWritable, Text, IntWritable> {

        private IntWritable outputValue = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values,
                Context context) throws IOException, InterruptedException {
            // sum the counts emitted for this word
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            outputValue.set(sum);
            context.write(key, outputValue);
        }
    }

    // step 3: Driver, which assembles and submits the job
    public int run(String[] args) throws Exception {
        // 1: get the configuration
        Configuration configuration = new Configuration();

        // 2: create the job
        Job job = Job.getInstance(configuration, this.getClass().getSimpleName());
        // the jar that contains this class
        job.setJarByClass(this.getClass());

        // 3: wire up the job: input -> map -> reduce -> output
        // 3.1: input
        Path inPath = new Path(args[0]);
        FileInputFormat.addInputPath(job, inPath);

        // 3.2: map
        job.setMapperClass(WordCountMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        // 3.3: reduce
        job.setReducerClass(WordCountReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // 3.4: output
        Path outPath = new Path(args[1]);
        FileOutputFormat.setOutputPath(job, outPath);

        // 4: submit the job and wait for it to finish
        boolean isSuccess = job.waitForCompletion(true);
        return isSuccess ? 0 : 1;
    }

    // step 4: entry point
    public static void main(String[] args) throws Exception {
        int status = new WordCount().run(args);
        System.exit(status);
    }
}
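The heart of the map step is the whitespace split done by java.util.StringTokenizer, which uses space, tab, newline, carriage return, and form feed as default delimiters and skips runs of them. A minimal standalone sketch of just that step (the TokenizeDemo class and tokenize method are illustrative names, not part of the Hadoop job above):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class TokenizeDemo {
    // Collect the whitespace-separated tokens of a line, exactly as the
    // mapper's while(hasMoreTokens()) loop would see them.
    public static List<String> tokenize(String line) {
        List<String> words = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line);
        while (st.hasMoreTokens()) {
            words.add(st.nextToken());
        }
        return words;
    }

    public static void main(String[] args) {
        // Repeated words appear once per occurrence; the reducer is what
        // later sums them into counts.
        System.out.println(tokenize("hello hadoop hello"));
        // prints [hello, hadoop, hello]
    }
}
```

To run the full job, the class would typically be packaged into a jar and submitted with something like `hadoop jar wordcount.jar com.bruce.mapreduce.WordCount <input-path> <output-path>` (jar name and paths here are placeholders); note that the output path must not already exist on HDFS.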