SortedPriceName sorting code (based on MapReduce processing logic)
Source: Internet · Editor: 程序博客网 · Date: 2024/05/23 02:25
mapper.java

```java
package com.doggie.test;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

import java.io.IOException;

/**
 * Created by root on 5/25/16.
 */
public class mapper extends Mapper<Object, Text, LongWritable, Text> {
    public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        // Tag each record by the file it came from: price lines are
        // "id price", name lines are "name id".
        String fileName = ((FileSplit) context.getInputSplit()).getPath().toString();
        String valueString = value.toString();
        String[] items = valueString.split(" ");
        LongWritable outputKey;
        Text outputValue;
        if (fileName.contains("price")) {
            outputKey = new LongWritable(Long.valueOf(items[0]));
            outputValue = new Text(items[1]);
        } else {
            outputKey = new LongWritable(Long.valueOf(items[1]));
            // Prefix with "name" so the reducer can tell names from prices
            outputValue = new Text("name" + items[0]);
        }
        context.write(outputKey, outputValue);
    }
}
```

reducer.java

```java
package com.doggie.test;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

import java.io.IOException;
import java.util.TreeSet;

/**
 * Created by root on 5/25/16.
 */
public class reducer extends Reducer<LongWritable, Text, Text, LongWritable> {
    public void reduce(LongWritable key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        Text itemName = null;
        // TreeSet keeps the prices for this item in ascending order
        TreeSet<LongWritable> queue = new TreeSet<LongWritable>();
        for (Text val : values) {
            if (val.toString().startsWith("name")) {
                String realName = val.toString().substring(4);
                itemName = new Text(realName);
            } else {
                LongWritable price = new LongWritable(Long.valueOf(val.toString()));
                queue.add(price);
            }
        }
        for (LongWritable val : queue) {
            context.write(itemName, val);
        }
    }
}
```

Homework.java (main)

```java
package com.doggie.mtest;

import com.doggie.test.mapper;
import com.doggie.test.reducer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

/**
 * Created by root on 5/25/16.
 */
public class Homework {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: homework <in> <out>");
            System.exit(2);
        }
        //conf.setInt("mapred.task.timeout", 100);
        Job job = new Job(conf, "homework");
        job.setInputFormatClass(TextInputFormat.class);
        job.setJarByClass(Homework.class);
        job.setMapperClass(mapper.class);
        job.setReducerClass(reducer.class);
        job.setMapOutputKeyClass(LongWritable.class);
        job.setMapOutputValueClass(Text.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);
        // A single reducer guarantees globally ordered output
        job.setNumReduceTasks(1);
        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

[Sample input files](https://yunpan.cn/cSDYkqREN9N3H) (access password: ff1b)
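The per-key join-and-sort logic above can be sketched without a Hadoop cluster. The following standalone Java sketch (class name and sample records are illustrative, not from the post) mimics what the reducer does for a single item id: it strips the `name` tag from the name record and emits the prices in ascending order via a `TreeSet`, just like the reducer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Standalone sketch of the reducer's per-key join-and-sort logic
// (no Hadoop dependency; the sample data below is made up).
public class JoinSortSketch {
    // Mimics reducer.reduce(): the values for one key are a single
    // "name..."-tagged record plus any number of price records.
    static List<String> reduceOneKey(List<String> values) {
        String itemName = null;
        TreeSet<Long> prices = new TreeSet<>();
        for (String val : values) {
            if (val.startsWith("name")) {
                itemName = val.substring(4);   // strip the "name" tag
            } else {
                prices.add(Long.valueOf(val)); // TreeSet keeps ascending order
            }
        }
        List<String> out = new ArrayList<>();
        for (Long p : prices) {
            out.add(itemName + "\t" + p);
        }
        return out;
    }

    public static void main(String[] args) {
        // Values as the reducer would see them for one item id:
        // two prices from the "price" file, one tagged name from the name file.
        List<String> values = List.of("30", "nameapple", "25");
        for (String line : reduceOneKey(values)) {
            System.out.println(line); // name, tab, price, smallest price first
        }
    }
}
```

Note that this mirrors the assumption baked into the real reducer: every key is expected to carry exactly one name record, otherwise `itemName` stays `null`.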