MapReduce_TotalSort Example
Source: Internet  Editor: 程序博客网  Time: 2024/06/05 15:09
Example code:
package com.xfyan.four;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.partition.InputSampler;
import org.apache.hadoop.mapreduce.lib.partition.InputSampler.RandomSampler;
import org.apache.hadoop.mapreduce.lib.partition.TotalOrderPartitioner;

public class TotalSort {

    public static void main(String[] args)
            throws IOException, ClassNotFoundException, InterruptedException {
        // Arguments: <input path> <output path> <partition file> <number of reducers>
        Path inputPath = new Path(args[0]);
        Path outputPath = new Path(args[1]);
        Path partitionFile = new Path(args[2]);
        int reduceNumber = Integer.parseInt(args[3]);

        // RandomSampler arguments: the probability that a key is selected,
        // the number of samples to take, and the maximum number of InputSplits to read
        RandomSampler<Text, Text> sampler =
                new InputSampler.RandomSampler<Text, Text>(0.1, 100, 10);

        Configuration conf = new Configuration();
        // Build the job from conf so that both share the same settings
        Job job = Job.getInstance(conf);
        job.setJobName("TotalSort");
        job.setJarByClass(TotalSort.class);

        job.setInputFormatClass(KeyValueTextInputFormat.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setNumReduceTasks(reduceNumber);

        // Set the partitioner class
        job.setPartitionerClass(TotalOrderPartitioner.class);
        // Set the partition file path on the job's own configuration,
        // because Job keeps a copy of conf rather than the object itself
        TotalOrderPartitioner.setPartitionFile(job.getConfiguration(), partitionFile);

        FileInputFormat.setInputPaths(job, inputPath);
        FileOutputFormat.setOutputPath(job, outputPath);
        // Delete the output directory if it already exists
        outputPath.getFileSystem(job.getConfiguration()).delete(outputPath, true);

        // Sample the input and write the partition file
        InputSampler.writePartitionFile(job, sampler);

        // Return the job status as the process exit code
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
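To make the role of the partition file more concrete, here is a minimal plain-Java sketch of how a sorted list of split points can route keys to reducers, so that every key in reducer i sorts before every key in reducer i+1. It is a simplified illustration of the idea behind TotalOrderPartitioner, not Hadoop's actual code: the class name `SplitPointDemo`, the method `partitionFor`, and the sample split points `"g"` and `"p"` are all assumptions made up for this example, and plain `String` keys stand in for `Text`.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of Hadoop.
class SplitPointDemo {

    // Returns the reducer index for a key, given sorted split points.
    // With N reducers there are N - 1 split points; a key equal to a
    // split point goes to the partition on its right.
    static int partitionFor(String[] splitPoints, String key) {
        int pos = Arrays.binarySearch(splitPoints, key) + 1;
        // binarySearch returns -(insertionPoint) - 1 when the key is absent,
        // so after adding 1 a negative value is the negated insertion point.
        return (pos < 0) ? -pos : pos;
    }

    public static void main(String[] args) {
        // Assumed split points for 3 reducers (in the real job these come
        // from the partition file written by the sampler).
        String[] splitPoints = {"g", "p"};
        for (String key : new String[] {"apple", "g", "mango", "p", "zebra"}) {
            System.out.println(key + " -> reducer " + partitionFor(splitPoints, key));
        }
    }
}
```

With these assumed split points, keys below "g" go to reducer 0, keys from "g" up to but not including "p" go to reducer 1, and the rest go to reducer 2, which is why concatenating the reducer outputs in order yields a globally sorted result.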