Hadoop Study Notes, Part 3
Source: Internet  Editor: 程序博客网  Time: 2024/06/05 08:15
1. Deduplicating a file line by line
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Dedup {

    // Mapper: emit each input line as the key, with an empty value.
    public static class Map extends Mapper<Object, Text, Text, Text> {
        private static final Text EMPTY = new Text("");

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            context.write(value, EMPTY);
        }
    }

    // Reducer: the shuffle groups identical lines under one key,
    // so writing each key once removes the duplicates.
    public static class Reduce extends Reducer<Text, Text, Text, Text> {
        private static final Text EMPTY = new Text("");

        @Override
        protected void reduce(Text key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            context.write(key, EMPTY);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Job.getInstance replaces the deprecated new Job(conf, "afan") on Hadoop 2.x and later.
        Job job = Job.getInstance(conf, "afan");
        // Tells Hadoop which jar to ship to the cluster nodes.
        job.setJarByClass(Dedup.class);

        job.setMapperClass(Map.class);
        job.setReducerClass(Reduce.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.setInputPaths(job, new Path("hdfs://node1:9000/afan/input"));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://node1:9000/afan/output"));

        boolean ret = job.waitForCompletion(true);
        System.exit(ret ? 0 : 1);
    }
}
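The job above works because the shuffle groups identical keys before the reducer runs, so each distinct line reaches reduce exactly once. The following is a minimal in-memory sketch of that idea in plain Java, with no Hadoop dependency. The class name DedupSketch and the method dedup are hypothetical names introduced only for illustration, and the TreeMap is merely a stand-in for the shuffle (it sorts Java strings, which matches Hadoop's byte-wise Text ordering only for ASCII input).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class: simulates map -> shuffle -> reduce in memory.
public class DedupSketch {

    static List<String> dedup(List<String> lines) {
        // Stand-in for the shuffle: values are grouped by key, keys kept sorted.
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String line : lines) {
            // Map step: emit (line, "") just like the mapper in the job above.
            grouped.computeIfAbsent(line, k -> new ArrayList<>()).add("");
        }
        // Reduce step: runs once per distinct key and writes only the key.
        List<String> output = new ArrayList<>();
        for (String key : grouped.keySet()) {
            output.add(key);
        }
        return output;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "2012-3-1 a", "2012-3-2 b", "2012-3-1 a", "2012-3-3 c");
        for (String line : dedup(input)) {
            System.out.println(line);
        }
    }
}
```

Running main prints each of the three distinct lines once, in sorted order. However many times a line occurs in the input, it ends up as a single key, which is why the reducer in the real job can ignore its values entirely.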