Source: Internet. Editor: 程序博客网. Time: 2024/06/11 22:10

Hadoop Study Series: Sorting and Custom Sorting in MapReduce

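Default sorting

By default, Hadoop sorts the map output by key before it reaches the reducer, so a job that simply passes records through already produces output ordered by key.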


Effect:
Before sorting:
1991 06
1991 08
1991 07
1989 01
1979 02
1990 03
2000 04
After sorting:
1979 1979 02
1989 1989 01
1990 1990 03
1991 1991 06
1991 1991 08
1991 1991 07
2000 2000 04
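Only the key, which is the first column of the output (the year), is sorted. The records that share the key 1991 are not ordered among themselves.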
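To see the behaviour without a cluster, here is a minimal plain-Java sketch that mimics it: lines are grouped by the first column and emitted in ascending key order, while values under one key stay in arrival order. `DefaultSortSketch` and `sortByKey` are hypothetical names used only for illustration. The real framework sorts serialized keys during the shuffle and gives no guarantee about the order of values within a key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class DefaultSortSketch {
    // Groups lines by their first column (parsed as a long) and returns
    // "key line" strings in ascending key order. Only the key is compared,
    // so values under the same key keep the order in which they arrived.
    static List<String> sortByKey(List<String> lines) {
        Map<Long, List<String>> grouped = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split(" ");
            if (parts.length > 1) {
                long key = Long.parseLong(parts[0]);
                grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(line);
            }
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<Long, List<String>> entry : grouped.entrySet()) {
            for (String value : entry.getValue()) {
                out.add(entry.getKey() + " " + value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "1991 06", "1991 08", "1991 07", "1989 01",
                "1979 02", "1990 03", "2000 04");
        for (String row : sortByKey(input)) {
            System.out.println(row);
        }
    }
}
```

Run on the sample input, the three 1991 rows come out as 06, 08, 07, which is the same shape as the result shown above.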

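Map stage

    // Requires: java.io.IOException, org.apache.hadoop.io.LongWritable,
    // org.apache.hadoop.io.Text, org.apache.hadoop.mapreduce.Mapper
    public static class ComparedDefaultMap extends Mapper<LongWritable, Text, LongWritable, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            String[] str = line.split(" ");
            if (str.length > 1) {
                // Use the first column as the key; the whole line is the value.
                long l = Long.parseLong(str[0]);
                context.write(new LongWritable(l), value);
            }
        }
    }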

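Reduce stage

    // Requires: java.io.IOException, org.apache.hadoop.io.LongWritable,
    // org.apache.hadoop.io.Text, org.apache.hadoop.mapreduce.Reducer
    public static class ComparedDefaultReduce extends Reducer<LongWritable, Text, LongWritable, Text> {
        @Override
        protected void reduce(LongWritable key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            // Keys arrive already sorted; write every value back out unchanged.
            for (Text value : values) {
                context.write(key, value);
            }
        }
    }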

The default sort gives this result:
1979 1979 02
1989 1989 01
1990 1990 03
1991 1991 06
1991 1991 08
1991 1991 07
2000 2000 04
But the result we want is:

1979 1979 02
1989 1989 01
1990 1990 03
1991 1991 06
1991 1991 07
1991 1991 08
2000 2000 04


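So a custom sort is needed: the key has to carry both columns, so that the second column is compared whenever the first column is equal.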
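Before moving to Hadoop types, the intended ordering can be sketched in plain Java. This is only an illustration of the comparison rule (first column, then second column), not MapReduce code; `CompositeOrderSketch` and `sortByBothColumns` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class CompositeOrderSketch {
    // Sorts "year month" lines by year first and, for equal years, by month.
    static List<String> sortByBothColumns(List<String> lines) {
        List<String> sorted = new ArrayList<>(lines);
        sorted.sort(Comparator
                .comparingLong((String l) -> Long.parseLong(l.split(" ")[0]))
                .thenComparingLong(l -> Long.parseLong(l.split(" ")[1])));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "1991 06", "1991 08", "1991 07", "1989 01",
                "1979 02", "1990 03", "2000 04");
        for (String row : sortByBothColumns(input)) {
            System.out.println(row);
        }
    }
}
```

With this rule the 1991 rows come out as 06, 07, 08, which is the order the custom key is meant to produce inside the job.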

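Custom Writable

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;

    import org.apache.hadoop.io.WritableComparable;

    /**
     * Custom Writable used as a composite key: ordered by first, then by second.
     */
    public class ComparedKey implements WritableComparable<ComparedKey> {
        long first;
        long second;

        // Hadoop needs a no-argument constructor to create the key when deserializing.
        public ComparedKey() {
        }

        public ComparedKey(long first, long second) {
            this.first = first;
            this.second = second;
        }

        @Override
        public int compareTo(ComparedKey o) {
            // Long.compare avoids the overflow that casting (first - o.first) to int can cause.
            int cmp = Long.compare(first, o.first);
            if (cmp != 0) {
                return cmp;
            }
            return Long.compare(second, o.second);
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeLong(first);
            out.writeLong(second);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            // Fields must be read in the same order they were written.
            first = in.readLong();
            second = in.readLong();
        }

        // equals and hashCode are additions, not part of the original listing.
        // The default HashPartitioner uses hashCode when there is more than one reducer;
        // hashing only `first` is one possible choice that keeps a year on one reducer.
        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof ComparedKey)) {
                return false;
            }
            ComparedKey other = (ComparedKey) obj;
            return first == other.first && second == other.second;
        }

        @Override
        public int hashCode() {
            return (int) (first ^ (first >>> 32));
        }
    }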
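One detail worth care when comparing long fields in a compareTo method: returning the difference cast to int keeps only the low 32 bits, so the sign can be wrong when the difference does not fit in an int. For small values such as years and months both approaches agree, but `Long.compare` is safe for any input. The sketch below contrasts the two; `CompareOverflowSketch` and its method names are hypothetical.

```java
class CompareOverflowSketch {
    // Subtraction-based comparison: the cast to int drops the high 32 bits.
    static int bySubtraction(long a, long b) {
        return (int) (a - b);
    }

    // Overflow-safe comparison from the standard library.
    static int bySafeCompare(long a, long b) {
        return Long.compare(a, b);
    }

    public static void main(String[] args) {
        long big = 1L << 32; // 4294967296, low 32 bits are all zero

        // Small values: both give a positive result for 1991 vs 1979.
        System.out.println(bySubtraction(1991L, 1979L) > 0);
        System.out.println(bySafeCompare(1991L, 1979L) > 0);

        // Large difference: subtraction reports 0 ("equal") although big > 0.
        System.out.println(bySubtraction(big, 0L));
        System.out.println(bySafeCompare(big, 0L) > 0);
    }
}
```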

Map stage

    // Requires: java.io.IOException, org.apache.hadoop.io.LongWritable,
    // org.apache.hadoop.io.Text, org.apache.hadoop.mapreduce.Mapper
    public static class ComparedMapper extends Mapper<LongWritable, Text, ComparedKey, Text> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            String[] str = line.split(" ");
            // Each input line has two columns, so require at least 2 fields.
            // (A check of length > 2 would skip every line of the sample input.)
            if (str.length >= 2) {
                long first = Long.parseLong(str[0]);
                long second = Long.parseLong(str[1]);
                // Both columns go into the key; the whole line is the value.
                context.write(new ComparedKey(first, second), new Text(line));
            }
        }
    }

Reduce stage

    // Requires: java.io.IOException, org.apache.hadoop.io.LongWritable,
    // org.apache.hadoop.io.Text, org.apache.hadoop.mapreduce.Reducer
    public static class CompareReducer extends Reducer<ComparedKey, Text, LongWritable, Text> {
        @Override
        protected void reduce(ComparedKey key, Iterable<Text> values, Context context)
                throws IOException, InterruptedException {
            // Keys arrive sorted by (first, second); output only the first field as the key.
            for (Text text : values) {
                context.write(new LongWritable(key.first), text);
            }
        }
    }

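Job setup

    // Note: this constructor is deprecated in Hadoop 2.x in favor of Job.getInstance(configuration, "compared").
    Job job = new Job(configuration, "compared");
    job.setJarByClass(ComparedToTest.class);
    job.setMapperClass(ComparedMapper.class);
    job.setReducerClass(CompareReducer.class);
    // The map output key differs from the final output key, so it must be set explicitly.
    job.setMapOutputKeyClass(ComparedKey.class);
    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(Text.class);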
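The job settings leave the partitioner at its default. Hadoop's default HashPartitioner chooses the reducer as `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`, so if the job runs with more than one reducer, a custom key should override hashCode, and each reducer's output file is then sorted on its own. The sketch below reproduces only that formula in plain Java; `PartitionSketch`, `partitionFor`, and `hashOfFirst` are hypothetical names, and hashing only the first field is an assumption made here so that all records of one year land on the same reducer.

```java
class PartitionSketch {
    // Same arithmetic as the default hash partitioner:
    // clear the sign bit, then take the remainder by the reducer count.
    static int partitionFor(int keyHash, int numReduceTasks) {
        return (keyHash & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Hypothetical hash for a (first, second) key that ignores `second`,
    // so keys sharing `first` always map to the same partition.
    static int hashOfFirst(long first) {
        return Long.hashCode(first);
    }

    public static void main(String[] args) {
        int reducers = 3;
        // (1991, 6) and (1991, 8) share `first`, so they share a partition.
        int a = partitionFor(hashOfFirst(1991L), reducers);
        int b = partitionFor(hashOfFirst(1991L), reducers);
        System.out.println(a == b);
        // A negative hash still yields a valid partition index.
        System.out.println(partitionFor(-1, reducers) >= 0);
    }
}
```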

Final result:

1979 1979 02
1989 1989 01
1990 1990 03
1991 1991 06
1991 1991 07
1991 1991 08
2000 2000 04
