Finding the maximum temperature by sorting on the temperature in the key

Source: Internet  Editor: 程序博客网  Time: 2024/04/28 05:26

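import java.io.IOException;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.io.WritableComparator;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Partitioner;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

// IntPair and NcdcRecordParser are helper classes from the book's example code.
public class MaxTemperatureUsingSecondarySort extends Configured implements Tool {

  static class MaxTemperatureMapper
      extends Mapper<LongWritable, Text, IntPair, NullWritable> {

    private NcdcRecordParser parser = new NcdcRecordParser();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      parser.parse(value);
      if (parser.isValidTemperature()) {
        // Emit the (year, temperature) composite key; the value carries no data.
        context.write(new IntPair(parser.getYearInt(), parser.getAirTemperature()),
            NullWritable.get());
      }
    }
  }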

  static class MaxTemperatureReducer
      extends Reducer<IntPair, NullWritable, IntPair, NullWritable> {

    @Override
    protected void reduce(IntPair key, Iterable<NullWritable> values, Context context)
        throws IOException, InterruptedException {
      // The first key of each group holds the year's maximum temperature.
      context.write(key, NullWritable.get());
    }
  }

  public static class FirstPartitioner extends Partitioner<IntPair, NullWritable> {

    @Override
    public int getPartition(IntPair key, NullWritable value, int numPartitions) {
      // multiply by 127 to perform some mixing
      return Math.abs(key.getFirst() * 127) % numPartitions;
    }
  }

  public static class KeyComparator extends WritableComparator {

    protected KeyComparator() {
      super(IntPair.class, true);
    }

    @Override
    public int compare(WritableComparable w1, WritableComparable w2) {
      IntPair ip1 = (IntPair) w1;
      IntPair ip2 = (IntPair) w2;
      // Years in ascending order.
      int cmp = IntPair.compare(ip1.getFirst(), ip2.getFirst());
      if (cmp != 0) {
        return cmp;
      }
      // Temperatures in descending order, hence the negation.
      return -IntPair.compare(ip1.getSecond(), ip2.getSecond());
    }
  }

  public static class GroupComparator extends WritableComparator {

    protected GroupComparator() {
      super(IntPair.class, true);
    }

    @Override
    public int compare(WritableComparable w1, WritableComparable w2) {
      IntPair ip1 = (IntPair) w1;
      IntPair ip2 = (IntPair) w2;
      // Group on the year only, so all keys of one year reach a single reduce() call.
      return IntPair.compare(ip1.getFirst(), ip2.getFirst());
    }
  }

 

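  @Override
  public int run(String[] args) throws Exception {
    Job job = Job.getInstance(getConf());
    job.setJarByClass(getClass());
    // Input and output paths come from the command-line arguments.
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    job.setMapperClass(MaxTemperatureMapper.class);
    job.setPartitionerClass(FirstPartitioner.class);
    job.setSortComparatorClass(KeyComparator.class);
    job.setGroupingComparatorClass(GroupComparator.class);
    job.setReducerClass(MaxTemperatureReducer.class);

    job.setOutputKeyClass(IntPair.class);
    job.setOutputValueClass(NullWritable.class);

    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    int exitCode = ToolRunner.run(new MaxTemperatureUsingSecondarySort(), args);
    System.exit(exitCode);
  }
}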

 

====================================================================================================

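In the mapper above, we use IntPair to define a composite key representing the year and the temperature. IntPair implements the Writable interface and is similar to the TextPair class, except that TextPair carries Text fields. Because the maximum temperature can be read from the composite key in each reduce call, no further information needs to be attached to the value, so NullWritable is enough. Thanks to the secondary sort, the first key of each group in the reducer is the IntPair that holds the year and its maximum temperature, and that is the key the reducer writes out.

IntPair's toString() method returns a tab-separated string, so the program outputs a set of tab-separated year and temperature pairs.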
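To make the composite key more concrete, here is a minimal plain-Java sketch of an IntPair-like class. It is an illustration under stated assumptions, not the book's IntPair: the class name `YearTempKey` and the `roundTrip` helper are hypothetical, and the class only mirrors the `write(DataOutput)` / `readFields(DataInput)` method shapes of Hadoop's Writable interface using `java.io` types, so it runs without Hadoop on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for IntPair: a (year, temperature) composite key.
// It mirrors the Writable method shapes but does not depend on Hadoop.
public class YearTempKey {
    private int first;   // year
    private int second;  // temperature

    public YearTempKey() {}

    public YearTempKey(int first, int second) {
        this.first = first;
        this.second = second;
    }

    public int getFirst() { return first; }
    public int getSecond() { return second; }

    // Serialize both fields, in order, as two 4-byte ints.
    public void write(DataOutput out) throws IOException {
        out.writeInt(first);
        out.writeInt(second);
    }

    // Deserialize the fields in the same order they were written.
    public void readFields(DataInput in) throws IOException {
        first = in.readInt();
        second = in.readInt();
    }

    // Tab-separated form, matching the description of toString() in the text.
    @Override
    public String toString() {
        return first + "\t" + second;
    }

    // Demonstration helper: serialize a key to bytes and read it back.
    public static YearTempKey roundTrip(YearTempKey key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            key.write(new DataOutputStream(bytes));
            YearTempKey copy = new YearTempKey();
            copy.readFields(new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because both the year and the temperature live in the key, the value slot has nothing left to carry, which is why the job can use NullWritable for it.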

 

We created a custom partitioner, FirstPartitioner, that partitions by the first field of the composite key (the year).
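The effect of partitioning on the first field alone can be checked with a small plain-Java sketch. The class `FirstFieldPartitionSketch` and its method `partitionFor` are hypothetical names; the method copies the arithmetic of `FirstPartitioner.getPartition` but is not part of Hadoop. The sample records are arbitrary values chosen for illustration.

```java
public class FirstFieldPartitionSketch {
    // Same arithmetic as FirstPartitioner: only the year decides the partition.
    // The temperature parameter is accepted but deliberately ignored.
    public static int partitionFor(int year, int temperature, int numPartitions) {
        return Math.abs(year * 127) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 4;
        // Arbitrary sample (year, temperature) records.
        int[][] records = { {1900, 35}, {1900, -12}, {1901, 317}, {1901, 0} };
        for (int[] r : records) {
            System.out.println(r[0] + "\t" + r[1] + "\t-> partition "
                + partitionFor(r[0], r[1], numPartitions));
        }
    }
}
```

Records of the same year always map to the same partition, whatever their temperature. For realistic year values the product is far from overflowing; for arbitrary ints, `Math.abs` returns a negative number for `Integer.MIN_VALUE`, so masking with `Integer.MAX_VALUE` would be a more defensive choice.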

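To sort the keys by year (ascending) and temperature (descending), we use setSortComparatorClass() to set a custom key comparator, KeyComparator, which extracts the fields and performs the comparisons. Similarly, to group the keys by year, we use setGroupingComparatorClass() to set a custom grouping comparator, GroupComparator, which compares only the first field of the key.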
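How the two comparators cooperate can be sketched in plain Java, without Hadoop. The example below is only a simulation under stated assumptions: the class `SecondarySortSketch`, the `int[]` pairs standing in for IntPair keys, and the sample values are all hypothetical. It sorts by year ascending and temperature descending, then keeps the first key of each year group, which is what the reducer in the listing does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SecondarySortSketch {
    // Sort order: year ascending, then temperature descending (the KeyComparator role).
    static final Comparator<int[]> KEY_ORDER = (a, b) -> {
        int cmp = Integer.compare(a[0], b[0]);
        return cmp != 0 ? cmp : -Integer.compare(a[1], b[1]);
    };

    // Grouping looks at the year only (the GroupComparator role).
    static final Comparator<int[]> GROUP_ORDER = (a, b) -> Integer.compare(a[0], b[0]);

    // Returns year -> maximum temperature by keeping the first key of each group.
    public static Map<Integer, Integer> maxPerYear(List<int[]> pairs) {
        List<int[]> sorted = new ArrayList<>(pairs);
        sorted.sort(KEY_ORDER);
        Map<Integer, Integer> result = new LinkedHashMap<>();
        int[] previous = null;
        for (int[] key : sorted) {
            // A new group starts whenever the grouping comparator reports a difference.
            if (previous == null || GROUP_ORDER.compare(previous, key) != 0) {
                result.put(key[0], key[1]);
            }
            previous = key;
        }
        return result;
    }

    public static void main(String[] args) {
        // Arbitrary sample (year, temperature) pairs.
        List<int[]> pairs = Arrays.asList(
            new int[] {1901, 10}, new int[] {1900, 35},
            new int[] {1901, 317}, new int[] {1900, -12});
        maxPerYear(pairs).forEach((year, temp) -> System.out.println(year + "\t" + temp));
    }
}
```

The sort comparator decides which key comes first inside a year, while the grouping comparator decides where one year ends and the next begins. Neither is enough on its own to deliver the maximum as the first key of each group.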

 

 

Running the program returns the maximum temperature for each year:

hadoop jar hadoop-examples.jar MaxTemperatureUsingSecondarySort input/ncdc/all    output-secondarysort

hadoop fs -cat output-secondarysort/part-* | sort | head

 

 

 

 

Source: Hadoop: The Definitive Guide (Hadoop权威指南), page 302

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 
