setPartitionerClass, setOutputKeyComparatorClass and setOutputValueGroupingComparator
Reposted from the Internet, 2024/05/17 22:34
from http://autofei.wordpress.com/2012/10/18/setpartitionerclass-setoutputkeycomparatorclass-and-setoutputvaluegroupingcomparator/
The Partitioner decides which mapper output goes to which reducer, based on the mapper output key. In general, different keys end up in different groups (each group is one Iterator on the reducer side). But sometimes we want different keys to be in the same group. This is what the Output Value Grouping Comparator is for: it is used to group the mapper output. For easy understanding, think of it as the GROUP BY condition in SQL. I will give a detailed example for time series analysis later. The Output Key Comparator is used during the sort stage to order the mapper output keys.
The above looks pretty straightforward. But there is one thing to remember: if you use setOutputValueGroupingComparator, all the keys in the same group will look the same on the reducer side, even if they were not the same in the mapper output.
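To make the three roles concrete, here is a minimal plain-Java sketch that imitates them without any Hadoop dependency. It is only a model of the behaviour described above, not the downloadable example: the class name `SecondarySortSketch`, the `Pair` key (a stand-in for IntPair), the sample numbers and the modulo-based partition function are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SecondarySortSketch {

    /** Hypothetical composite mapper output key (year, value), standing in for IntPair. */
    public static final class Pair {
        public final int year;
        public final int value;
        public Pair(int year, int value) { this.year = year; this.value = value; }
    }

    /** Partitioner role: route by year only, so one year never spans two reducers. */
    public static int partition(Pair key, int numReducers) {
        return Math.floorMod(key.year, numReducers);
    }

    /** Sort comparator role: year ascending, then value descending. */
    public static int sortCompare(Pair a, Pair b) {
        if (a.year != b.year) return Integer.compare(a.year, b.year);
        return Integer.compare(b.value, a.value);
    }

    /** Grouping comparator role: keys with the same year belong to one group. */
    public static int groupCompare(Pair a, Pair b) {
        return Integer.compare(a.year, b.year);
    }

    /**
     * Imitates one reducer: sort the keys, walk them group by group, and emit
     * one line per record using the key of the group (its first key in sort
     * order) instead of the record's own key.
     */
    public static List<String> reduceSide(List<Pair> mapOutput) {
        List<Pair> sorted = new ArrayList<>(mapOutput);
        sorted.sort(SecondarySortSketch::sortCompare);
        List<String> out = new ArrayList<>();
        Pair groupKey = null;
        for (Pair p : sorted) {
            if (groupKey == null || groupCompare(groupKey, p) != 0) {
                groupKey = p; // a new group starts here
            }
            out.add(groupKey.year + "\t" + groupKey.value);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Pair> mapOutput = Arrays.asList(
            new Pair(1901, 12), new Pair(1900, 35),
            new Pair(1901, 40), new Pair(1900, 7));
        for (String line : reduceSide(mapOutput)) {
            System.out.println(line);
        }
    }
}
```

Because the sort puts the largest value of each year first and the grouping compares the year only, every line emitted for a given year carries that year's maximum.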
You can download the example from: https://www.assembla.com/spaces/autofei_public/documents
- record.txt is the input (three fields: year, a random number, place)
- MaxTemperatureUsingSecondarySort.java is the main Hadoop code
- IntPair.java is the mapper output key object
- output.txt is the output
You will notice that the number for the same year is now always the same: the maximum one.
Note: the code is modified from the book “Hadoop: The Definitive Guide”.
Note: in the new MapReduce API, setOutputKeyComparatorClass corresponds to setSortComparatorClass, and setOutputValueGroupingComparator corresponds to setGroupingComparatorClass.