What the common context member functions do in Hadoop C++ Pipes
Source: Internet. Published: 莱汀rei 知乎. Editor: 程序博客网. Time: 2024/06/06 12:47
getJobConf
Returns the JobConf for the current task.
getInputKey
Returns the current key.
getInputValue
Returns the current value.
In the reducer, context.getInputValue() is not available until context.nextValue() has been called.
progress
Reports to the framework that the mapper or reducer is still working and has not died or hung. The report goes to the parent Java task and from there to the TaskTracker, not to the NameNode, which only manages HDFS metadata.
setStatus
Sets a status message for the task. The message can be found in the hadoop*tasktracker*.log file and in the web interface as "Status".
context.setStatus("Teke-lili");
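getCounter
Registers a counter, which is displayed in the web interface. Call it once, when the mapper or reducer class is constructed, and keep the returned pointer for later incrementCounter calls.
nextValue
Advances to the next value for the current key and returns false when no values are left. Important: the key stays the same for the whole reduce call.
context.getInputValue() is not available until context.nextValue() has been called.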
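The ordering rule can be tried without a cluster. The sketch below is a standalone stand-in, not the Pipes API: FakeReduceContext is a hypothetical class that imitates only the nextValue / getInputValue / getInputKey contract described above, and the assert on a premature read is this mock's own choice rather than documented Hadoop behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for HadoopPipes::ReduceContext. It models only the
// cursor behaviour described above and is not part of Hadoop.
class FakeReduceContext {
public:
    FakeReduceContext(std::string key, std::vector<std::string> values)
        : key_(std::move(key)), values_(std::move(values)) {}

    // The key stays the same for the whole reduce() call.
    const std::string& getInputKey() const { return key_; }

    // Advance to the next value; returns false once the values are exhausted.
    bool nextValue() {
        if (pos_ >= values_.size()) return false;
        ++pos_;
        return true;
    }

    // Only valid after nextValue() has returned true at least once.
    const std::string& getInputValue() const {
        assert(pos_ > 0 && "call nextValue() before getInputValue()");
        return values_[pos_ - 1];
    }

private:
    std::string key_;
    std::vector<std::string> values_;
    std::size_t pos_ = 0;  // number of values consumed so far
};

// Same loop shape as a Pipes reducer: advance first, then read.
int sumValues(FakeReduceContext& context) {
    int sum = 0;
    while (context.nextValue()) {
        sum += std::stoi(context.getInputValue());
    }
    return sum;
}
```

Calling sumValues on a FakeReduceContext built from the key "hello" and the values "1", "1" yields 2, and getInputKey() returns "hello" before, during, and after the loop.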
Example:
Suppose the input file is hello.txt,
with the following content:
hello world
hello bupt
The program is:
#include <string>
#include <vector>

#include "hadoop/Pipes.hh"
#include "hadoop/TemplateFactory.hh"
#include "hadoop/StringUtils.hh"

const std::string WORDCOUNT = "WORDCOUNT";
const std::string INPUT_WORDS = "INPUT_WORDS";
const std::string OUTPUT_WORDS = "OUTPUT_WORDS";

// Mapper class
class WordCountMap: public HadoopPipes::Mapper {
public:
  HadoopPipes::TaskContext::Counter* inputWords;

  WordCountMap(HadoopPipes::TaskContext& context) {
    // Register the counter once, at construction time
    inputWords = context.getCounter(WORDCOUNT, INPUT_WORDS);
  }

  void map(HadoopPipes::MapContext& context) {
    // Split the input line into words on spaces
    std::vector<std::string> words =
        HadoopUtils::splitString(context.getInputValue(), " ");
    for (unsigned int i = 0; i < words.size(); ++i) {
      context.emit(words[i], "1");  // the word is the key, the value is 1
    }
    // Add the number of words seen to the INPUT_WORDS counter
    context.incrementCounter(inputWords, words.size());
  }
};

// Reducer class
class WordCountReduce: public HadoopPipes::Reducer {
public:
  HadoopPipes::TaskContext::Counter* outputWords;

  WordCountReduce(HadoopPipes::TaskContext& context) {
    outputWords = context.getCounter(WORDCOUNT, OUTPUT_WORDS);
  }

  void reduce(HadoopPipes::ReduceContext& context) {
    int sum = 0;
    while (context.nextValue()) {
      // Count how many times the word occurred
      sum += HadoopUtils::toInt(context.getInputValue());
    }
    context.emit(context.getInputKey(), HadoopUtils::toString(sum));  // output the result
    context.incrementCounter(outputWords, 1);
  }
};

int main(int argc, char *argv[]) {
  // Run the task
  return HadoopPipes::runTask(
      HadoopPipes::TemplateFactory<WordCountMap, WordCountReduce>());
}
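1. MapContext:
The input is presented as key->value pairs:
(1,hello world) Note: the 1 here stands for the line's offset; the actual value is not necessarily this one.
(2,hello bupt)
getInputValue() returns the value of one line; for example, the first call returns: hello world
emit()
writes the following pairs to the map output:
(hello,1)
(world,1)
(hello,1)
(bupt,1)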
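The map step can be reproduced on the two sample lines with plain C++. This is a minimal sketch under stated assumptions: mapLine, mapAll and checkWalkthrough are hypothetical helpers, not Pipes functions, and the words are split on whitespace with std::istringstream instead of HadoopUtils::splitString.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using KeyValue = std::pair<std::string, std::string>;

// Hypothetical helper: mimics what WordCountMap::map does for one input
// value, i.e. split the line into words and emit (word, "1") for each one.
std::vector<KeyValue> mapLine(const std::string& line) {
    std::vector<KeyValue> emitted;
    std::istringstream words(line);
    std::string word;
    while (words >> word) {
        emitted.emplace_back(word, "1");
    }
    return emitted;
}

// Run the map step over every input line, keeping the emit order.
std::vector<KeyValue> mapAll(const std::vector<std::string>& lines) {
    std::vector<KeyValue> out;
    for (const std::string& line : lines) {
        for (const KeyValue& kv : mapLine(line)) {
            out.push_back(kv);
        }
    }
    return out;
}

// Checks the walk-through: two lines in, four (word, "1") pairs out.
void checkWalkthrough() {
    const std::vector<KeyValue> out = mapAll({"hello world", "hello bupt"});
    assert(out.size() == 4);
    assert(out[0].first == "hello" && out[1].first == "world");
    assert(out[2].first == "hello" && out[3].first == "bupt");
}
```

The input key (the line offset) is left out on purpose, because the word count mapper never reads it.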
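2. ReduceContext:
The output of the previous step is processed by the MapReduce framework and becomes the reducer's input:
(hello,[1,1]) Note: values with the same key have already been grouped together.
(world,[1])
(bupt,[1])
context.nextValue() advances to the next value of the current key.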
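The grouping and the per-key sum can be sketched the same way. The code below is an illustration only: groupByKey stands in for the framework's shuffle, reduceValues stands in for WordCountReduce::reduce, and all three function names are hypothetical. A std::map is used so that keys come out sorted, which is an assumption about presentation rather than a model of partitioning.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using KeyValue = std::pair<std::string, std::string>;
using Grouped = std::map<std::string, std::vector<std::string>>;

// Stand-in for the framework's shuffle: collect the values of each key.
Grouped groupByKey(const std::vector<KeyValue>& pairs) {
    Grouped grouped;
    for (const KeyValue& kv : pairs) {
        grouped[kv.first].push_back(kv.second);
    }
    return grouped;
}

// Stand-in for WordCountReduce::reduce: sum the values of a single key.
int reduceValues(const std::vector<std::string>& values) {
    // A key only reaches the reducer if it has at least one value.
    assert(!values.empty());
    int sum = 0;
    for (const std::string& v : values) {
        sum += std::stoi(v);
    }
    return sum;
}

// Group the map output, then reduce each key to its count.
std::map<std::string, int> wordCount(const std::vector<KeyValue>& pairs) {
    std::map<std::string, int> counts;
    for (const auto& entry : groupByKey(pairs)) {
        counts[entry.first] = reduceValues(entry.second);
    }
    return counts;
}
```

Fed with the four pairs emitted by the mapper, wordCount gives hello a count of 2 and world and bupt a count of 1 each.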