hadoop学习-wordcount程序c++重写执行
来源:互联网 发布:类似p2p的软件 编辑:程序博客网 时间:2024/05/01 19:12
1、程序执行命令:
hadoop pipes -D hadoop.pipes.java.recordreader=true -D hadoop.pipes.java.recordwriter=true -input /input/wordcount/sample.txt -output /output/wordcount -program /bin/wordcount
2、具体代码:
wordcount.h
#include <algorithm> #include <stdint.h>#include <string>#include <vector>#include "Pipes.hh" #include "TemplateFactory.hh" #include "StringUtils.hh" #include <iostream>using namespace std;class WordcountMapper : public HadoopPipes::Mapper { public: WordcountMapper(HadoopPipes::TaskContext& context);vector<string> split(const string& src, const string& separator);//重写map程序 void map(HadoopPipes::MapContext& context);}; class WordcountReducer : public HadoopPipes::Reducer { public: WordcountReducer(HadoopPipes::TaskContext& context); //重写reduce程序 void reduce(HadoopPipes::ReduceContext& context);};
wordcount.cpp
#include "wordcount.h"WordcountMapper::WordcountMapper(HadoopPipes::TaskContext& context){}void WordcountMapper::map(HadoopPipes::MapContext& context){ int count = 1; string line = context.getInputValue(); vector<string> wordVec = split(line, " ");for(unsigned i=0; i<wordVec.size(); i++){context.emit(wordVec[i], HadoopUtils::toString(count));}} vector<string> WordcountMapper::split(const string& src, const string& separator){ vector<string> dest; string str = src; string substring; string::size_type start = 0, index = 0;while(index != string::npos) { index = str.find_first_of(separator,start); if (index != string::npos) { substring = str.substr(start,index-start); dest.push_back(substring); start = str.find_first_not_of(separator,index); if (start == string::npos) return dest; } } substring = str.substr(start); dest.push_back(substring); return dest;} WordcountReducer::WordcountReducer(HadoopPipes::TaskContext& context){}void WordcountReducer::reduce(HadoopPipes::ReduceContext& context){ int wSum = 0; while (context.nextValue()){ wSum = wSum + HadoopUtils::toInt(context.getInputValue()) ; } context.emit(context.getInputKey(), HadoopUtils::toString(wSum)); }
main.cpp
/***hadoop的mapreduce框架,其中map数据准备,reduce数据汇总,我们通过客户端程序和hadoop集群进行通信**一般java程序使用streaming标准流,c++使用socket套接字**map在数据节点上执行,本地化无网络占用;reduce由map提供数据(map输出=reduce输入),网络占用较高;**一般java入参可以用int和string,但c++只有string,虽然简化接口,但很多时候需要人工转化*************************************hadoop集群执行word统计功能**hadoop执行task方法:runTask**hadoop执行job命令:mapredjob.sh**HDFS数据文件:/input/wordcount/sample.txt**HDFS输出文件:/output/wordcount */#include "wordcount.h"int main(int argc, char *argv[]) { return HadoopPipes::runTask(HadoopPipes::TemplateFactory<WordcountMapper, WordcountReducer>()); }
makefile程序:
.SUFFIXES:.h .c .cpp .oCC=g++CPPFLAGS = -m64 RM = rmSRCS = wordcount.cpp main.cppPROGRAM = wordcountOBJS=$(SRCS:.cpp=.o)INC_PATH = -I$(HADOOP_DEV_HOME)/includeLIB_PATH = -L$(HADOOP_DEV_HOME)/lib/nativeLIBS = -lhadooppipes -lcrypto -lhadooputils -lpthread#$?表示依赖项 $@表示目的项$(PROGRAM):$(OBJS)$(CC) $? -Wall $(LIB_PATH) $(LIBS) -g -O2 -o $@$(OBJS):$(SRCS)$(CC) $(CPPFLAGS) -c $(SRCS) $(INC_PATH).PHONY:cleanclean:$(RM) $(PROGRAM) $(OBJS)
源数据:
Happiness is not about being immortal nor having food or rights in one's hand. It??s about having each tiny wish come true, or having something to eat when you are hungry or having someone's love when you need love
Happiness is not about being immortal nor having food or rights in one's hand. It??s about having each tiny wish come true, or having something to eat when you are hungry or having someone's love when you need love
Happiness is not about being immortal nor having food or rights in one's hand. It??s about having each tiny wish come true, or having something to eat when you are hungry or having someone's love when you need love
Happiness 3
It��s 3
about 6
are 3
being 3
come 3
each 3
eat 3
food 3
hand. 3
having 12
hungry 3
immortal 3
in 3
is 3
love 6
need 3
nor 3
not 3
one's 3
or 9
rights 3
someone's 3
something 3
tiny 3
to 3
true, 3
when 6
wish 3
you 6
- hadoop学习-wordcount程序c++重写执行
- Hadoop平台执行Wordcount程序
- hadoop 第一个程序wordcount执行过程
- Hadoop示例程序WordCount源码学习
- hadoop学习之WordCount程序升级版
- hadoop执行wordcount例子
- Eclipse执行Hadoop WordCount
- hadoop 执行Wordcount详解
- Linux执行 Hadoop WordCount
- Eclipse重写Wordcount类实现处理中文字符,利用hadoop Eclipse插件远程调试hadoop运行WordCount程序
- hadoop运行WordCount程序
- Hadoop Wordcount 程序 详解
- Hadoop 运行 Wordcount程序
- Hadoop wordcount程序说明
- HADOOP的学习笔记 (第四期) eclipse 执行 wordcount
- HADOOP的学习笔记 (第四期) eclipse 执行 wordcount .
- wordcount程序执行记录
- hadoop学习笔记(三)mapreduce程序wordcount
- asm volatile内嵌汇编用法
- 描述在浏览器中输入一个网址后的发生的事情
- 布丰投针问题
- 蜕变之路
- Vijos P1443 银河英雄传说个人通过/递交:11/25(44%)
- hadoop学习-wordcount程序c++重写执行
- 链表的正逆序打印 java语言实现
- Binary Tree Maximum Path Sum--一次没A过的题
- pat 1067. Sort with Swap(0,*) (25)
- Linux下ps aux解释
- Java高新技术怎么样
- 在GitHub上挖了两个坑
- 中秋阳澄情,“谢”礼送不停
- postgresql pdo执行query返回false