Pipes Configuration

Source: Internet  Editor: 程序博客网  Date: 2024/06/01 19:47

Original article: http://www.linuxidc.com/Linux/2011-12/48509.htm

1. Install and configure Hadoop
Common commands (the command name is lowercase):
hadoop dfs -ls path
hadoop dfs -rmr file
hadoop dfs -mkdir path
hadoop dfs -cat file
2. Find a wordcount program and save it as wordcount.cpp
One option: http://wiki.apache.org/Hadoop/C++WordCount


Another option is the example shipped under the Hadoop install path: /usr/local/hadoop-0.20.2/src/examples/pipes/impl/wordcount-simple.cc
3. Write the Makefile
HADOOP_INSTALL=/usr/local/hadoop-0.20.2
PLATFORM=Linux-i386-32

CC = g++
CPPFLAGS = -m32 -I$(HADOOP_INSTALL)/c++/$(PLATFORM)/include

# The recipe line below must start with a tab character, not spaces.
wordcount: wordcount.cpp
	$(CC) $(CPPFLAGS) $< -Wall -L$(HADOOP_INSTALL)/c++/$(PLATFORM)/lib -lhadooppipes -lhadooputils -lpthread -g -O2 -o $@
###
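Check whether the system is 32-bit or 64-bit (for example with uname -m, or by reading cat /proc/cpuinfo) and set PLATFORM to match: Linux-i386-32 for 32-bit, Linux-amd64-64 for 64-bit. The choice depends on the architecture, not on whether the CPU is made by Intel or AMD. When building for Linux-amd64-64, change -m32 in CPPFLAGS to -m64.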
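The PLATFORM value can also be derived from the machine name instead of being read off by hand. The following is only a sketch and not part of the original article: the function name pipes_platform is made up for illustration, and it assumes the c++/ directory of the Hadoop install contains only Linux-i386-32 and Linux-amd64-64 (list that directory to confirm).

```shell
# Hypothetical helper: map a machine name (as printed by `uname -m`)
# to the PLATFORM directory name used in the Makefile.
pipes_platform() {
    case "$1" in
        x86_64|amd64)
            echo "Linux-amd64-64" ;;
        i386|i486|i586|i686)
            echo "Linux-i386-32" ;;
        *)
            # Anything else is not covered by this sketch.
            echo "unknown"
            return 1 ;;
    esac
}
```

With the function defined in the current shell, the result can be passed to make on the command line, for example: make PLATFORM="$(pipes_platform "$(uname -m)")"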

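Note: building the executable requires two extra steps:

    1. Install the OpenSSL development package, because the static library hadooppipes referenced in the original article uses OpenSSL. On Debian/Ubuntu the package is named libssl-dev:

       sudo apt-get install libssl-dev

    2. Add -lcrypto to the link line in the Makefile.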


4. Run:
Upload wordcount.cpp as the input file: hadoop fs -put wordcount.cpp input.txt
Upload the executable: hadoop fs -put wordcount bin/wordcount
Run the job (the property names are lowercase):
hadoop pipes \
-D hadoop.pipes.java.recordreader=true \
-D hadoop.pipes.java.recordwriter=true \
-input input.txt \
-output output \
-program bin/wordcount
View the results:
hadoop dfs -cat output/*
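To sanity-check the job output, the same count can be reproduced locally on the input file. This is a minimal sketch, assuming the wordcount program splits each line on spaces and emits one word/count pair per word; local_wordcount is a made-up name, and the line order may differ from what Hadoop prints.

```shell
# Hypothetical helper: count space-separated words read from stdin and
# print "word<TAB>count" lines in byte order.
local_wordcount() {
    # Put each word on its own line, squeezing repeated separators.
    tr -s ' ' '\n' |
        # Tally non-empty lines, using the whole line as the key.
        awk 'length($0) > 0 { c[$0]++ } END { for (w in c) print w "\t" c[w] }' |
        LC_ALL=C sort
}
```

For example, local_wordcount < wordcount.cpp prints a list that can be compared against the job output shown by the -cat command.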
