Testing hadoop-0.23.5 pipes


1. g++

Command usage:

       gcc [-c|-S|-E] [-std=standard]

           [-g] [-pg] [-Olevel]

           [-Wwarn...] [-pedantic]

           [-Idir...] [-Ldir...]

           [-Dmacro[=defn]...] [-Umacro]

           [-foption...] [-mmachine-option...]

           [-o outfile] [@file] infile...
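
For example, an illustrative compile-and-link command exercising several of these flags (all file and directory names here are hypothetical):

~$g++ -std=c++98 -Wall -pedantic -g -O2 -Iinclude -Llib -DDEBUG=1 main.cpp -lm -o app

With none of -c/-S/-E given, this preprocesses, compiles, and links main.cpp into the executable app.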

Understanding some of the options:

-c / -S / -E: stop after producing object files / assembly / preprocessed source, respectively

-std=standard: select the language standard

-g: emit debugging information; -pg: emit gprof profiling instrumentation

-Olevel: optimization level, e.g. -O2

-Wwarn...: enable warnings, e.g. -Wall; -pedantic: strict ISO conformance

-Idir / -Ldir: add dir to the header / library search path

-Dmacro[=defn] / -Umacro: define / undefine a preprocessor macro

-o outfile: name of the output file; @file: read further options from file

2. Hadoop deployment

2.1 Creating the hadoop user

~$sudo adduser hadoop

~$sudo visudo -f /etc/sudoers

ADD:

hadoop    ALL=(ALL:ALL) ALL

~$sudo passwd hadoop

~$su - hadoop

2.2 Download and extract

~$sudo tar -xzf hadoop-0.23.5-src.tar.gz -C /usr/local/

~$sudo tar -xzf hadoop-0.23.5.tar.gz -C /usr/local/

2.3 Change ownership

~$sudo chown -R hadoop:hadoop /usr/local/hadoop-0.23.5-src

~$sudo chown -R hadoop:hadoop /usr/local/hadoop-0.23.5

2.4 Set environment variables

~$sudo vim /etc/profile

ADD:

export HADOOP_HOME=/usr/local/hadoop-0.23.5

export JAVA_HOME=/usr/lib/jvm/java-6-openjdk

export HADOOP_COMMON_HOME=$HADOOP_HOME

export HADOOP_HDFS_HOME=$HADOOP_HOME

export HADOOP_MAPRED_HOME=$HADOOP_HOME

export YARN_HOME=$HADOOP_HOME

export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop

export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

 

~$source /etc/profile
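
A quick check that the variables took effect:

~$hadoop version

~$echo $HADOOP_CONF_DIR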

2.5 Hadoop configuration

~$cd $HADOOP_CONF_DIR

Configure the following files: core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml.
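
A minimal single-node sketch follows; the hostname, port, and values are assumptions to adapt to your environment, and the shuffle service name mapreduce.shuffle is specific to the 0.23 line (later releases renamed it mapreduce_shuffle):

~$cat > $HADOOP_CONF_DIR/core-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Default filesystem; 8020 matches the port list checked in 2.6 -->
  <property><name>fs.defaultFS</name><value>hdfs://localhost:8020</value></property>
</configuration>
EOF

~$cat > $HADOOP_CONF_DIR/hdfs-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Single node, so one replica is enough -->
  <property><name>dfs.replication</name><value>1</value></property>
</configuration>
EOF

~$cat > $HADOOP_CONF_DIR/mapred-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Run MapReduce jobs on YARN -->
  <property><name>mapreduce.framework.name</name><value>yarn</value></property>
</configuration>
EOF

~$cat > $HADOOP_CONF_DIR/yarn-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- Auxiliary shuffle service used by MapReduce on the NodeManager -->
  <property><name>yarn.nodemanager.aux-services</name><value>mapreduce.shuffle</value></property>
  <property><name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name><value>org.apache.hadoop.mapred.ShuffleHandler</value></property>
</configuration>
EOF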



2.6 Startup

~$hdfs namenode -format                                                            

Format the namespace

~$hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode          

Start the namenode

~$hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start secondarynamenode 

Start the secondarynamenode

~$hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode         

Start the datanode

~$yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager                   

Start the resourcemanager

~$yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager                       

Start the nodemanager

~$mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR            

Start the job history server

Wrapping these commands in a script is more convenient; a sketch follows.
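
For example (hypothetical file name start-hadoop.sh, using exactly the commands above):

~$cat > start-hadoop.sh <<'EOF'
#!/bin/sh
# HDFS daemons
hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode
hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start secondarynamenode
hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
# YARN daemons
yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager
# MapReduce job history server
mr-jobhistory-daemon.sh start historyserver --config $HADOOP_CONF_DIR
EOF

~$chmod +x start-hadoop.sh && ./start-hadoop.sh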

Check that the expected ports are listening:

~$for a in 50090 50010 50075 50020 50070 10020 19888  8030 8031 8032 8033 8088 8042 8020; do netstat -nl | grep $a; done

If all of these ports are open, startup succeeded. A variant below prints only the ports that are missing.
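
The same check, reporting any port that is not yet listening instead of echoing matches:

~$for a in 50090 50010 50075 50020 50070 10020 19888 8030 8031 8032 8033 8088 8042 8020; do netstat -nl | grep -q ":$a " || echo "port $a is not listening"; done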

2.7 Compiling and running the C++ wordcount program

2.7.1 Hadoop pipes design:

See this blog post: http://dongxicheng.org/mapreduce/hadoop-pipes-architecture/ (in short, the Java framework launches the C++ executable as a child process and exchanges keys and values with it over a local socket).

2.7.2 hadoop pipes usage:

Usage: hadoop pipes [-conf <path>] [-jobconf <key=value>, <key=value>, ...] [-input <path>] [-output <path>] [-jar <jar file>] [-inputformat <class>] [-map <class>] [-partitioner <class>] [-reduce <class>] [-writer <class>] [-program <executable>] [-reduces <num>]

Option descriptions:

-conf <path>: job configuration file

-jobconf <key=value>, <key=value>, ...: additional job configuration properties

-input <path> / -output <path>: input and output directories on HDFS

-jar <jar file>: jar filename

-inputformat <class>: InputFormat class

-map <class>, -partitioner <class>, -reduce <class>, -writer <class>: Java classes to use in place of the C++ ones

-program <executable>: URI of the executable

-reduces <num>: number of reduce tasks


2.7.3 Compiling and running the C++ wordcount code

Copy the wordcount example code:

~$cp /usr/local/hadoop-0.23.5-src/hadoop-tools/hadoop-pipes/src/main/native/examples/impl/wordcount-simple.cc ~/wordcount.cpp

Create a makefile:

~$touch makefile

~$vim makefile

ADD:

# Headers come from the source tree; the libraries to link live in the
# binary distribution's lib/native directory.
CXX = g++
CPPFLAGS = -m32 -I/usr/local/hadoop-0.23.5-src/hadoop-tools/hadoop-pipes/src/main/native/pipes/api/ -I/usr/local/hadoop-0.23.5-src/hadoop-tools/hadoop-pipes/src/main/native/utils/api/
LDFLAGS = -L/usr/local/hadoop-0.23.5/lib/native
LDLIBS = -lhadooppipes -lhadooputils -lpthread -lcrypto -lssl

wordcount: wordcount.cpp
	libtool --mode=link --tag=CXX $(CXX) $(CPPFLAGS) $(LDFLAGS) -Wall -g -O2 $< $(LDLIBS) -o $@

(The -L and -l options are linker flags, so they are split out of CPPFLAGS and the libraries are placed after the source file; the recipe line must begin with a tab.)

 

~$make

Create directories on HDFS:

~$hdfs dfs -mkdir /user/hadoop/wordcount/bin

~$hdfs dfs -mkdir /user/hadoop/wordcount/input

Upload the executable:

~$hdfs dfs -copyFromLocal wordcount /user/hadoop/wordcount/bin

Upload the input file:

~$hdfs dfs -copyFromLocal word.txt /user/hadoop/wordcount/input
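
Here word.txt is any local text file; if you need one for a quick test, create it before the upload, e.g.:

~$echo "hello world hello hadoop" > word.txt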

Run:

~$hadoop pipes \

 -D hadoop.pipes.java.recordreader=true \

 -D hadoop.pipes.java.recordwriter=true \

 -input /user/hadoop/wordcount/input/word.txt \

 -output /user/hadoop/wordcount/output \

 -program /user/hadoop/wordcount/bin/wordcount

 

This run fails with:

org.apache.hadoop.mapred.pipes.OutputHandler.waitForAuthentication

 

The likely cause is that the pipes library files need to be recompiled locally; the two methods below address this.

2.7.4 A second way to run wordcount

~$cd /usr/local/hadoop-0.23.5-src/hadoop-tools/hadoop-pipes/src

~$cmake .

~$make

This produces the hadooppipes and hadooputils library files, together with the example binary wordcount-simple.

Create the directories (if they do not already exist from 2.7.3):

~$hdfs dfs -mkdir /user/hadoop/wordcount/bin

~$hdfs dfs -mkdir /user/hadoop/wordcount/input

Upload the executable:

~$hdfs dfs -copyFromLocal wordcount-simple /user/hadoop/wordcount/bin

Upload the input file:

~$hdfs dfs -copyFromLocal word.txt /user/hadoop/wordcount/input

Run (if the output directory is left over from an earlier attempt, remove it first with hdfs dfs -rm -r /user/hadoop/wordcount/output):

~$hadoop pipes \

 -D hadoop.pipes.java.recordreader=true \

 -D hadoop.pipes.java.recordwriter=true \

 -input /user/hadoop/wordcount/input/word.txt \

 -output /user/hadoop/wordcount/output \

 -program /user/hadoop/wordcount/bin/wordcount-simple

Success.
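
To inspect the result (the part file name below is the typical default and may differ):

~$hdfs dfs -cat /user/hadoop/wordcount/output/part-00000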

2.7.5 A third way to run wordcount

This reuses the hadooppipes and hadooputils libraries built by the second method.

~$cd /usr/local/hadoop-0.23.5-src/hadoop-tools/hadoop-pipes/src

~$cmake .

~$make

~$cp libhadooppipes.a libhadooputils.a /usr/local/hadoop-0.23.5/lib/native

The remaining steps (make, upload, run) are the same as in the first method.

Success.
