Compiling and Configuring hadoop-2.7.0 on Ubuntu 12.04


First, download the Hadoop source from the official Hadoop website; this article uses hadoop-2.7.0-src.tar.gz. Extract it to the installation directory and read the BUILDING.txt file in the hadoop-2.7.0-src directory.

(1) Preparation

Prepare Java and the other tools listed there. protobuf must be version 2.5 or later; the simplest approach is to download protobuf-2.5.0.tar.gz and install it manually.

Installing protobuf-2.5.0: download protobuf-2.5.0.tar.gz from https://code.google.com/p/protobuf/downloads/list. The site may not be reachable directly from mainland China; if so, search for protobuf-2.5.0.tar.gz on Baidu Cloud and download it there.

Then extract protobuf-2.5.0.tar.gz to a folder of your choice, here /home/liliang/protobuf, and build it:

tar zxvf protobuf-2.5.0.tar.gz
cd protobuf-2.5.0
./configure
sudo make
sudo make install
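
As a quick sanity check (not part of the original steps; it assumes protobuf was installed to the default /usr/local prefix, so the shared-library cache may need refreshing first):

sudo ldconfig        # refresh the shared-library cache
protoc --version     # should print: libprotoc 2.5.0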

(2) Compiling Hadoop 2.7.0

 mvn package -Pdist -DskipTests -Dtar

See BUILDING.txt for the full procedure and the other build options.
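
For reference, BUILDING.txt also documents a variant that additionally compiles the native libraries (libhadoop.so and friends); it needs extra packages such as cmake, zlib1g-dev and libssl-dev installed beforehand:

mvn package -Pdist,native -DskipTests -Dtar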

The build takes quite a while. When it finishes, copy the hadoop-2.7.0 directory produced under hadoop-dist (hadoop-dist/target/hadoop-2.7.0) to the deployment directory, here /root/data_test:

cp -r hadoop-2.7.0/ /root/data_test

(3) Configuring hadoop-2.7.0

The Hadoop nodes are:

10.0.96.51  bigdatatest-1
10.0.96.52  bigdatatest-2

1) Passwordless SSH between the nodes

On bigdatatest-1:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Copy the resulting authorized_keys to bigdatatest-2:

scp .ssh/authorized_keys root@bigdatatest-2:/root/.ssh
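
To confirm that passwordless login works (a quick check, assuming root SSH login is permitted on both hosts), the following should print the remote hostname without prompting for a password:

ssh root@bigdatatest-2 hostname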

2) Set up /etc/hosts on both nodes

#127.0.1.1 bigdatatest-x.localdomain bigdatatest-x
#127.0.0.1 localhost
# The following lines are desirable for IPv6 capable hosts
#::1 ip6-localhost ip6-loopback
#fe00::0 ip6-localnet
#ff00::0 ip6-mcastprefix
#ff02::1 ip6-allnodes
#ff02::2 ip6-allrouters
#ff02::3 ip6-allhosts
10.0.96.51  bigdatatest-1
10.0.96.52  bigdatatest-2
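
To confirm both nodes now resolve the hostnames from /etc/hosts (getent consults the same resolver path the system uses), run on each node:

getent hosts bigdatatest-1 bigdatatest-2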

3) Configure environment variables

Append the following to /etc/profile:

#HADOOP VARIABLES START
export HADOOP_INSTALL=/root/data_test/hadoop-2.7.0
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
#HADOOP VARIABLES END

Apply the changes:

source /etc/profile
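
A quick, optional check that the new variables are visible in the current shell (these commands are illustrative and not part of the original walkthrough):

echo $HADOOP_INSTALL    # should print /root/data_test/hadoop-2.7.0
which hadoop            # should resolve to a path under $HADOOP_INSTALL/bin
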
Then change into Hadoop's configuration directory:
cd data_test/hadoop-2.7.0/etc/hadoop

Make the configuration changes shown below.

Edit hadoop-env.sh (sudo vim hadoop-env.sh):

# The java implementation to use.
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/usr/lib/jvm/java-7-oracle
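
If the JDK lives somewhere else on your machine, one way to discover the path (an illustrative check; the value above assumes the Oracle Java 7 package is installed) is:

readlink -f $(which javac) | sed 's:/bin/javac::'    # prints the JDK home directory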

Edit core-site.xml (fs.default.name still works in 2.7.0, though fs.defaultFS is the preferred, non-deprecated name):

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://bigdatatest-1:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/root/data_test/hadoop-2.7.0/hadoop-${user.name}</value>
    </property>
</configuration>
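
The walkthrough leaves hdfs-site.xml untouched; with only one DataNode (bigdatatest-2) the default replication factor of 3 cannot be met, so optionally lowering it avoids constant under-replication warnings. A minimal, optional sketch (dfs.replication is a standard HDFS property; the value 1 is an assumption for this two-node setup):

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>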

Edit yarn-site.xml:

<configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>bigdatatest-1</value>
    </property>
</configuration>

Edit mapred-site.xml (if it does not exist yet, copy mapred-site.xml.template to mapred-site.xml first):

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
        <final>true</final>
    </property>
</configuration>

Configure the slaves file:

#bigdatatest-1
bigdatatest-2

Sync the Hadoop installation to the other host:

cd /root/data_test
scp -r hadoop-2.7.0/ root@bigdatatest-2:/root/data_test

Format HDFS:

cd ~/data_test/hadoop-2.7.0/
./bin/hdfs namenode -format

Start the Hadoop cluster:

./sbin/start-dfs.sh
./sbin/start-yarn.sh
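
To see which daemons actually came up on each node, jps (shipped with the JDK) lists the running Java processes. With this configuration one would expect NameNode, SecondaryNameNode and ResourceManager on bigdatatest-1, and DataNode and NodeManager on bigdatatest-2, though the exact set depends on the configuration:

jps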


Open http://10.0.96.51:50070 (the NameNode web UI) in a browser to verify that HDFS is up; the YARN ResourceManager web UI listens on port 8088 by default.


(4) Verification

1) Create an input directory on HDFS:
bin/hadoop fs -mkdir -p input

2) Copy the local README.txt into the input directory as the job input:
bin/hadoop fs -copyFromLocal README.txt input
3) Run WordCount from the bundled examples jar (the jar under share/hadoop/mapreduce/sources contains only source files, so use the compiled examples jar instead):
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar wordcount input output
4) When the job finishes, view the WordCount results:

bin/hadoop fs -cat output/*
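
The output directory should contain an empty _SUCCESS marker plus one part-r-* file per reducer; to list them:

bin/hadoop fs -ls output
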
5) Delete the output directory (the older hadoop dfs form is deprecated in favor of hadoop fs):
bin/hadoop fs -rm -r output

