[Big Data - Hadoop] Building Hadoop 2.7.4 on Ubuntu 12.04

Build steps:
I. Download the packages needed for the build (the versions listed below are the ones I used)

1. hadoop-2.7.4-src.tar.gz (the latest stable version is 2.7.4)
2.apache-ant-1.9.4-bin.tar.gz
3.protobuf-2.5.0.tar.gz
4.findbugs-3.0.1.tar.gz
5.apache-maven-3.2.5-bin.tar.gz
6.jdk-7u80-linux-x64.tar.gz

II. Extract the packages and configure the environment variables

# vim /etc/profile
export JAVA_HOME=/cloud/jdk1.7.0_80
export JRE_HOME=/cloud/jdk1.7.0_80/jre
export MAVEN_HOME=/cloud/apache-maven-3.2.5
export MAVEN_OPTS="-Xms256m -Xmx512m"
export ANT_HOME=/cloud/apache-ant-1.9.4
export FINDBUGS_HOME=/cloud/findbugs-3.0.1
export PATH=$PATH:$JAVA_HOME/bin:$MAVEN_HOME/bin:$ANT_HOME/bin:$FINDBUGS_HOME/bin:$JRE_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
# source /etc/profile
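
The profile above assumes that each tarball from section I has already been extracted under /cloud. A minimal preparation sketch is shown below; the apt package list and the /cloud/src extraction path for protobuf are my own assumptions about a typical Ubuntu 12.04 host, and protobuf 2.5.0 is built from source because the Hadoop 2.7.x build expects protoc 2.5.0 on the PATH.

# Build tools commonly needed for the Hadoop build on Ubuntu 12.04 (assumed package set)
apt-get install -y build-essential cmake zlib1g-dev libssl-dev

# Extract the JDK, Maven, Ant and FindBugs tarballs under /cloud
mkdir -p /cloud /cloud/src
tar -zxvf jdk-7u80-linux-x64.tar.gz -C /cloud
tar -zxvf apache-maven-3.2.5-bin.tar.gz -C /cloud
tar -zxvf apache-ant-1.9.4-bin.tar.gz -C /cloud
tar -zxvf findbugs-3.0.1.tar.gz -C /cloud

# Build and install protobuf 2.5.0 (provides the protoc compiler the Hadoop build calls)
tar -zxvf protobuf-2.5.0.tar.gz -C /cloud/src
cd /cloud/src/protobuf-2.5.0
./configure && make && make install
ldconfig

# Verify the toolchain before starting the build
java -version        # should report 1.7.0_80
mvn -version         # should report Apache Maven 3.2.5
protoc --version     # should report libprotoc 2.5.0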

III. Build process

step 1:
# cd /cloud/src/hadoop2.7/hadoop-2.7.4-src
# cd hadoop-maven-plugins/
# mvn install

step 2:
# cd /cloud/src/hadoop2.7/hadoop-2.7.4-src
# mvn package -Pdist -DskipTests -Dtar

Note: this step takes a long time. If it sits there for a long while with no output, press Ctrl+C to interrupt the command and run it again until you see BUILD SUCCESS. If you only want to build the source code rather than debug it, stopping after this step is enough. The generated tarball is at /cloud/src/hadoop2.7/hadoop-2.7.4-src/hadoop-dist/target/hadoop-2.7.4.tar.gz.

step 3:
# cd /cloud/src/hadoop2.7/hadoop-2.7.4-src
# mvn eclipse:eclipse -DskipTests

This step is only useful for reading and debugging the source code; after it finishes, import the generated project files into Eclipse as a Maven project (I have not tried this myself).
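
Once the build reports BUILD SUCCESS, a quick sanity check of the generated tarball can save trouble before the installation in the next section. A minimal sketch, using the output path from step 2; note that -Pdist alone does not compile the native libraries, so checknative will report them as unavailable unless the build was also run with the native profile (e.g. -Pdist,native).

# Unpack the freshly built distribution into a temporary location
tar -zxvf /cloud/src/hadoop2.7/hadoop-2.7.4-src/hadoop-dist/target/hadoop-2.7.4.tar.gz -C /tmp

# Confirm the build identifies itself as 2.7.4
/tmp/hadoop-2.7.4/bin/hadoop version

# Optional: list which native libraries (zlib, snappy, openssl, ...) were compiled in
/tmp/hadoop-2.7.4/bin/hadoop checknative -a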

IV. Installing Hadoop 2.7.4
1. Prepare the software

jdk1.7.0_80
hadoop-2.7.4.tar.gz
zookeeper-3.3.6.tar.gz

2. Extract the packages and set the environment variables (the extraction steps are not repeated here; the profile entries are shown directly below)

# vim /etc/profile
export JAVA_HOME=/cloud/jdk1.7.0_80
export ZOOKEEPER_HOME=/cloud/zookeeper-3.3.6
export HADOOP_HOME=/cloud/hadoop-2.7.4
export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$ZOOKEEPER_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
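
After editing /etc/profile it is worth checking that the new variables are picked up. A small sketch; it assumes hadoop-2.7.4.tar.gz and zookeeper-3.3.6.tar.gz have already been extracted to the /cloud paths referenced above.

# Reload the profile in the current shell
source /etc/profile

# Both commands should resolve from PATH and print version information
java -version
hadoop version

# The Hadoop configuration directory edited in the next steps
ls $HADOOP_HOME/etc/hadoop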

3. Passwordless SSH login

  $ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
  $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  $ chmod 0600 ~/.ssh/authorized_keys
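
The commands above only authorize the key for logins back into the same machine. On a multi-node cluster the public key also has to be pushed to the other hosts, both for day-to-day administration and for the sshfence fencing method configured in hdfs-site.xml below. A sketch, assuming the node1-node5 hostnames defined in /etc/hosts later in this article and root logins:

# Copy node1's public key to every other node (asks for each node's password once)
for host in node2 node3 node4 node5; do
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@"$host"
done

# Verify that the login is now passwordless
ssh root@node2 hostname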

4. Edit the Hadoop 2.7.4 configuration files
hadoop-env.sh
hdfs-site.xml
core-site.xml
yarn-site.xml
slaves
The configuration files I used are shown below.
4.1 Set JAVA_HOME in hadoop-env.sh

# cd /cloud/hadoop-2.7.4/etc/hadoop
# vim hadoop-env.sh
export JAVA_HOME=/cloud/jdk1.7.0_80

4.2 hdfs-site.xml configuration

<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="configuration.xsl"?><!--  Licensed under the Apache License, Version 2.0 (the "License");  you may not use this file except in compliance with the License.  You may obtain a copy of the License at    http://www.apache.org/licenses/LICENSE-2.0  Unless required by applicable law or agreed to in writing, software  distributed under the License is distributed on an "AS IS" BASIS,  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the License for the specific language governing permissions and  limitations under the License. See accompanying LICENSE file.--><!-- Put site-specific property overrides in this file. --><configuration>    <property>        <name>dfs.replication</name>        <value>3</value>    </property>    <!--dfs.nameservice configure mycluster -->    <property>        <name>dfs.nameservices</name>        <value>mycluster</value>    </property>    <!-- dfs.ha.namenodes.[namservice ID] configure unique identifies for each NameNode -->    <property>    <name>dfs.ha.namenodes.mycluster</name>        <value>nn1,nn2</value>    </property>    <!-- dfs.namenode.rpc-address.[nameservice ID].[name node ID]  -->    <property>      <name>dfs.namenode.rpc-address.mycluster.nn1</name>      <value>202.96.64.8:8020</value>    </property>    <property>       <name>dfs.namenode.rpc-address.mycluster.nn2</name>       <value>202.96.64.10:8020</value>    </property>    <!-- dfs.namenode.http-address.[nameservice ID].[name node ID] the fully-qualified HTTP address for each NameNode to listen on  Similarly to rpc-address above -->    <property>      <name>dfs.namenode.http-address.mycluster.nn1</name>      <value>202.96.64.8:50070</value>    </property>    <property>       <name>dfs.namenode.http-address.mycluster.nn2</name>       <value>202.96.64.10:50070</value>    </property>    <!-- the URI which identifies the group of JNs where the NameNodes will write/read edits -->    <property>      <name>dfs.namenode.shared.edits.dir</name>      <value>qjournal://202.96.64.10:8485;202.96.64.12:8485;202.96.64.14:8485/mycluster</value>    </property>    <property>      <name>dfs.client.failover.proxy.provider.mycluster</name>      <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>    </property>    <property>        <name>dfs.ha.fencing.methods</name>        <value>sshfence</value>    </property>    <property>        <name>dfs.ha.fencing.ssh.private-key-files</name>        <value>/root/.ssh/id_rsa</value>    </property>    <property>        <name>dfs.ha.fencing.methods</name>        <value>sshfence</value>    </property>    <property>        <name>dfs.ha.fencing.ssh.connect-timeout</name>        <value>30000</value>    </property>    <property>        <name>dfs.journalnode.edits.dir</name>        <value>/opt/jn/data</value>    </property>    <!-- configure automatic failover -->    <property>        <name>dfs.ha.automatic-failover.enabled</name>        <value>true</value>    </property>    <property>        <name>ha.zookeeper.quorum</name>        <value>202.96.64.8:2181,202.96.64.10:2181,202.96.64.12:2181</value>    </property></configuration>

4.3 core-site.xml configuration

<?xml version="1.0" encoding="UTF-8"?><?xml-stylesheet type="text/xsl" href="configuration.xsl"?><!--  Licensed under the Apache License, Version 2.0 (the "License");  you may not use this file except in compliance with the License.  You may obtain a copy of the License at    http://www.apache.org/licenses/LICENSE-2.0  Unless required by applicable law or agreed to in writing, software  distributed under the License is distributed on an "AS IS" BASIS,  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the License for the specific language governing permissions and  limitations under the License. See accompanying LICENSE file.--><!-- Put site-specific property overrides in this file. --><configuration>    <property>      <name>fs.defaultFS</name>      <value>hdfs://mycluster</value>    </property>    <property>        <name>ha.zookeeper.quorum</name>        <value>202.96.64.8:2181,202.96.64.10:2181,202.96.64.12:2181</value>    </property>    <property>      <name>hadoop.tmp.dir</name>      <value>/opt/hadoop2</value>    </property>    <property>      <name>hadoop.proxyuser.hadoop.hosts</name>      <value>*</value>    </property>    <property>      <name>hadoop.proxyuser.hadoop.groups</name>      <value>*</value>    </property></configuration>

4.4 yarn-site.xml configuration

<?xml version="1.0"?><!--  Licensed under the Apache License, Version 2.0 (the "License");  you may not use this file except in compliance with the License.  You may obtain a copy of the License at    http://www.apache.org/licenses/LICENSE-2.0  Unless required by applicable law or agreed to in writing, software  distributed under the License is distributed on an "AS IS" BASIS,  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the License for the specific language governing permissions and  limitations under the License. See accompanying LICENSE file.--><configuration><!-- Site specific YARN configuration properties --><configuration>    <property>            <name>yarn.resourcemanager.hostname</name>             <value>202.96.64.8</value>    </property>    <property>            <name>yarn.nodemanager.aux-services</name>             <value>mapreduce_shuffle</value>    </property>    <property>            <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>            <value>org.apache.hadoop.mapred.ShuffleHandler</value>    </property></configuration></configuration>

4.5 slaves configuration

202.96.64.10
202.96.64.12
202.96.64.14

4.6 ZooKeeper configuration (conf/zoo.cfg)

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/opt/zookeeper
# the port at which the clients will connect
clientPort=2181
server.1=node1:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888

Note: create the /opt/zookeeper directory yourself, and create the file /opt/zookeeper/myid containing that node's ZooKeeper server number.
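
The same note expressed as commands, a sketch assuming node1, node2 and node3 run the three ZooKeeper servers listed in zoo.cfg:

# Run on every ZooKeeper node; the number written to myid must match
# the server.N entry for that host in zoo.cfg
mkdir -p /opt/zookeeper
echo 1 > /opt/zookeeper/myid     # on node1 (server.1)
# echo 2 > /opt/zookeeper/myid   # on node2 (server.2)
# echo 3 > /opt/zookeeper/myid   # on node3 (server.3)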

5. Initialize the cluster

1. Start the three JournalNodes:
node1# hadoop-daemon.sh start journalnode
node2# hadoop-daemon.sh start journalnode
node3# hadoop-daemon.sh start journalnode
If the JournalNode process appears, it started successfully.

2. Format HDFS on one of the NameNodes (for example node1):
node1# hdfs namenode -format
node1# hadoop-daemon.sh start namenode
(cd ../logs and run tail -n50 hadoop-root-namenode-node1.log to check whether the NameNode started cleanly and whether there are any errors.)
Bootstrap the second NameNode (switch to node2; note: one NameNode must already be running):
node2# hdfs namenode -bootstrapStandby
Verification: check the working directory configured above, /opt/hadoop2, and confirm that files now exist there.

3. Format ZKFC on one of the NameNodes:
(/root/hadoop-2.7.4/bin)
node1# hdfs zkfc -formatZK

4. Stop the daemons started above:
(/root/hadoop-2.7.4/sbin)
node1# stop-dfs.sh
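
Between these steps, jps is a convenient way to see which daemons are actually running on a node; a sketch of what to expect at this point:

# List the Java daemons running on the current node
jps

# What should be visible during this section:
#   JournalNode   on every node where "hadoop-daemon.sh start journalnode" was run (step 1)
#   NameNode      on node1 after step 2; node2 is only bootstrapped here and is
#                 started later by start-all.sh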

6. Start the cluster again

node1# zkServer.sh start
node2# zkServer.sh start
node3# zkServer.sh start
node1# start-all.sh

7. Access the web UI

http://202.96.64.8:50070/
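
Besides the web UI, the HA state can be checked from the command line on any node that has the client configuration; a quick sketch using the nn1/nn2 NameNode IDs defined in hdfs-site.xml:

# Ask which NameNode is currently active and which is standby
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# Summary of live DataNodes and capacity as seen by the active NameNode
hdfs dfsadmin -report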

8. Additional configuration issues to watch for

8.1 In /etc/hosts, comment out the 127.0.0.1 loopback mappings for the node hostnames

node1# vim /etc/hosts
# 127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
# 127.0.0.1 node1
202.96.64.8 node1
202.96.64.10 node2
202.96.64.12 node3
202.96.64.14 node4
202.96.64.16 node5

8.2 Disable the firewall and SELinux.
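
A sketch of that step on Ubuntu 12.04; ufw is Ubuntu's firewall front end, and SELinux is normally not present on Ubuntu, so the SELinux commands only apply if it was installed deliberately:

# Disable the firewall on every node (or explicitly open the Hadoop/ZooKeeper ports instead)
ufw disable
ufw status

# Only if SELinux is actually installed and enforcing:
# setenforce 0                                                          # permissive for the current boot
# sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config   # persist across reboots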