Hadoop 2.2.0 Installation


We download the latest Hadoop 2.2 release directly from the official Apache website. The official site currently only provides 32-bit Linux binaries (download address: http://apache.claz.org/hadoop/common/hadoop-2.2.0/). Because the operating system I chose is CentOS-6.5-x86_64-bin-DVD1.iso, the Hadoop source would need to be compiled into a 64-bit build. I am not yet familiar with the compilation process, so instead I used a 64-bit hadoop-2.4.1 that someone else had already compiled and shared online.

1. Here I set up a cluster of three machines:

IP               Username   Password   Role
172.16.254.222   root       123456     master
172.16.254.228   root       123456     slave
172.16.254.229   root       123456     slave

2. Configure the hostname mappings (run on all three machines):

vi /etc/hosts

172.16.254.222 hadoop1
172.16.254.228 hadoop2
172.16.254.229 hadoop3

Then set the hostname of each machine.

On hadoop1:

vi /etc/sysconfig/network

NETWORKING=yes
HOSTNAME=hadoop1

Restart the network service: service network restart

On hadoop2:

vi /etc/sysconfig/network

NETWORKING=yes
HOSTNAME=hadoop2

Restart the network service: service network restart

On hadoop3:

vi /etc/sysconfig/network

NETWORKING=yes
HOSTNAME=hadoop3

Restart the network service: service network restart
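To confirm that the name resolution works, a quick check is to ping each hostname from every machine and make sure the reply comes from the expected IP, for example:

ping -c 1 hadoop2
ping -c 1 hadoop3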

3. Disable the firewall (run on each of the three machines):

service iptables stop

chkconfig iptables off 
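To double-check that the firewall is stopped and will stay disabled after a reboot, you can run:

service iptables status
chkconfig --list iptables

The first command should report that iptables is not running, and the second should show all runlevels as off.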

4. Set up passwordless SSH login among the three machines:

First go to the user's home directory. In the output of "ls -a" you should see a hidden ".ssh" directory, which is where the keys are stored. If it does not exist yet, it will be created automatically when the keys are generated.

(1) On hadoop1, run:

ssh-keygen -t rsa

Then just press Enter at every prompt.

cd .ssh/

ls -l

cp id_rsa.pub authorized_keys

ls -l

ssh hadoop1

exit

(2) On hadoop2, run:

ssh-keygen -t rsa

Then just press Enter at every prompt.

cd .ssh/

ls -l

cp id_rsa.pub authorized_keys

ls -l

ssh hadoop2

exit

(3) On hadoop3, run:

ssh-keygen -t rsa

Then just press Enter at every prompt.

cd .ssh/

ls -l

cp id_rsa.pub authorized_keys

ls -l

ssh hadoop3

exit

(4) Set up passwordless login between the machines

On hadoop1, run: ssh-copy-id -i hadoop2

On hadoop3, run: ssh-copy-id -i hadoop2

On hadoop2, run the following to push the merged authorized_keys back to the other two machines:

scp /root/.ssh/authorized_keys hadoop1:/root/.ssh/

scp /root/.ssh/authorized_keys hadoop3:/root/.ssh/
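At this point the authorized_keys file on every machine should contain the public keys of all three hosts. A quick way to verify the passwordless setup is to run, from each host:

ssh hadoop1 hostname
ssh hadoop2 hostname
ssh hadoop3 hostname

Each command should print the remote hostname without asking for a password.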

5. Install the JDK (on all three machines)

(1) Uninstall the JDK bundled with the OS

Check whether the bundled JDK is installed: java -version

List the installed JDK packages: rpm -qa | grep java

This will typically return something like:
      java-1.4.2-gcj-compat-1.4.2.0-40jpp.115
      java-1.6.0-openjdk-1.6.0.0-1.7.b09.el5

Uninstall the bundled JDK:

yum -y remove java java-1.4.2-gcj-compat-1.4.2.0-40jpp.115        

yum -y remove  java-1.6.0-openjdk-1.6.0.0-1.7.b09.el5

(2) Install the JDK (the install package is located in /usr/local)

tar -zxvf jdk-8u40-linux-x64.tar.gz

(3) Rename the extracted directory to jdk
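For example (the extracted directory name depends on the JDK build; jdk1.8.0_40 is what jdk-8u40 typically extracts to, so adjust if yours differs):

cd /usr/local
mv jdk1.8.0_40 jdk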

(4) Configure the environment variables

vi /etc/profile

export JAVA_HOME=/usr/local/jdk

export PATH=.:$PATH:$JAVA_HOME/bin

source /etc/profile
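To confirm the new JDK is picked up, you can check:

java -version
echo $JAVA_HOME

java -version should now report the freshly installed JDK (1.8.0_40 in this example) rather than the OpenJDK that was removed.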

6. Upload the pre-built Hadoop package to /usr/local/ on hadoop1 and install Hadoop

(1) Extract the uploaded package and rename the resulting directory to hadoop
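For example, assuming the uploaded package is /usr/local/hadoop-2.4.1.tar.gz and extracts to a directory of the same name (adjust the file name to whatever your pre-built package is actually called):

cd /usr/local
tar -zxvf hadoop-2.4.1.tar.gz
mv hadoop-2.4.1 hadoop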

(2) Go into /usr/local/hadoop/etc/hadoop/ and edit the following files

vi hadoop-env.sh

export JAVA_HOME=/usr/local/jdk


vi yarn-env.sh

export JAVA_HOME=/usr/local/jdk


vi slaves

hadoop2

hadoop3


vi core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop1:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>hadoop.proxyuser.hduser.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hduser.groups</name>
    <value>*</value>
  </property>
</configuration>
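Since hadoop.tmp.dir points to /home/hadoop/tmp, it does no harm to create that directory in advance on every node:

mkdir -p /home/hadoop/tmp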


vi hdfs-site.xml

<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hadoop1:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hduser/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hduser/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
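Likewise, the NameNode and DataNode directories configured above can be created in advance on the relevant nodes:

mkdir -p /home/hduser/dfs/name /home/hduser/dfs/data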


vi mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>hadoop1:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>hadoop1:19888</value>
  </property>
</configuration>
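Note: a stock Hadoop 2.x distribution usually ships this file only as mapred-site.xml.template; if that is the case here, copy it before editing:

cp mapred-site.xml.template mapred-site.xml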


vi yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop1:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>hadoop1:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>hadoop1:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>hadoop1:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>hadoop1:8088</value>
  </property>
</configuration>

(3) Copy the configured hadoop directory to hadoop2 and hadoop3:

scp -r /usr/local/hadoop hadoop2:/usr/local

scp -r /usr/local/hadoop hadoop3:/usr/local

(4) Format the Hadoop cluster (on hadoop1):

cd /usr/local/hadoop/bin

hadoop namenode -format 
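hadoop namenode -format still works in Hadoop 2.x but is marked as deprecated; the equivalent current command is:

./hdfs namenode -format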

(5) Start the Hadoop cluster:

On hadoop1:

cd /usr/local/hadoop/sbin

start-all.sh
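start-all.sh is also deprecated in Hadoop 2.x; an equivalent way is to start HDFS and YARN separately:

./start-dfs.sh
./start-yarn.sh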

7. Check whether the Hadoop cluster started correctly

On hadoop1, the jps command should show the following processes:

NameNode SecondaryNameNode ResourceManager

On hadoop2 and hadoop3, the jps command should show the following processes:

DataNode NodeManager

HDFS web UI: http://172.16.254.222:50070

ResourceManager web UI: http://172.16.254.222:8088
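As a final sanity check, you can run one of the bundled MapReduce examples from /usr/local/hadoop on hadoop1 (the jar version must match your build, hadoop-2.4.1 in this setup):

bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar pi 2 10

If the job finishes and prints an estimate of Pi, both HDFS and YARN are working.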

At this point the entire Hadoop cluster installation is complete.
