CentOS 64-bit: Installing and Configuring Hadoop 2.6.0


[root@master ~]# uname -a

Linux master.hadoop 2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 03:15:09 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

[root@master ~]# cat /etc/issue

CentOS release 6.5 (Final)

Three hosts:

Hostname         Short name   IP
master.hadoop    master       192.168.56.102
slave1.hadoop    slave1       192.168.56.103
slave2.hadoop    slave2       192.168.56.104

 

 

1. Install the JDK and Hadoop

Download the JDK from:

http://download.oracle.com/otn-pub/java/jdk/8u45-b14/jdk-8u45-linux-x64.tar.gz

Extract it to the target directory:

tar -zxvf jdk-8u45-linux-x64.tar.gz -C /opt

 

Build a 64-bit version from the Hadoop source (see the previous article for the build steps). A prebuilt copy is available on Baidu Pan:

http://pan.baidu.com/s/1qWFSigk

 

Extract it to the target directory:

tar -zxvf hadoop-2.6.0-x64.tar.gz -C /opt

 

 

2. Passwordless SSH trust between the hosts

Generate an RSA key pair with ssh-keygen -t rsa; the keys are written to ~/.ssh:

Private key: id_rsa

Public key: id_rsa.pub

Collect the id_rsa.pub contents from all three hosts into ~/.ssh/authorized_keys; the whole procedure can be driven from a single machine:

ssh-keygen -t rsa

cat .ssh/id_rsa.pub > .ssh/authorized_keys

 

ssh 192.168.56.103 "ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"

ssh 192.168.56.103 cat ~/.ssh/id_rsa.pub >> .ssh/authorized_keys

 

ssh 192.168.56.104 "ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"

ssh 192.168.56.104 cat ~/.ssh/id_rsa.pub >> .ssh/authorized_keys

(The remote ssh-keygen is given -N '' and -f explicitly because it has no terminal to prompt on. The >> redirection is evaluated locally, so each remote public key is appended to the local authorized_keys, which is the intent here.)

 

scp .ssh/authorized_keys 192.168.56.103:~/.ssh

scp .ssh/authorized_keys 192.168.56.104:~/.ssh

 

Run the following on every machine:

ssh 192.168.56.102 date

ssh 192.168.56.103 date

ssh 192.168.56.104 date

If each command prints the date without asking for a password, the trust is in place.
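The manual key shuffling above can also be done with ssh-copy-id, which appends the key and sets the file permissions in one step. A minimal sketch, assuming root's password is still accepted on each host and using the IPs from the table above; run the same two commands on each of the three hosts to get full mesh trust:

```shell
# Generate a key pair once on this host (no passphrase), then push the
# public key to every node, including this one.
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
for h in 192.168.56.102 192.168.56.103 192.168.56.104; do
    ssh-copy-id root@"$h"     # prompts for the password once per host
done
```

ssh-copy-id creates ~/.ssh and authorized_keys with the correct modes, which avoids the common pitfall of sshd silently rejecting group-writable files.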

 

3. Configure Hadoop

Set the environment variables:

vi /etc/profile and add the following:

export JAVA_HOME=/opt/jdk1.8.0_45

export HADOOP_HOME=/opt/hadoop-2.6.0

export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

(Note: no space after the = sign, a space after export, and no trailing colon on CLASSPATH or PATH; a trailing colon implicitly adds the current directory to the search path.)

Apply the changes:

source /etc/profile

 

vi /etc/hosts and append the following at the end:

192.168.56.102 master master.hadoop

192.168.56.103 slave1 slave1.hadoop

192.168.56.104 slave2 slave2.hadoop
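Appending by hand works, but re-running the step duplicates entries. A small sketch that adds each mapping only if an identical line is not already present (same entries as above):

```shell
# Append each host mapping to /etc/hosts unless an identical line exists.
while read -r line; do
    grep -qxF "$line" /etc/hosts || echo "$line" >> /etc/hosts
done <<'EOF'
192.168.56.102 master master.hadoop
192.168.56.103 slave1 slave1.hadoop
192.168.56.104 slave2 slave2.hadoop
EOF
```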

 

Change into the Hadoop configuration directory:

cd /opt/hadoop-2.6.0/etc/hadoop/

 

Add the following environment variable at the top of both hadoop-env.sh and yarn-env.sh (this step is required; the daemons do not reliably inherit it from /etc/profile):

export JAVA_HOME=/opt/jdk1.8.0_45
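If you prefer not to edit the two files by hand, the same line can be inserted non-interactively; a sketch assuming GNU sed and the JDK path used above:

```shell
cd /opt/hadoop-2.6.0/etc/hadoop/
for f in hadoop-env.sh yarn-env.sh; do
    # insert the export as the first line of each file
    sed -i '1i export JAVA_HOME=/opt/jdk1.8.0_45' "$f"
done
```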

 

vi core-site.xml

<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master.hadoop:9000</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>4096</value>
    </property>
</configuration>

 

vi hdfs-site.xml

<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///opt/hadoop/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///opt/hadoop/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.nameservices</name>
        <value>hadoop-cluster1</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master.hadoop:50090</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
</configuration>
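Hadoop will usually create the directories named in hadoop.tmp.dir, dfs.namenode.name.dir and dfs.datanode.data.dir on its own, but creating them up front on every node makes permission problems visible before the first format:

```shell
# Create the storage directories referenced by core-site.xml / hdfs-site.xml.
mkdir -p /opt/hadoop/tmp /opt/hadoop/dfs/name /opt/hadoop/dfs/data
```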

 

cp mapred-site.xml.template mapred-site.xml

vi mapred-site.xml 

 

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
        <final>true</final>
    </property>
    <property>
        <name>mapreduce.jobtracker.http.address</name>
        <value>master.hadoop:50030</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master.hadoop:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master.hadoop:19888</value>
    </property>
    <property>
        <name>mapred.job.tracker</name>
        <value>master.hadoop:9001</value>
    </property>
</configuration>

 

vi yarn-site.xml 

<configuration>
<!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master.hadoop</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master.hadoop:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master.hadoop:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master.hadoop:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master.hadoop:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master.hadoop:8088</value>
    </property>
</configuration>

 

vi slaves

slave1.hadoop

slave2.hadoop

 

Repeat the same configuration on the other two hosts.
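"Repeating the setup" on the slaves amounts to copying the same files; with the SSH trust from step 2 in place it can be scripted. A sketch, using the paths and hostnames above:

```shell
# Push the finished configuration to both slaves.
for h in slave1.hadoop slave2.hadoop; do
    scp -r /opt/hadoop-2.6.0/etc/hadoop/ "$h":/opt/hadoop-2.6.0/etc/
    scp /etc/profile /etc/hosts "$h":/etc/
done
```

This assumes the JDK and Hadoop tarballs have already been extracted to /opt on each slave.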

 

4. Disable the firewall on all three machines:

service iptables stop

chkconfig iptables off

At this point the cluster is fully configured. On master.hadoop, format the NameNode and start the cluster:

hdfs namenode -format

/opt/hadoop-2.6.0/sbin/start-all.sh

 

5. Verification

On the master:

[root@master ~]# jps
8464 Jps
7189 ResourceManager
6959 NameNode

 

On slave1 and slave2:

[root@slave1 ~]# jps
3765 Jps
3062 DataNode
3210 NodeManager

 

Finally, open http://192.168.56.102:8088/cluster/nodes in a browser; both slave nodes should be listed.
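jps and the web UI only show that the daemons are up; a quick HDFS round trip confirms the filesystem actually works. A sketch (the path /tmp/smoke is chosen arbitrarily):

```shell
# List live DataNodes, then write a small file and read it back.
hdfs dfsadmin -report
hdfs dfs -mkdir -p /tmp/smoke
echo hello | hdfs dfs -put - /tmp/smoke/hello.txt
hdfs dfs -cat /tmp/smoke/hello.txt     # should print: hello
```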
