CentOS + Hadoop 2.5.1 + HBase 0.98 Cluster Environment Setup


1. Cluster layout

Hadoop processes:

Hostname    Name Node    Resource Manager    Node Manager    Data Node
hadoop0     Y            Y                   N               N
hadoop1     N            N                   Y               Y
hadoop2     N            N                   Y               Y

HBase processes:

Hostname    Master    Zookeeper    Region Server
hadoop0     Y         Y            N
hadoop1     backup    Y            Y
hadoop2     N         Y            Y

Hosts:

Hostname    IP address        User/password
hadoop0     192.168.56.101    hadoop/hadoop
hadoop1     192.168.56.102    hadoop/hadoop
hadoop2     192.168.56.103    hadoop/hadoop

2. Experiment environment:

Windows 7 64-bit host OS + a VirtualBox VM environment + three CentOS 6.5 64-bit virtual machines + jdk-7u67-linux-x64.gz + hadoop-2.5.1.tar.gz + hbase-0.98.6.1-hadoop2-bin.tar.gz

Note: the environment matters a great deal; pay close attention to matching version numbers. Some mismatched combinations may still be compatible, but version mismatches are a common source of problems.


3. Preparation:

① Set the hostnames, IP addresses, and the hadoop user/password on the three CentOS hosts as shown in the tables above;

then edit /etc/hosts on all three hosts so that it reads as follows:

[hadoop@hadoop0 .ssh]$ cat /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.56.101 hadoop0

192.168.56.102 hadoop1

192.168.56.103 hadoop2

Note: to keep iptables and SELinux from interfering, turn them both off:

service iptables stop; setenforce 0
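To keep these settings across a reboot, the two services can also be disabled permanently (a sketch of the standard CentOS 6 way; run as root):

# chkconfig iptables off
# sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config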

② Configure the three hosts so they can SSH into each other without a password:

Using hadoop0 as the example:

[hadoop@hadoop0 ~]$ service sshd start

[hadoop@hadoop0 ~]$ ssh-keygen    (accept all the defaults by just pressing Enter)

This generates a public/private key pair in ~/.ssh:

[hadoop@hadoop0 .ssh]$ ls

id_rsa  id_rsa.pub

Then append the public key to the authorized_keys file (permissions 644):

[hadoop@hadoop0 .ssh]$ cat id_rsa.pub >>authorized_keys

[hadoop@hadoop0 .ssh]$ chmod 644 authorized_keys

Generate key pairs on hadoop1 and hadoop2 in the same way (be sure to do it as the hadoop user), collect all the public keys into a single authorized_keys file, and then copy that file to all three hosts:

[hadoop@hadoop0 .ssh]$ ls

authorized_keys  hadoop1_pub_key  hadoop2_pub_key  id_rsa  id_rsa.pub
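The hadoop1_pub_key and hadoop2_pub_key files above are simply copies of id_rsa.pub from the other two hosts; one way to fetch them, for example (these transfers still prompt for the password, since passwordless SSH is not set up yet):

[hadoop@hadoop0 .ssh]$ scp hadoop@hadoop1:/home/hadoop/.ssh/id_rsa.pub hadoop1_pub_key
[hadoop@hadoop0 .ssh]$ scp hadoop@hadoop2:/home/hadoop/.ssh/id_rsa.pub hadoop2_pub_key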

[hadoop@hadoop0 .ssh]$ cat hadoop1_pub_key >>authorized_keys 

[hadoop@hadoop0 .ssh]$ cat hadoop2_pub_key >>authorized_keys 

[hadoop@hadoop0 .ssh]$ scp authorized_keys hadoop@hadoop1:/home/hadoop/.ssh/

The authenticity of host 'hadoop1 (192.168.56.102)' can't be established.

RSA key fingerprint is 4c:a7:c7:70:a1:d5:c4:be:76:4d:f8:33:5b:99:7f:ac.

Are you sure you want to continue connecting (yes/no)? yes

Warning: Permanently added 'hadoop1,192.168.56.102' (RSA) to the list of known hosts.

hadoop@hadoop1's password: 

authorized_keys                      100% 1188     1.2KB/s   00:00 

[hadoop@hadoop0 .ssh]$ scp authorized_keys hadoop@hadoop2:/home/hadoop/.ssh/

Test by logging in to hadoop1 with ssh:

[hadoop@hadoop0 .ssh]$ ssh hadoop1

Last login: Thu Oct 23 20:28:57 2014 from hadoop0

[hadoop@hadoop1 ~]$ 

The login succeeds without a password, so passwordless SSH is working.

③ Install the same JDK version on all three hosts, and copy the other software packages onto them at the same time:

Again taking hadoop0 as the example, but the same steps are needed on all three hosts:

If a Java package is already installed, it can be removed with: yum erase java

Copy jdk-7u67-linux-x64.gz, hadoop-2.5.1.tar.gz, and hbase-0.98.6.1-hadoop2-bin.tar.gz into /usr/local/, extract them, and create symlinks java, hadoop, and hbase pointing at the extracted directories (see the sketch after the listing):

lrwxrwxrwx. 1 root   root          12 10月 23 20:48 hadoop -> hadoop-2.5.1

drwxr-xr-x. 9 hadoop hadoop      4096 10月 23 20:44 hadoop-2.5.1

-rwxr-x---. 1 hadoop hadoop 148199785 10月 23 20:38 hadoop-2.5.1.tar.gz

lrwxrwxrwx. 1 root   root          22 10月 23 20:49 hbase -> hbase-0.98.6.1-hadoop2

drwxr-xr-x. 7 hadoop hadoop      4096 10月 23 20:44 hbase-0.98.6.1-hadoop2

-rwxr-x---. 1 hadoop hadoop  82107040 10月 23 20:38 hbase-0.98.6.1-hadoop2-bin.tar.gz

lrwxrwxrwx. 1 root   root          11 10月 23 20:52 java -> jdk1.7.0_67

drwxr-xr-x. 8 uucp      143      4096 7月  26 00:51 jdk1.7.0_67
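A minimal sketch of the extract-and-link steps (run as root in /usr/local; the symlinks in the listing above are owned by root, and the link targets match the directory names shown there):

# cd /usr/local
# tar -zxvf jdk-7u67-linux-x64.gz
# tar -zxvf hadoop-2.5.1.tar.gz
# tar -zxvf hbase-0.98.6.1-hadoop2-bin.tar.gz
# ln -s jdk1.7.0_67 java
# ln -s hadoop-2.5.1 hadoop
# ln -s hbase-0.98.6.1-hadoop2 hbase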

Add the Java environment variables at the end of /etc/profile:

export JAVA_HOME=/usr/local/java

export CLASSPATH=.:$JAVA_HOME/lib/tools.jar:$JAVA_HOME/lib/dt.jar

export PATH=$PATH:$JAVA_HOME/bin

Reload the environment variables: $ source /etc/profile

Test: # java -version

java version "1.7.0_67"

Java(TM) SE Runtime Environment (build 1.7.0_67-b01)

Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)

The JDK is installed correctly.

4. Configure, start, and verify Hadoop

(1) As the hadoop user, edit the Hadoop configuration files on all three hosts (since the Hadoop tarball was extracted under /usr/local, all of the configuration files are in /usr/local/hadoop/etc/hadoop/):

In hadoop-env.sh:

export JAVA_HOME=/usr/local/java

In yarn-env.sh:

export JAVA_HOME=/usr/local/java

Make sure the slaves file contains only:

hadoop1

hadoop2

In core-site.xml:

<configuration>

        <property>

                <name>hadoop.tmp.dir</name>

                <value>/home/hadoop/tmp</value>

        </property>

        <property>

                <name>fs.defaultFS</name>

                <value>hdfs://hadoop0:9000</value>

        </property>

        <property>

                <name>io.file.buffer.size</name>

                <value>4096</value>

        </property>

</configuration>

In hdfs-site.xml:

<configuration>

        <property>

                <name>dfs.nameservices</name>

                <value>hadoop-cluster1</value>

        </property>

        <property>

                <name>dfs.namenode.secondary.http-address</name>

                <value>hadoop0:50090</value>

        </property>

        <property>

                <name>dfs.namenode.name.dir</name>

                <value>file:///home/hadoop/dfs/name</value>

        </property>

        <property>

                <name>dfs.datanode.data.dir</name>

                <value>file:///home/hadoop/dfs/data</value>

        </property>

        <property>

                 <name>dfs.replication</name>

                 <value>2</value>

         </property>

         <property>

                 <name>dfs.webhdfs.enabled</name>

                 <value>true</value>

         </property>

</configuration>

In mapred-site.xml:

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
        <property>
                <name>mapreduce.jobtracker.http.address</name>
                <value>hadoop0:50030</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>hadoop0:10020</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>hadoop0:19888</value>
        </property>
</configuration>

In yarn-site.xml:

<configuration>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.resourcemanager.address</name>
                <value>hadoop0:8032</value>
        </property>
        <property>
                <name>yarn.resourcemanager.scheduler.address</name>
                <value>hadoop0:8030</value>
        </property>
        <property>
                <name>yarn.resourcemanager.resource-tracker.address</name>
                <value>hadoop0:8031</value>
        </property>
        <property>
                <name>yarn.resourcemanager.admin.address</name>
                <value>hadoop0:8033</value>
        </property>
        <property>
                <name>yarn.resourcemanager.webapp.address</name>
                <value>hadoop0:8088</value>
        </property>
</configuration>
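If the files are edited only on hadoop0, one way to push the same configuration to the other two nodes is a pair of scp commands (a sketch; it assumes the same installation path on every host and that the hadoop user owns the directories, as in the listing earlier):

[hadoop@hadoop0 ~]$ scp /usr/local/hadoop/etc/hadoop/* hadoop@hadoop1:/usr/local/hadoop/etc/hadoop/
[hadoop@hadoop0 ~]$ scp /usr/local/hadoop/etc/hadoop/* hadoop@hadoop2:/usr/local/hadoop/etc/hadoop/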

(2) Start Hadoop from hadoop0

On hadoop0, as the hadoop user, run the following commands from the /usr/local/hadoop/ directory:

bin/hdfs namenode -format    (formats the HDFS file system)

(Screenshot: the format completes successfully.)

sbin/start-all.sh    (starts the HDFS and YARN daemons)

(Screenshot: output while the daemons start.)
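Note: in Hadoop 2.x, start-all.sh is marked as deprecated; it just calls the following two scripts, which can also be run individually:

sbin/start-dfs.sh
sbin/start-yarn.sh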

 

(3) Test

Status on hadoop0: (screenshot)

Status on hadoop1: (screenshot)

Status on hadoop2: (screenshot)
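Based on the role table in section 1, running jps on each node should show roughly the processes below (a sketch of the expected output; xxxx stands for the process ID, which differs on every run):

[hadoop@hadoop0 ~]$ jps
xxxx NameNode
xxxx SecondaryNameNode
xxxx ResourceManager
xxxx Jps

[hadoop@hadoop1 ~]$ jps
xxxx DataNode
xxxx NodeManager
xxxx Jps

[hadoop@hadoop2 ~]$ jps
xxxx DataNode
xxxx NodeManager
xxxx Jps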

 

Open http://192.168.56.101:8088 in a browser; the YARN web UI shows two active NodeManagers. (screenshot)

Open http://192.168.56.101:50070; the HDFS web UI shows two live DataNodes. (screenshot)

This confirms the Hadoop cluster is up and reachable.

5. Configure, start, and verify HBase running on Hadoop

(1) Edit the HBase configuration files in /usr/local/hbase/conf on all three hosts (the HBase tarball was already extracted under /usr/local earlier):

In hbase-env.sh:

export JAVA_HOME=/usr/local/java

In hbase-site.xml:

<configuration>

        <property>

                <name>hbase.cluster.distributed</name>

                <value>true</value>

        </property>

        <property>

                <name>hbase.rootdir</name>

                <value>hdfs://hadoop0:9000/hbase</value>

         </property>

         <property>

                <name>hbase.zookeeper.property.dataDir</name>

                <value>/home/hadoop/zookeeper</value>

         </property>

         <property>

                <name>hbase.zookeeper.quorum</name>

                <value>hadoop0,hadoop1,hadoop2</value>

        </property>

</configuration>

Make sure the regionservers file contains only:

hadoop1

hadoop2

Create a backup-masters file containing just: hadoop1
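For example:

[hadoop@hadoop0 conf]$ echo hadoop1 > /usr/local/hbase/conf/backup-masters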

(2) Start HBase from hadoop0
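A minimal way to start the whole cluster from /usr/local/hbase (with the default HBASE_MANAGES_ZK=true, this also launches the ZooKeeper quorum configured above):

[hadoop@hadoop0 hbase]$ bin/start-hbase.sh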

 

(Screenshot: output while HBase starts.)

(3) Verify

Status on hadoop0 after startup: (screenshot)

Status on hadoop1 after startup: (screenshot)

Status on hadoop2 after startup: (screenshot)
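In line with the HBase role table in section 1, jps should now additionally show roughly the following on top of the Hadoop processes (again a sketch; xxxx stands for the PID):

[hadoop@hadoop0 ~]$ jps
xxxx HMaster
xxxx HQuorumPeer
...

[hadoop@hadoop1 ~]$ jps
xxxx HMaster          (backup master)
xxxx HQuorumPeer
xxxx HRegionServer
...

[hadoop@hadoop2 ~]$ jps
xxxx HQuorumPeer
xxxx HRegionServer
...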

 

Open http://192.168.56.101:60010 to view the HBase Master status: (screenshot)

Open http://192.168.56.103:60030 to view the RegionServer status on hadoop2: (screenshot)

From the Hadoop directory, run a command to look at the HBase files stored in HDFS (a sketch follows):
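For example, since hbase.rootdir above is hdfs://hadoop0:9000/hbase, the data sits under /hbase and can be listed with:

[hadoop@hadoop0 hadoop]$ bin/hdfs dfs -ls /hbase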

 

 

