Hadoop 2.2.0 + ZooKeeper: High Availability, Fully Distributed


Provided by Youling Studio (幽灵工作室)

For questions, leave a message at: http://weibo.com/youlingR

1. Node preparation

Three nodes:

master 192.168.1.150
NameNode, ResourceManager, DataNode, NodeManager, ZooKeeper, JournalNode, DFSZKFailoverController

slave1 192.168.1.151
NameNode, DataNode, NodeManager, ZooKeeper, JournalNode, DFSZKFailoverController

slave2 192.168.1.152
DataNode, NodeManager, ZooKeeper, JournalNode

 

2. Basic configuration

Configure the hostname and a static IP address on each node.
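The original gives no commands for this step; a rough sketch for CentOS 6 (the OS seen in the ZooKeeper log below), where the interface name, netmask, and gateway are assumptions that must match your own network:

# On master (repeat on slave1/slave2 with their own hostnames and addresses)
sudo hostname master
sudo sed -i 's/^HOSTNAME=.*/HOSTNAME=master/' /etc/sysconfig/network

sudo tee /etc/sysconfig/network-scripts/ifcfg-eth0 >/dev/null <<'EOF'
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.1.150
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
EOF
sudo service network restart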

 

Install the JDK and configure the environment variables (in /etc/profile):

export JAVA_HOME=/usr/java/jdk1.7.0

export PATH=$PATH:$JAVA_HOME/bin

source /etc/profile

Configure passwordless SSH.

Three SSH relationships are needed:

ResourceManager -> NodeManagers (to start the NodeManagers over SSH)

NameNode -> DataNodes (to start the DataNodes over SSH)

NameNode <-> NameNode (needed for HA fencing)

 

SSH for ResourceManager -> NodeManagers and NameNode -> DataNodes (run the following on master as the hadoop user):

cd $HOME/.ssh/
ssh-keygen -t rsa                         # generate a key pair
ssh-copy-id -i ~/.ssh/id_rsa.pub master
ssh-copy-id -i ~/.ssh/id_rsa.pub slave1   # copy the public key to each node
ssh-copy-id -i ~/.ssh/id_rsa.pub slave2
ssh hadoop@slave1                         # test the passwordless login

 

SSH between the NameNodes (run the following on slave1, so the second NameNode can reach master; master -> slave1 is already covered above):

cd $HOME/.ssh/
ssh-keygen -t rsa                         # generate a key pair
ssh-copy-id -i ~/.ssh/id_rsa.pub master

 

 

Configure the hosts file:

sudo vim /etc/hosts   # the hadoop user needs sudo privileges

192.168.1.150   master

192.168.1.151   slave1 

192.168.1.152   slave2

scp /etc/hosts root@slave1:/etc/hosts
scp /etc/hosts root@slave2:/etc/hosts

 

3. ZooKeeper installation

Basic installation:

wget http://apache.claz.org/zookeeper/zookeeper-3.4.5/zookeeper-3.4.5.tar.gz   # download ZooKeeper

sudo mkdir /cloud/                 # create the install directory first
sudo chown hadoop:hadoop /cloud/
tar -zxvf zookeeper-3.4.5.tar.gz -C /cloud/

ZooKeeper configuration (in /cloud/zookeeper-3.4.5/conf/):

cp zoo_sample.cfg zoo.cfg

 

# The number of milliseconds of each tick

tickTime=2000

# The number of ticks that the initial 

# synchronization phase can take

initLimit=10

# The number of ticks that can pass between 

# sending a request and getting an acknowledgement

syncLimit=5

# the directory where the snapshot is stored.

# do not use /tmp for storage, /tmp here is just 

# example sakes.

dataDir=/cloud/zookeeper-3.4.5/data

# the port at which the clients will connect

clientPort=2181

server.1=master:2888:3888

server.2=slave1:2888:3888

server.3=slave2:2888:3888

Copy to the other nodes:

scp -r /cloud/zookeeper-3.4.5/ hadoop@slave1:/cloud/

scp -r /cloud/zookeeper-3.4.5/ hadoop@slave2:/cloud/

Set the server id

Create the dataDir and the myid file on each node; the id must match the server.N entries in zoo.cfg:

mkdir -p /cloud/zookeeper-3.4.5/data           # on every node
echo "1" > /cloud/zookeeper-3.4.5/data/myid    # on master
echo "2" > /cloud/zookeeper-3.4.5/data/myid    # on slave1
echo "3" > /cloud/zookeeper-3.4.5/data/myid    # on slave2

Start

ZK_HOME=/cloud/zookeeper-3.4.5/

$ZK_HOME/bin/zkServer.sh start    # start on every node

jps                               # check processes: every node should show QuorumPeerMain

$ZK_HOME/bin/zkServer.sh status   # one leader, the rest followers
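To check all three nodes in one pass, a small loop like the following can be used (assumes the passwordless SSH configured above and the same install path on every node):

for h in master slave1 slave2; do
  echo "== $h =="
  ssh $h /cloud/zookeeper-3.4.5/bin/zkServer.sh status   # prints "Mode: leader" or "Mode: follower"
done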

Test that ZooKeeper is working:

[hadoop@master zookeeper-3.4.5]$ bin/zkCli.sh 

Connecting to localhost:2181

2014-06-10 01:37:16,763 [myid:] - INFO  [main:Environment@100] - Client environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT

2014-06-10 01:37:16,774 [myid:] - INFO  [main:Environment@100] - Client environment:host.name=master

2014-06-10 01:37:16,775 [myid:] - INFO  [main:Environment@100] - Client environment:java.version=1.7.0

2014-06-10 01:37:16,775 [myid:] - INFO  [main:Environment@100] - Client environment:java.vendor=Oracle Corporation

2014-06-10 01:37:16,776 [myid:] - INFO  [main:Environment@100] - Client environment:java.home=/usr/java/jdk1.7.0/jre

2014-06-10 01:37:16,776 [myid:] - INFO  [main:Environment@100] - Client environment:java.class.path=/cloud/zookeeper-3.4.5/bin/../build/classes:/cloud/zookeeper-3.4.5/bin/../build/lib/*.jar:/cloud/zookeeper-3.4.5/bin/../lib/slf4j-log4j12-1.6.1.jar:/cloud/zookeeper-3.4.5/bin/../lib/slf4j-api-1.6.1.jar:/cloud/zookeeper-3.4.5/bin/../lib/netty-3.2.2.Final.jar:/cloud/zookeeper-3.4.5/bin/../lib/log4j-1.2.15.jar:/cloud/zookeeper-3.4.5/bin/../lib/jline-0.9.94.jar:/cloud/zookeeper-3.4.5/bin/../zookeeper-3.4.5.jar:/cloud/zookeeper-3.4.5/bin/../src/java/lib/*.jar:/cloud/zookeeper-3.4.5/bin/../conf:

2014-06-10 01:37:16,781 [myid:] - INFO  [main:Environment@100] - Client environment:java.library.path=/usr/java/packages/lib/i386:/lib:/usr/lib

2014-06-10 01:37:16,782 [myid:] - INFO  [main:Environment@100] - Client environment:java.io.tmpdir=/tmp

2014-06-10 01:37:16,782 [myid:] - INFO  [main:Environment@100] - Client environment:java.compiler=<NA>

2014-06-10 01:37:16,782 [myid:] - INFO  [main:Environment@100] - Client environment:os.name=Linux

2014-06-10 01:37:16,784 [myid:] - INFO  [main:Environment@100] - Client environment:os.arch=i386

2014-06-10 01:37:16,784 [myid:] - INFO  [main:Environment@100] - Client environment:os.version=2.6.32-358.el6.i686

2014-06-10 01:37:16,785 [myid:] - INFO  [main:Environment@100] - Client environment:user.name=hadoop

2014-06-10 01:37:16,785 [myid:] - INFO  [main:Environment@100] - Client environment:user.home=/home/hadoop

2014-06-10 01:37:16,786 [myid:] - INFO  [main:Environment@100] - Client environment:user.dir=/cloud/zookeeper-3.4.5

2014-06-10 01:37:16,791 [myid:] - INFO  [main:ZooKeeper@438] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@149f041

Welcome to ZooKeeper!

2014-06-10 01:37:16,898 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@966] - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)

2014-06-10 01:37:16,911 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@849] - Socket connection established to localhost/127.0.0.1:2181, initiating session

JLine support is enabled

2014-06-10 01:37:16,992 [myid:] - INFO  [main-SendThread(localhost:2181):ClientCnxn$SendThread@1207] - Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x14684e16e110000, negotiated timeout = 30000

 

WATCHER::

 

WatchedEvent state:SyncConnected type:None path:null

[zk: localhost:2181(CONNECTED) 0] ls /

[zookeeper]

[zk: localhost:2181(CONNECTED) 1] create /rolin rolin 

Created /rolin

[zk: localhost:2181(CONNECTED) 2] ls

ZooKeeper -server host:port cmd args

connect host:port

get path [watch]

ls path [watch]

set path data [version]

rmr path

delquota [-n|-b] path

quit 

printwatches on|off

create [-s] [-e] path data acl

stat path [watch]

close 

ls2 path [watch]

history 

listquota path

setAcl path acl

getAcl path

sync path

redo cmdno

addauth scheme auth

delete path [version]

setquota -n|-b val path

[zk: localhost:2181(CONNECTED) 3] ls /

[rolin, zookeeper]

[zk: localhost:2181(CONNECTED) 4] get /rolin

rolin

cZxid = 0x100000002

ctime = Tue Jun 10 01:37:50 PDT 2014

mZxid = 0x100000002

mtime = Tue Jun 10 01:37:50 PDT 2014

pZxid = 0x100000002

cversion = 0

dataVersion = 0

aclVersion = 0

ephemeralOwner = 0x0

dataLength = 5

numChildren = 0

[zk: localhost:2181(CONNECTED) 5] set /rolin my name is rolin

Command failed: java.lang.NumberFormatException: For input string: "name"

[zk: localhost:2181(CONNECTED) 6] set /rolin youling         

cZxid = 0x100000002

ctime = Tue Jun 10 01:37:50 PDT 2014

mZxid = 0x100000003

mtime = Tue Jun 10 01:38:45 PDT 2014

pZxid = 0x100000002

cversion = 0

dataVersion = 1

aclVersion = 0

ephemeralOwner = 0x0

dataLength = 7

numChildren = 0

[zk: localhost:2181(CONNECTED) 7] get /rolin

youling

cZxid = 0x100000002

ctime = Tue Jun 10 01:37:50 PDT 2014

mZxid = 0x100000003

mtime = Tue Jun 10 01:38:45 PDT 2014

pZxid = 0x100000002

cversion = 0

dataVersion = 1

aclVersion = 0

ephemeralOwner = 0x0

dataLength = 7

numChildren = 0

[zk: localhost:2181(CONNECTED) 8] delete /rolin

[zk: localhost:2181(CONNECTED) 9] ls /

[zookeeper]

[zk: localhost:2181(CONNECTED) 10] quit

Quitting...

2014-06-10 01:39:11,351 [myid:] - INFO  [main:ZooKeeper@684] - Session: 0x14684e16e110000 closed

2014-06-10 01:39:11,351 [myid:] - INFO  [main-EventThread:ClientCnxn$EventThread@509] - EventThread shut down

 

 

4. Hadoop installation

4.1 Extract Hadoop to /cloud

tar -zxvf ~/Downloads/hadoop-2.2.0.tar.gz -C /cloud/

4.2 Edit the Hadoop configuration files

        Six configuration files need to be modified: hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, and slaves.

        They are located in /cloud/hadoop-2.2.0/etc/hadoop/.

4.2.1 File: hadoop-env.sh

        Set the JDK path:

        export JAVA_HOME=/usr/java/jdk1.7.0

4.2.2 File: core-site.xml

        <configuration>     

<property>     

<name>fs.defaultFS</name>     

<value>hdfs://cluster1</value>     

</property>

 

[This is the default HDFS path. There is only one HDFS cluster here, and it is specified at this point; the value must match the nameservice configured in hdfs-site.xml.]

<property>     

<name>hadoop.tmp.dir</name>     

<value>/cloud/hadoop-2.2.0/data</value>    

</property>

 

[By default this path is the common directory where the NameNode, DataNode, JournalNode, etc. store their data; you can also configure a separate directory for each of them. The directory (here /cloud/hadoop-2.2.0/data) has to be created by hand.]

<property>     

<name>ha.zookeeper.quorum</name>     

<value>master:2181,slave1:2181,slave2:2181</value>    

</property>

[Address and port list of the ZooKeeper ensemble. Note: the number of ZooKeeper nodes must be odd and no fewer than three.]

</configuration>

4.2.3 File: hdfs-site.xml

       This is the key file:

         <configuration>

<property>

<name>dfs.nameservices</name>

<value>cluster1</value>

</property>

<property>

<name>dfs.ha.namenodes.cluster1</name>

<value>nn1,nn2</value>

</property>

<!-- RPC address of nn1 -->

<property>

<name>dfs.namenode.rpc-address.cluster1.nn1</name>

<value>master:9000</value>

</property>

<!-- HTTP address of nn1 -->

<property>

<name>dfs.namenode.http-address.cluster1.nn1</name>

<value>master:50070</value>

</property>

<!-- RPC address of nn2 -->

<property>

<name>dfs.namenode.rpc-address.cluster1.nn2</name>

<value>slave1:9000</value>

</property>

<!-- HTTP address of nn2 -->

<property>

<name>dfs.namenode.http-address.cluster1.nn2</name>

<value>slave1:50070</value>

</property>

<!-- Where the NameNode's shared edit log is stored on the JournalNodes -->

<property>

<name>dfs.namenode.shared.edits.dir</name>

<value>qjournal://master:8485;slave1:8485;slave2:8485/cluster1</value>

</property>

<!-- Local disk location where each JournalNode stores its data -->

<property>

<name>dfs.journalnode.edits.dir</name>

<value>/cloud/hadoop-2.2.0/journal</value>

</property>

<!-- Enable automatic NameNode failover -->

<property>

<name>dfs.ha.automatic-failover.enabled</name>

<value>true</value>

</property>

<!-- Failover proxy provider used by clients to find the active NameNode -->

<property>

<name>dfs.client.failover.proxy.provider.cluster1</name>

<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>

</property>

<!-- Fencing methods; to use more than one, put each method on its own line -->

<property>

<name>dfs.ha.fencing.methods</name>

<value>

sshfence

shell(/bin/true)

</value>

</property>

<!-- The sshfence method requires passwordless SSH between the NameNodes -->

<property>

<name>dfs.ha.fencing.ssh.private-key-files</name>

<value>/home/hadoop/.ssh/id_rsa</value>

</property>

<!-- Timeout for the sshfence method -->

<property>

<name>dfs.ha.fencing.ssh.connect-timeout</name>

<value>30000</value>

</property>

</configuration>

4.2.4 File: mapred-site.xml

    <configuration>

       <property>

           <name>mapreduce.framework.name</name>

           <value>yarn</value>

      </property>

     </configuration>

    [Specifies that MapReduce jobs run on YARN; this is one of the differences from Hadoop 1.]
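Note: if etc/hadoop only contains mapred-site.xml.template (as in the stock Hadoop 2.2.0 tarball), copy it first:

cd /cloud/hadoop-2.2.0/etc/hadoop
cp mapred-site.xml.template mapred-site.xml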

4.2.5 File: yarn-site.xml

      <configuration>

<!-- ResourceManager host -->

<property>

<name>yarn.resourcemanager.hostname</name>

<value>master</value>

</property>

<!-- Auxiliary service loaded by the NodeManager: the MapReduce shuffle service -->

<property>

<name>yarn.nodemanager.aux-services</name>

<value>mapreduce_shuffle</value>

</property>

</configuration>

 

4.2.6 File: slaves

      Add the hosts that should run as DataNodes; here all three machines are listed. You can even use every machine in the cluster as a DataNode:

        master

slave1

slave2

 

 

4.3 Configure /etc/profile

Add HADOOP_HOME and put the Hadoop bin and sbin directories on the PATH.
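The original does not show the exact lines; a minimal sketch, assuming the install path used elsewhere in this guide (append to /etc/profile on every node):

export HADOOP_HOME=/cloud/hadoop-2.2.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

source /etc/profile   # reload the profile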

 

4.4 Copy Hadoop to the other nodes

scp -r /cloud/hadoop-2.2.0 hadoop@slave1:/cloud/

scp -r /cloud/hadoop-2.2.0 hadoop@slave2:/cloud/

 

4.5 Start the cluster

4.5.1 Start the ZooKeeper ensemble (start zk on master, slave1, and slave2)

cd /cloud/zookeeper-3.4.5/bin/

./zkServer.sh start

# check status: one leader, two followers

./zkServer.sh status

4.5.2 Start the JournalNodes (run this on master to start them all; note that the script is hadoop-daemons.sh, the plural one with the "s")

cd /cloud/hadoop-2.2.0

sbin/hadoop-daemons.sh start journalnode

# verify with jps: master, slave1, and slave2 each now have a JournalNode process

4.5.3 Format HDFS

# run on master:

hdfs namenode -format

# Formatting generates files under the directory set by hadoop.tmp.dir in core-site.xml, here /cloud/hadoop-2.2.0/data. Copy /cloud/hadoop-2.2.0/data to /cloud/hadoop-2.2.0/ on slave1 so the standby NameNode starts from the same metadata.

scp -r /cloud/hadoop-2.2.0/data/ slave1:/cloud/hadoop-2.2.0/

4.5.4 Format ZK (run on master only)

hdfs zkfc -formatZK
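Formatting creates the HA state for cluster1 in ZooKeeper; a quick check from the ZooKeeper client:

/cloud/zookeeper-3.4.5/bin/zkCli.sh
ls /          # typed inside zkCli; should now list hadoop-ha in addition to zookeeper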

4.5.5 Start HDFS (run on master)

sbin/start-dfs.sh
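A quick way to verify, given the role layout from section 1 (the HTTP addresses are the ones set in hdfs-site.xml):

jps
# master should show: NameNode, DataNode, JournalNode, DFSZKFailoverController, QuorumPeerMain
# slave1 should show: NameNode, DataNode, JournalNode, DFSZKFailoverController, QuorumPeerMain
# slave2 should show: DataNode, JournalNode, QuorumPeerMain

# Web UIs: one NameNode should report "active", the other "standby"
# http://master:50070   and   http://slave1:50070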

 

4.5.6 Start YARN (##### Note #####: run start-yarn.sh on the node where the ResourceManager lives. The NameNode and ResourceManager are often placed on different machines for performance reasons, since both consume a lot of resources; when they are separated, YARN has to be started on the ResourceManager's own machine. In this setup the ResourceManager runs on master, so run the command there.)

sbin/start-yarn.sh
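To confirm, the following checks can be run (a hedged sketch: 8088 is the default ResourceManager web port, and the kill-based test assumes automatic failover is working as configured above):

jps                          # master now also shows ResourceManager; every node shows NodeManager
# ResourceManager web UI: http://master:8088

# Optional HA test: on the active NameNode host, kill the NameNode and watch the standby take over
kill -9 $(jps | awk '$2=="NameNode" {print $1}')
# the other NameNode's 50070 page should switch from "standby" to "active"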
