Hadoop 2.7.3 HA YARN Environment Setup



This article is mainly based on:
http://www.cnblogs.com/captainlucky/p/4654923.html

Prerequisites:
1. You can already set up a Hadoop HA cluster and run it correctly. If you only have a basic fully distributed Hadoop cluster so far, first read this article on setting up HA HDFS:
http://blog.csdn.net/wild46cat/article/details/53506139

OK, here we go.

First, edit the configuration file mapred-site.xml:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>0.0.0.0:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>0.0.0.0:19888</value>
    </property>
</configuration>

Next, yarn-site.xml:

<configuration>
<!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/home/hadoop/yarn/local</value>
    </property>
    <property>
        <name>yarn.nodemanager.log-dirs</name>
        <value>/home/hadoop/yarn/log</value>
    </property>
    <property>
        <name>mapreduce.shuffle.port</name>
        <value>23080</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>myhdfs</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>nn1,nn2</value>
    </property>
    <!-- the id is different on every host -->
    <property>
        <name>yarn.resourcemanager.ha.id</name>
        <value>nn1</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>host1:2181,host2:2181,host3:2181</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.nn1</name>
        <value>host1</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.nn2</name>
        <value>host2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-state-store.address</name>
        <value>host1:2181,host2:2181,host3:2181</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>host1:2181,host2:2181,host3:2181</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
        <value>5000</value>
    </property>
    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>1024</value>
    </property>
    <!-- nn1 -->
    <property>
        <name>yarn.resourcemanager.address.nn1</name>
        <value>host1:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address.nn1</name>
        <value>host1:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.nn1</name>
        <value>host1:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address.nn1</name>
        <value>host1:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address.nn1</name>
        <value>host1:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.admin.address.nn1</name>
        <value>host1:23142</value>
    </property>
    <!-- nn2 -->
    <property>
        <name>yarn.resourcemanager.address.nn2</name>
        <value>host2:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address.nn2</name>
        <value>host2:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.nn2</name>
        <value>host2:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address.nn2</name>
        <value>host2:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address.nn2</name>
        <value>host2:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.admin.address.nn2</name>
        <value>host2:23142</value>
    </property>
    <property>
        <name>yarn.client.failover-proxy-provider</name>
        <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.automatic-failover.zk-base-path</name>
        <value>/yarn-leader-election</value>
    </property>
</configuration>

Note: the original post used the property name yarn.nodemanager.auxservices.mapreduce.shuffle.class; the name Hadoop actually reads is yarn.nodemanager.aux-services.mapreduce_shuffle.class, as written above.



Note that after configuring host1, you need to copy mapred-site.xml and yarn-site.xml to host2 and host3. After copying, update the property in yarn-site.xml that identifies the local node:
<!-- the id is different on every host -->
    <property>
        <name>yarn.resourcemanager.ha.id</name>
        <value>nn1</value>
    </property>
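That per-host edit is easy to script. Below is a minimal sketch of the sed edit, demonstrated on a throwaway copy of the snippet; on a real cluster you would run the same sed command against $HADOOP_CONF_DIR/yarn-site.xml on host2 after copying it over:

```shell
# Demo on a temporary file so nothing real is touched.
conf=$(mktemp)
cat > "$conf" <<'EOF'
    <property>
        <name>yarn.resourcemanager.ha.id</name>
        <value>nn1</value>
    </property>
EOF

# Switch nn1 -> nn2 in the <value> line that follows the ha.id <name> line.
sed -i '/yarn.resourcemanager.ha.id/{n;s/nn1/nn2/}' "$conf"

# Show the <name> line and the now-updated <value> line.
grep -A1 'yarn.resourcemanager.ha.id' "$conf"
```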

Configure mapred-env.sh and yarn-env.sh (omitted here); they should already be set up from configuring an ordinary fully distributed Hadoop cluster.

Now start HDFS and YARN:
start-dfs.sh
start-yarn.sh

To start the standby ResourceManager, log in to the standby host (host2 here) and run:
yarn-daemon.sh start resourcemanager
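With both ResourceManagers up, each one's HA state can be queried from the command line with the standard `yarn rmadmin -getServiceState` command. A small sketch, assuming the yarn CLI is on PATH and the nn1/nn2 rm-ids from yarn-site.xml:

```shell
# Print the HA state ("active" or "standby") of each ResourceManager.
rm_state() {
    yarn rmadmin -getServiceState "$1" 2>/dev/null
}

for id in nn1 nn2; do
    echo "$id: $(rm_state "$id")"
done
```

Exactly one of the two should report "active"; the other reports "standby".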

Check the status of the daemons on each node, for example with jps (the original post showed screenshots here):


Then open the web UI on port 8088:


Then take a look at the standby node.

Now simulate a failure of host1's ResourceManager, for example by killing its process.



Now check the state of host2's ResourceManager again.



You can see that host2 has taken over the ResourceManager role. Now check the web UIs again: host1's port 8088 no longer responds, while host2's port 8088 now serves the cluster UI.
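The HA state is also exposed through the ResourceManager's REST API, which is handy for scripting the same check. A sketch, assuming host2:8088 is reachable; the JSON parsing here is deliberately crude:

```shell
# The cluster-info endpoint reports the HA state of that ResourceManager:
#   curl -s http://host2:8088/ws/v1/cluster/info
# The response JSON contains a field like "haState":"ACTIVE" (or "STANDBY").
ha_state_from_json() {
    grep -o '"haState":"[A-Z]*"' | head -n1 | cut -d'"' -f4
}

# Demo on a canned response (a real run would pipe curl output instead):
sample='{"clusterInfo":{"id":1,"haState":"ACTIVE","state":"STARTED"}}'
echo "$sample" | ha_state_from_json
```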



Problems you may run into:
1. After stop-yarn.sh, running start-yarn.sh again fails, and the log says port 8088 is already in use. Try visiting the web UI; it may indeed still respond, which means an old ResourceManager process is still alive. Find it and kill it:
ps aux | grep -i resourcemanager
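To actually clean it up, the find-and-kill step can be scripted; a sketch using pgrep -f, which matches against the full Java command line (the ResourceManager's command line contains org.apache.hadoop.yarn.server.resourcemanager.ResourceManager):

```shell
# Find a leftover ResourceManager process and kill it.
# Exclude the current shell's pid in case it matches the pattern itself.
pids=$(pgrep -f resourcemanager | grep -vw "$$")
if [ -n "$pids" ]; then
    msg="killing: $pids"
    kill $pids
else
    msg="no stale ResourceManager process found"
fi
echo "$msg"
```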

2. The NodeManager exits on its own shortly after starting. In that case, check this setting:
<property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>1024</value>
    </property>
Make sure the value is at least 1024. If it is lower, the ResourceManager rejects the NodeManager's registration because the node advertises less memory than the minimum container allocation, and the NodeManager then shuts itself down.
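The 1024 threshold comes from yarn.scheduler.minimum-allocation-mb, which defaults to 1024 MB in Hadoop 2.x. If you genuinely want nodes smaller than 1 GB, you can lower the threshold itself instead; a sketch of the relevant setting in yarn-site.xml (512 is just an illustrative value):

    <property>
        <name>yarn.scheduler.minimum-allocation-mb</name>
        <value>512</value>
    </property>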