Hadoop HA Setup

I. Hadoop HA Installation Preparation
1. ZooKeeper cluster nodes:
master
slave1
slave2
Hadoop cluster roles:
master  NameNode1  ResourceManager1  JournalNode1
slave1  NameNode2  ResourceManager2  JournalNode2
slave2  DataNode1
slave3  DataNode2
2. Set the hostname (run on each node with its own name):
hostnamectl set-hostname master    (slave1, slave2, slave3 on the other nodes)
3. Edit /etc/hosts, adding an entry for every node:
192.168.*.*  master
192.168.*.*  slave1
192.168.*.*  slave2
192.168.*.*  slave3
4. Set up passwordless SSH login:
ssh-keygen    (press Enter at every prompt)
ssh-copy-id master    (repeat for slave1, slave2, slave3)
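To confirm the keys were copied, a remote command should run with no password prompt:
ssh slave1 hostname    # should print "slave1" without asking for a password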
5. Install the JDK
Upload the JDK, ZooKeeper, and Hadoop packages to /usr/local with Xftp, then:
tar xzvf jdk*
mv jdk* java    (rename the extracted directory; if the tarball still matches the glob, move it out of the way first)
Edit /etc/profile and add:
export JAVA_HOME=/usr/local/java
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin
Apply the environment variables:
source /etc/profile
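A quick sanity check that the JDK is on the PATH:
java -version    # should report the version just installed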
6. Copy the JDK to the other nodes:
scp -r /usr/local/java slave1:/usr/local    (repeat for slave2 and slave3)
7. Synchronize the clocks:
yum install -y ntp
ntpdate 210.72.145.44
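HA fencing and ZooKeeper sessions are sensitive to clock drift, so it can help to re-sync periodically; a minimal sketch using the same NTP server (hypothetical schedule, added via crontab -e):
*/30 * * * * /usr/sbin/ntpdate 210.72.145.44    # re-sync every 30 minutes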
II. Install ZooKeeper
1. Unpack ZooKeeper:
tar xzvf zookeeper*
mv zookeeper* zookeeper    (rename the extracted directory, as with the JDK)
2. Edit the configuration file:
cd /usr/local/zookeeper/conf
cp zoo_sample.cfg zoo.cfg
vi zoo.cfg
Set dataDir=/usr/local/zookeeper/data
Add the server list, one line per node:
server.1=master:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888
mkdir -p /usr/local/zookeeper/data
echo 1 > /usr/local/zookeeper/data/myid    (use 2 on slave1 and 3 on slave2, matching the server.N entries)
3. Stop the firewall:
systemctl stop firewalld
systemctl disable firewalld
Disable SELinux:
vi /etc/selinux/config
Change SELINUX=enforcing to SELINUX=disabled (takes effect after a reboot; setenforce 0 disables it for the current session)
4. Edit /etc/profile:
export ZOOKEEPER_HOME=/usr/local/zookeeper
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$ZOOKEEPER_HOME/bin
Apply the environment variables:
source /etc/profile
5. Copy the configured ZooKeeper to the other ZooKeeper nodes, then fix each node's myid file:
scp -r /usr/local/zookeeper slave1:/usr/local    (repeat for slave2)
scp /etc/profile slave1:/etc/profile    (repeat for slave2 and slave3, then run source /etc/profile on each)
6. Open port 2181 on any node still running a firewall (unnecessary here, since firewalld was disabled in step 3).
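The myid files can be written remotely over the passwordless SSH set up earlier (a minimal sketch; the node-to-id mapping follows zoo.cfg above):
ssh slave1 "echo 2 > /usr/local/zookeeper/data/myid"
ssh slave2 "echo 3 > /usr/local/zookeeper/data/myid"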

III. Hadoop Cluster Installation
1. Unpack:
tar xzvf hadoop*
mv hadoop* hadoop    (rename the extracted directory)
2. Edit the environment variables:
vi /etc/profile
export HADOOP_HOME=/usr/local/hadoop
#export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib:$HADOOP_PREFIX/lib/native"
export LD_LIBRARY_PATH=$HADOOP_HOME/lib/native
export HADOOP_COMMON_LIB_NATIVE_DIR=/usr/local/hadoop/lib/native
export HADOOP_OPTS="-Djava.library.path=/usr/local/hadoop/lib"
#export HADOOP_ROOT_LOGGER=DEBUG,console
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
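After saving, reload the profile and check that the Hadoop binaries resolve:
source /etc/profile
hadoop version    # should print the Hadoop version and build information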
3. Edit the configuration files:
cd /usr/local/hadoop/etc/hadoop
(1) Edit hadoop-env.sh:
vi hadoop-env.sh
export JAVA_HOME=/usr/local/java
(2) Edit core-site.xml:
<configuration>
<!-- Set the HDFS nameservice to ns1 -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://ns1</value>
</property>
<!-- Hadoop temporary directory -->
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/tmp</value>
</property>
<!-- ZooKeeper quorum address -->
<property>
<name>ha.zookeeper.quorum</name>
<value>master:2181,slave1:2181,slave2:2181</value>
</property>
</configuration>
(3) Edit hdfs-site.xml:
<configuration>
<!-- The HDFS nameservice, ns1; must match core-site.xml -->
<property>
<name>dfs.nameservices</name>
<value>ns1</value>
</property>

<!-- ns1 has two NameNodes, nn1 and nn2 -->
<property>
<name>dfs.ha.namenodes.ns1</name>
<value>nn1,nn2</value>
</property>

<!-- RPC address of nn1 -->
<property>
<name>dfs.namenode.rpc-address.ns1.nn1</name>
<value>master:9000</value>
</property>
<!-- HTTP address of nn1 -->
<property>
<name>dfs.namenode.http-address.ns1.nn1</name>
<value>master:50070</value>
</property>

<!-- RPC address of nn2 -->
<property>
<name>dfs.namenode.rpc-address.ns1.nn2</name>
<value>slave1:9000</value>
</property>
<!-- HTTP address of nn2 -->
<property>
<name>dfs.namenode.http-address.ns1.nn2</name>
<value>slave1:50070</value>
</property>

<!-- Where the NameNode edit log is stored on the JournalNodes -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://master:8485;slave1:8485/ns1</value>
</property>
<!-- Where each JournalNode stores its data on local disk -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/usr/local/hadoop/journal</value>
</property>


<!-- Enable automatic NameNode failover -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>

<!-- How clients locate the active NameNode (failover proxy provider) -->
<property>
<name>dfs.client.failover.proxy.provider.ns1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

<!-- Fencing methods; list multiple methods on separate lines, one per line -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>

<!-- sshfence requires passwordless SSH -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>

<!-- sshfence connection timeout (ms) -->
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
</configuration>
(4) Edit mapred-site.xml:
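In Hadoop 2.x tarballs this file usually ships only as a template, so copy it first (skip if mapred-site.xml already exists):
cp mapred-site.xml.template mapred-site.xml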
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
(5) Edit yarn-site.xml:
<configuration>
   <!-- Enable ResourceManager HA -->
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>

        <!-- RM cluster id -->
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>yrc</value>
</property>

<!-- Logical IDs of the two RMs -->
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>

<!-- Hostname of each RM -->
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>master</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>slave1</value>
</property>

<!-- ZooKeeper quorum address -->
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>master:2181,slave1:2181,slave2:2181</value>
</property>

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
</configuration>
(6) Edit slaves:
slave2
slave3

(7) Create the Hadoop temporary directory configured in core-site.xml:
cd /usr/local/hadoop
mkdir tmp
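The JournalNode edits directory from hdfs-site.xml can be created in the same way (the JournalNode normally creates it on startup, so this is just a precaution):
mkdir -p /usr/local/hadoop/journal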

4. Copy the configured Hadoop to the other nodes:
scp -r /usr/local/hadoop slave1:/usr/local    (repeat for slave2 and slave3)
5. Copy the profile to the other nodes and apply it on each:
scp /etc/profile slave1:/etc/profile    (repeat for slave2 and slave3)
source /etc/profile
IV. Start the ZooKeeper Cluster
On master, slave1, and slave2:
zkServer.sh start      # start ZooKeeper
zkServer.sh status     # check status
zkServer.sh stop       # stop ZooKeeper
zkServer.sh restart    # restart ZooKeeper
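With all three nodes started, the ensemble should have one leader and two followers:
zkServer.sh status    # "Mode: leader" on one node, "Mode: follower" on the other two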
V. Start the JournalNodes on master and slave1:
hadoop-daemon.sh start journalnode
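jps should now list a JournalNode process on both nodes:
jps    # look for a JournalNode entry in the output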
VI. Format HDFS (on master)
1. hdfs namenode -format
2. Format the failover state in ZooKeeper:
hdfs zkfc -formatZK
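The standby NameNode on slave1 needs a copy of the freshly formatted metadata before the cluster starts. A common approach, following the standard HDFS HA procedure, is to start nn1 and bootstrap nn2 from it:
hadoop-daemon.sh start namenode      # on master
hdfs namenode -bootstrapStandby      # on slave1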
VII. Start the Hadoop Cluster on master
start-all.sh
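In Hadoop 2.x, start-all.sh starts a ResourceManager only on the node where it runs, so the standby RM on slave1 is typically started by hand; afterwards the HA state of both pairs can be checked with the ids configured above (nn1/nn2, rm1/rm2):
yarn-daemon.sh start resourcemanager    # on slave1
hdfs haadmin -getServiceState nn1       # one NameNode should report "active", the other "standby"
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2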