Fully distributed (Hadoop & HBase)
Source: Internet  Editor: 程序博客网  Time: 2024/06/08 05:12
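I have been learning Hadoop recently, and this post records what I have worked out so far.
The content was put together from teacher notes from 传智, Baidu searches, and the official documentation, with every step verified in practice.
Versions: hadoop-2.4.1, hbase-1.0.0, zookeeper-3.4.5
Goal: a fully distributed Hadoop and HBase cluster in which the NameNode, ResourceManager, and HMaster are each highly available (HA) with automatic failover.
Environment: VirtualBox 4.3.18, CentOS 6.5 (seven CentOS VMs under VirtualBox, 1 GB of memory each), JDK 1.8.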
Host   | IP            | Installed software     | Running processes
node01 | 192.168.8.101 | jdk, hadoop, hbase     | ResourceManager, NameNode, HMaster, DFSZKFailoverController
node02 | 192.168.8.102 | jdk, hadoop, hbase     | ResourceManager, NameNode, HMaster, DFSZKFailoverController
node03 | 192.168.8.103 | jdk, hadoop, hbase     | NodeManager, DataNode, HRegionServer
node04 | 192.168.8.104 | jdk, hadoop, hbase     | NodeManager, DataNode, HRegionServer
node05 | 192.168.8.105 | jdk, hadoop, zookeeper | NodeManager, DataNode, QuorumPeerMain, JournalNode
node06 | 192.168.8.106 | jdk, hadoop, zookeeper | NodeManager, DataNode, QuorumPeerMain, JournalNode
node07 | 192.168.8.107 | jdk, hadoop, zookeeper | NodeManager, DataNode, QuorumPeerMain, JournalNode
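1. Two NameNodes: one active and one standby; which is which is coordinated through ZooKeeper.
2. Two ResourceManagers: one active and one standby; which is which is coordinated through ZooKeeper.
3. Two HMasters: one active and one standby; which is which is coordinated through ZooKeeper.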
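Configuration:
1. Preliminaries: on each VM, set the hostname, IP, and hosts file, deal with the firewall, and set up passwordless SSH login; install the JDK and configure profile.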
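A minimal sketch of the hosts-file part of the preliminaries, using the host names and IPs from the table above. The helper name hosts_entries is made up for illustration; the firewall and SSH commands in the comments assume a CentOS 6 lab setup running as root (the fencing config later in this post points at /root/.ssh/id_rsa) where switching the firewall off is acceptable.

```shell
# Print the /etc/hosts entries for node01..node07 (192.168.8.101-107).
hosts_entries() {
  local i
  for i in 1 2 3 4 5 6 7; do
    printf '192.168.8.10%d node0%d\n' "$i" "$i"
  done
}

# Show the entries; nothing on the system is changed by this call.
hosts_entries

# To apply on each VM (as root):
#   hosts_entries >> /etc/hosts
# Firewall on CentOS 6 (lab assumption: simply turned off):
#   service iptables stop && chkconfig iptables off
# Passwordless SSH from node01 to every node (interactive, run once):
#   ssh-keygen -t rsa
#   for i in 1 2 3 4 5 6 7; do ssh-copy-id "root@node0$i"; done
```

Because sshfence is configured later, node02 most likely needs the same passwordless access to node01 as node01 has to node02.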
2. ZooKeeper cluster:
2.1 Edit zoo.cfg
dataDir=/software/zookeeper-3.4.5/data
server.1=node05:2888:3888
server.2=node06:2888:3888
server.3=node07:2888:3888
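2.2 Create myid
In ./zookeeper-3.4.5/data on each ZooKeeper machine, create a file named myid that contains that machine's server number from zoo.cfg (1, 2, ... n), e.g. echo 1 > myid on node05.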
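The myid value has to match the N of that host's server.N line, so one way to avoid typing it by hand is to look it up in zoo.cfg. This is only a sketch: zk_myid_for is a hypothetical helper, it assumes short host names without dots (as used in this post), and the conf/zoo.cfg path in the last comment is an assumption about where the file lives.

```shell
# zk_myid_for <zoo.cfg> <hostname>
# Print N for the line "server.N=<hostname>:..." in the given zoo.cfg.
zk_myid_for() {
  awk -F'[.=:]' -v host="$2" '$1 == "server" && $3 == host { print $2 }' "$1"
}

# Demo against a throwaway copy of the settings from 2.1.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
dataDir=/software/zookeeper-3.4.5/data
server.1=node05:2888:3888
server.2=node06:2888:3888
server.3=node07:2888:3888
EOF
zk_myid_for "$cfg" node06   # prints 2
rm -f "$cfg"

# On a real ZooKeeper node (paths are assumptions):
#   zk_myid_for /software/zookeeper-3.4.5/conf/zoo.cfg "$(hostname)" \
#     > /software/zookeeper-3.4.5/data/myid
```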
3. Hadoop cluster:
3.1 Edit hadoop-env.sh
export JAVA_HOME=/software/jdk1.8.0_40
3.2 Edit core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://ns1</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/software/hadoop-2.4.1/tmp</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>node05:2181,node06:2181,node07:2181</value>
</property>
</configuration>
3.3 Edit hdfs-site.xml
<configuration>
<property>
<name>dfs.nameservices</name>
<value>ns1</value>
</property>
<property>
<name>dfs.ha.namenodes.ns1</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn1</name>
<value>node01:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn1</name>
<value>node01:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn2</name>
<value>node02:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn2</name>
<value>node02:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://node05:8485;node06:8485;node07:8485/ns1</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/software/hadoop-2.4.1/journal</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.ns1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
</configuration>
3.4 Edit mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
3.5 Edit yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yrc</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>node01</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>node02</value>
</property>
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>node05:2181,node06:2181,node07:2181</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
3.6 Edit slaves
node03
node04
node05
node06
node07
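The Hadoop settings above only work together if fs.defaultFS in core-site.xml names the same nameservice (ns1) that hdfs-site.xml declares in dfs.nameservices. The following is a rough sanity check, not a real XML parser: get_prop and check_nameservice are hypothetical helpers, and they assume each <name> and <value> sits on its own line as in the listings above (the multi-line dfs.ha.fencing.methods value is not handled).

```shell
# get_prop <file> <property-name>
# Print the <value> that follows <name>property-name</name>.
get_prop() {
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, ""); sub(/<\/value>.*/, ""); print; exit
    }
  ' "$1"
}

# check_nameservice <core-site.xml> <hdfs-site.xml>
# Succeeds when fs.defaultFS is hdfs://<dfs.nameservices>.
check_nameservice() {
  local fs ns
  fs=$(get_prop "$1" fs.defaultFS)
  ns=$(get_prop "$2" dfs.nameservices)
  if [ -n "$ns" ] && [ "$fs" = "hdfs://$ns" ]; then
    echo "OK: fs.defaultFS=$fs matches dfs.nameservices=$ns"
  else
    echo "MISMATCH: fs.defaultFS=$fs, dfs.nameservices=$ns" >&2
    return 1
  fi
}

# Usage (the etc/hadoop location is an assumption about the install layout):
#   cd /software/hadoop-2.4.1/etc/hadoop
#   check_nameservice core-site.xml hdfs-site.xml
```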
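4. HBase cluster:
4.1 Copy Hadoop's hdfs-site.xml and core-site.xml into hbase/conf (I was not sure at first why this is needed; HBase needs them to resolve the HA nameservice ns1 used in hbase.rootdir)
4.2 Edit hbase-env.sh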
export JAVA_HOME=/software/jdk1.8.0_40
export HBASE_MANAGES_ZK=false
4.3 Edit hbase-site.xml
<configuration>
<property>
<name>hbase.zookeeper.quorum</name>
<value>node05:2181,node06:2181,node07:2181</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://ns1/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
</configuration>
4.4 Edit regionservers
node03
node04
5. Copy the installations
Copy the configured hadoop, hbase, and zookeeper directories to the corresponding machines,
e.g. scp -r /software/hbase-1.0.0/ node02:/software/
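The copy step can be looped over the hosts instead of being typed once per machine. This sketch assumes hadoop and hbase were configured on node01 and zookeeper on node05 (the post does not say where the zookeeper directory was prepared); run and push_dir are hypothetical helpers, and DRY_RUN=1 only prints the scp commands.

```shell
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes it.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# push_dir <dir> <host>...
# Copy one directory into /software/ on each listed host.
push_dir() {
  local dir host
  dir=$1
  shift
  for host in "$@"; do
    run scp -r "$dir" "$host:/software/"
  done
}

# Which software goes where follows the host table at the top of the post.
# From node01:
push_dir /software/hadoop-2.4.1 node02 node03 node04 node05 node06 node07
push_dir /software/hbase-1.0.0 node02 node03 node04
# From node05 (assumption):
push_dir /software/zookeeper-3.4.5 node06 node07
```

After copying zookeeper, the myid file on node06 and node07 still has to be changed to that node's own number.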
6. Startup
6.1 Start ZooKeeper
zkServer.sh start (run on each of the three ZooKeeper machines)
6.2 Start the JournalNodes
hadoop-daemon.sh start journalnode (on node05, node06, node07)
6.3 Format HDFS
hdfs namenode -format (run on node01, then copy the files generated under the hadoop.tmp.dir path configured in core-site.xml to node02)
6.4 Format ZK
hdfs zkfc -formatZK (run on node01)
6.5 Start HDFS (run on node01)
start-dfs.sh
6.6 Start YARN (run on node01)
start-yarn.sh
6.7 Start HBase (run on node01)
start-hbase.sh
6.8 Start the standby ResourceManager (run on node02)
yarn-daemon.sh start resourcemanager
6.9 Start the standby HMaster (run on node02)
hbase-daemon.sh start master
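Once the one-time formatting steps (6.3 and 6.4) are done, the remaining start commands can be driven from a single script over SSH. This is a sketch under assumptions rather than a tested launcher: run, on, and start_cluster are hypothetical names, passwordless SSH is expected to be in place, and the Hadoop, HBase, and ZooKeeper scripts are assumed to be on the PATH of non-interactive SSH sessions (if they are only set in profile, full paths may be needed). With DRY_RUN=1 the commands are printed, not executed.

```shell
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes it.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# on <host> <command...>: run a command on a host over SSH.
on() {
  local host
  host=$1
  shift
  run ssh "$host" "$*"
}

# Routine start, in the order of steps 6.1, 6.2, 6.5-6.9.
start_cluster() {
  local h
  for h in node05 node06 node07; do on "$h" zkServer.sh start; done
  for h in node05 node06 node07; do on "$h" hadoop-daemon.sh start journalnode; done
  on node01 start-dfs.sh
  on node01 start-yarn.sh
  on node01 start-hbase.sh
  on node02 yarn-daemon.sh start resourcemanager   # standby ResourceManager
  on node02 hbase-daemon.sh start master           # standby HMaster
}

start_cluster
```

Running it with DRY_RUN=0 executes the same sequence for real; the formatting commands are left out on purpose so that rerunning the script cannot wipe HDFS.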
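7. Verify HA
1. Kill the active ResourceManager, NameNode, and HMaster processes (or reboot the machine), and check that the corresponding standby process automatically becomes active.
2. Restart the killed processes; they should come back as standby (note: start zkfc before the NameNode). At that point the HA configuration is working.
Note: commands to start the zkfc and NameNode processes: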
hadoop-daemon.sh start zkfc
hadoop-daemon.sh start namenode
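Before and after killing a process, the current role of each NameNode and ResourceManager can be queried from the command line with the haadmin and rmadmin tools, using the ids nn1/nn2 and rm1/rm2 defined in hdfs-site.xml and yarn-site.xml. A small sketch (run and ha_states are hypothetical helpers; DRY_RUN=1 only prints the commands, and the HMaster is not covered here):

```shell
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes it.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Ask each NameNode and ResourceManager for its HA state.
# On a live cluster each command reports "active" or "standby".
ha_states() {
  local id
  for id in nn1 nn2; do run hdfs haadmin -getServiceState "$id"; done
  for id in rm1 rm2; do run yarn rmadmin -getServiceState "$id"; done
}

ha_states
```

Running it once before the kill and once after makes the switch from standby to active easy to see.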
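8. Shutdown order
1. stop-hbase.sh
2. stop-dfs.sh
3. stop-yarn.sh (after this script runs, the ResourceManager on node02 is still running; I have not found the cause yet, but running yarn-daemon.sh stop resourcemanager on node02 stops it, which mirrors step 6.8 where it also had to be started by hand)
4. zkServer.sh stop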
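The shutdown order can be scripted the same way as startup, including the extra stop of the ResourceManager on node02. As before this is a sketch: run, on, and stop_cluster are hypothetical names, the stop scripts are assumed to be run on node01 (where the start scripts were run), and they are assumed to be on the PATH of non-interactive SSH sessions. DRY_RUN=1 only prints the commands.

```shell
# DRY_RUN=1 (the default here) only prints each command; DRY_RUN=0 executes it.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# on <host> <command...>: run a command on a host over SSH.
on() {
  local host
  host=$1
  shift
  run ssh "$host" "$*"
}

# Shutdown in the order listed above; ZooKeeper goes last.
stop_cluster() {
  local h
  on node01 stop-hbase.sh
  on node01 stop-dfs.sh
  on node01 stop-yarn.sh
  on node02 yarn-daemon.sh stop resourcemanager   # not stopped by stop-yarn.sh
  for h in node05 node06 node07; do on "$h" zkServer.sh stop; done
}

stop_cluster
```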
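9. Further questions
A single nameservice has an upper limit on the metadata it can hold, which caps how far the whole cluster can scale. How is Federation configured and used? I look forward to a configuration tutorial from someone more experienced.
A single ResourceManager also has a limit on the cluster size (or the number of parallel jobs) it can manage; what is the solution for that?
10. More
As a Linux beginner, I would welcome a shell script that starts the whole cluster in one step.
While reading up, I found that HBase has a JDK compatibility caveat: the 1.0 release had not been thoroughly tested on JDK 8. This is a reminder that for a real production deployment you should check the official documentation and make sure the hadoop, hbase, zookeeper, and JDK versions are mutually compatible and well tested.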