Building a hadoop2.7.1 + spark1.7 high-availability cluster with docker on a single machine
Pull the ubuntu image
sudo docker pull ubuntu
Download spark1.7, hadoop2.7.1, scala1.1, zookeeper3.4.6 and jdk1.8, unpack them, and place them in a local folder that will be mounted into the containers.
In that folder, also create the files authorized_keys
and hosts.
This walkthrough uses /home/docker/config as the folder.
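For concreteness, the host-side preparation might look like this (the archive names are assumptions; use whatever versions you actually downloaded):
sudo mkdir -p /home/docker/config
cd /home/docker/config
# unpack the downloaded archives here, e.g. hadoop-2.7.1/, spark/, scala/, zookeeper-3.4.6/, jdk/
touch authorized_keys   # will collect the containers' public keys
touch hosts             # will map fixed container IPs to hostnames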
Start the container
sudo docker run --name installspark -v /home/docker/config/:/config -it ubuntu:14.04
Installation
After the container starts, the installation files placed earlier are visible inside it under /config.
Install jdk and scala:
Append to ~/.bashrc:
/usr/sbin/sshd
cat /config/hosts > /etc/hosts
cat /config/authorized_keys > /root/.ssh/authorized_keys
export JAVA_HOME=/usr/lib/jvm/java-8-sun
export PATH=${JAVA_HOME}/bin:$PATH
export HADOOP_HOME=/opt/hadoop
export PATH=${HADOOP_HOME}/bin:$PATH
export SCALA_HOME=/opt/scala
export PATH=${SCALA_HOME}/bin:$PATH
export SPARK_HOME=/opt/spark
export PATH=${SPARK_HOME}/bin:$PATH
Copy spark, hadoop and zookeeper to /opt, as sketched below.
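A minimal sketch of this copy step, assuming the archives were unpacked under /config with the names shown (the exact directory names are assumptions):
# JDK and Scala go to the paths referenced in ~/.bashrc above
mkdir -p /usr/lib/jvm
cp -r /config/jdk1.8.0 /usr/lib/jvm/java-8-sun
cp -r /config/scala /opt/scala
# Hadoop, Spark and ZooKeeper go under /opt
cp -r /config/hadoop-2.7.1 /opt/hadoop
cp -r /config/spark /opt/spark
cp -r /config/zookeeper-3.4.6 /opt/zookeeper
source ~/.bashrc   # pick up the new environment variables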
Install hadoop:
Create the directories /opt/hadoop/namenode, /opt/hadoop/datanode, /opt/hadoop/tmp and /opt/hadoop/journal.
root@nn1:/opt/hadoop/etc/hadoop# vim hadoop-env.sh
Modify:
export JAVA_HOME=/usr/lib/jvm/java-8-sun
root@nn1:/opt/hadoop/etc/hadoop# vim core-site.xml
Add:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://ns1</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/hadoop/tmp</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>dnzk1:2181,dnzk2:2181,dnzk3:2181</value>
</property>
root@nn1:/opt/hadoop/etc/hadoop# vim hdfs-site.xml
Add:
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/opt/hadoop/datanode</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/opt/hadoop/namenode</value>
</property>
<property>
  <name>dfs.nameservices</name>
  <value>ns1</value>
</property>
<property>
  <name>dfs.ha.namenodes.ns1</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns1.nn1</name>
  <value>nn1:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.ns1.nn1</name>
  <value>nn1:50070</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.ns1.nn2</name>
  <value>nn2:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.ns1.nn2</name>
  <value>nn2:50070</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://dnzk1:8485;dnzk2:8485;dnzk3:8485/ns1</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/opt/hadoop/journal</value>
</property>
<property>
  <name>dfs.journalnode.http-address</name>
  <value>0.0.0.0:8480</value>
</property>
<property>
  <name>dfs.journalnode.rpc-address</name>
  <value>0.0.0.0:8485</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.ns1</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>
    sshfence
    shell(/bin/true)
  </value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/root/.ssh/id_rsa</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.connect-timeout</name>
  <value>30000</value>
</property>
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>
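Once the image is built, a quick way to confirm these settings are being picked up (a sanity check added here, not part of the original steps):
/opt/hadoop/bin/hdfs getconf -confKey dfs.nameservices   # should print ns1
/opt/hadoop/bin/hdfs getconf -namenodes                  # should print nn1 nn2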
vim mapred-site.xml
Add:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
root@nn1:/opt/hadoop# vim /opt/hadoop/etc/hadoop/yarn-site.xml
Add:
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>RM_HA_ID</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>rm1</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.recovery.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>dnzk1:2181,dnzk2:2181,dnzk3:2181</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
root@nn1:/opt/hadoop# vim /opt/hadoop/etc/hadoop/slaves
Add:
dnzk1
dnzk2
dnzk3
Install spark
root@nn1:/opt/spark/conf# vim spark-env.sh
Add:
export SPARK_MASTER_IP=nn1
export SPARK_WORKER_MEMORY=256m
export JAVA_HOME=/usr/lib/jvm/java-8-sun
export SCALA_HOME=/opt/scala
export SPARK_HOME=/opt/spark
export HADOOP_CONF_DIR=/opt/hadoop/etc/hadoop
export SPARK_LIBRARY_PATH=$SPARK_HOME/lib
export SCALA_LIBRARY_PATH=$SPARK_LIBRARY_PATH
export SPARK_WORKER_CORES=1
export SPARK_WORKER_INSTANCES=1
export SPARK_MASTER_PORT=7077
root@nn1:/opt/spark/conf# vim slaves
Add:
nn1
nn2
rm1
rm2
dnzk1
dnzk2
dnzk3
Install zookeeper
Create the directory /opt/zookeeper/tmp
Create the file /opt/zookeeper/tmp/myid:
echo 1 > /opt/zookeeper/tmp/myid
root@nn1:/opt/zookeeper/conf# vim zoo.cfg
Modify:
dataDir=/opt/zookeeper/tmp
server.1=dnzk1:2888:3888
server.2=dnzk2:2888:3888
server.3=dnzk3:2888:3888
Generate an SSH key
ssh-keygen -t dsa
Append id_dsa.pub to the host's /home/docker/config/authorized_keys file:
root@nn1:/opt/hadoop# cat ~/.ssh/id_dsa.pub
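Note that dfs.ha.fencing.ssh.private-key-files above points at /root/.ssh/id_rsa while this generates a DSA key pair; either change that property to /root/.ssh/id_dsa or generate an RSA key instead, otherwise sshfence will always fall through to the shell(/bin/true) fallback. Once the containers below are running with the shared authorized_keys, passwordless login can be verified with something like:
ssh -o StrictHostKeyChecking=no nn2 hostname   # should print nn2 without asking for a password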
Commit the configured container as an image (the tag matches what the run commands below expect):
sudo docker commit -m "namenode1" installspark spark1_7-hadoop2_7_1-scala1_1:basic
Edit /home/docker/config/hosts on the host machine
Add:
172.17.0.11 nn1
172.17.0.12 nn2
172.17.0.13 rm1
172.17.0.14 rm2
172.17.0.15 dnzk1
172.17.0.16 dnzk2
172.17.0.17 dnzk3
Start the containers
sudo docker run --name dnzk1 -h dnzk1 --net=none -p 2185:2181 -p 50075:50070 -p 9005:9000 -p 8485:8485 -p 7075:7077 -p 2885:2888 -v /home/docker/config/:/config -it spark1_7-hadoop2_7_1-scala1_1:basic
sudo docker run --name dnzk2 -h dnzk2 --net=none -p 2186:2181 -p 50076:50070 -p 9006:9000 -p 8486:8485 -p 7076:7077 -p 2886:2888 -v /home/docker/config/:/config -it spark1_7-hadoop2_7_1-scala1_1:basic
sudo docker run --name dnzk3 -h dnzk3 --net=none -p 2187:2181 -p 50077:50070 -p 9007:9000 -p 8487:8485 -p 7077:7077 -p 2887:2888 -v /home/docker/config/:/config -it spark1_7-hadoop2_7_1-scala1_1:basic
sudo docker run --name nn1 -h nn1 --net=none -p 2181:2181 -p 50071:50070 -p 9001:9000 -p 8481:8485 -p 7071:7077 -p 2881:2888 -v /home/docker/config/:/config -it spark1_7-hadoop2_7_1-scala1_1:basic
sudo docker run --name nn2 -h nn2 --net=none -p 2182:2181 -p 50072:50070 -p 9002:9000 -p 8482:8485 -p 7072:7077 -p 2882:2888 -v /home/docker/config/:/config -it spark1_7-hadoop2_7_1-scala1_1:basic
sudo docker run --name rm1 -h rm1 --net=none -p 2183:2181 -p 50073:50070 -p 9003:9000 -p 8483:8485 -p 7073:7077 -p 2883:2888 -v /home/docker/config/:/config -it spark1_7-hadoop2_7_1-scala1_1:basic
sudo docker run --name rm2 -h rm2 --net=none -p 2184:2181 -p 50074:50070 -p 9004:9000 -p 8484:8485 -p 7074:7077 -p 2884:2888 -v /home/docker/config/:/config -it spark1_7-hadoop2_7_1-scala1_1:basic
On dnzk2 run echo 2 > /opt/zookeeper/tmp/myid, and on dnzk3 run echo 3 > /opt/zookeeper/tmp/myid.
Configure the network
sudo pipework docker0 -i eth0 nn1 172.17.0.11/16
sudo pipework docker0 -i eth0 nn2 172.17.0.12/16
sudo pipework docker0 -i eth0 rm1 172.17.0.13/16
sudo pipework docker0 -i eth0 rm2 172.17.0.14/16
sudo pipework docker0 -i eth0 dnzk1 172.17.0.15/16
sudo pipework docker0 -i eth0 dnzk2 172.17.0.16/16
sudo pipework docker0 -i eth0 dnzk3 172.17.0.17/16
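pipework is a standalone script from https://github.com/jpetazzo/pipework; if it is not already on the host, installing it looks roughly like this (the download path is an assumption):
sudo curl -o /usr/local/bin/pipework https://raw.githubusercontent.com/jpetazzo/pipework/master/pipework
sudo chmod +x /usr/local/bin/pipework
After assigning the addresses, a quick connectivity check from inside any container, e.g. on nn1:
ping -c 1 172.17.0.15   # dnzk1 should answer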
On dnzk1/dnzk2/dnzk3, start zookeeper and the hadoop journalnodes, then bring up the hadoop cluster:
/opt/zookeeper/bin/zkServer.sh start
/opt/hadoop/sbin/hadoop-daemon.sh start journalnode
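To confirm the quorum and the JournalNodes came up (a sanity check, not part of the original sequence):
/opt/zookeeper/bin/zkServer.sh status   # one node reports Mode: leader, the other two Mode: follower
jps                                     # should list QuorumPeerMain and JournalNode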
On nn1, format HDFS and the ZooKeeper failover state:
/opt/hadoop/bin/hdfs namenode -format
Copy the formatted namenode metadata to nn2:
scp -r /opt/hadoop/namenode/ nn2:/opt/hadoop/
or, on nn2, run:
/opt/hadoop/bin/hdfs namenode -bootstrapStandby
/opt/hadoop/bin/hdfs zkfc -formatZK
/opt/hadoop/sbin/start-dfs.sh
On rm1, start yarn:
/opt/hadoop/sbin/start-yarn.sh
On rm2, start the standby resourcemanager:
/opt/hadoop/sbin/yarn-daemon.sh start resourcemanager
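At this point the HA state of each role can be queried; the service IDs follow the configs above:
/opt/hadoop/bin/hdfs haadmin -getServiceState nn1   # active
/opt/hadoop/bin/hdfs haadmin -getServiceState nn2   # standby
/opt/hadoop/bin/yarn rmadmin -getServiceState rm1   # active
/opt/hadoop/bin/yarn rmadmin -getServiceState rm2   # standby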
Start spark
/opt/spark/sbin/start-all.sh
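A quick smoke test is to submit the bundled SparkPi example to the standalone master (the examples-jar path varies by Spark build, so treat it as an assumption):
/opt/spark/bin/spark-submit --master spark://nn1:7077 --class org.apache.spark.examples.SparkPi /opt/spark/lib/spark-examples-*.jar 10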
Check:
http://172.17.0.11:50070 (active)
http://172.17.0.12:50070 (standby)
Cluster services after startup:
host   IP            installed                running processes
nn1    172.17.0.11   jdk, hadoop              NameNode, DFSZKFailoverController (zkfc)
nn2    172.17.0.12   jdk, hadoop              NameNode, DFSZKFailoverController (zkfc)
rm1    172.17.0.13   jdk, hadoop              ResourceManager
rm2    172.17.0.14   jdk, hadoop              ResourceManager
dnzk1  172.17.0.15   jdk, hadoop, zookeeper   DataNode, NodeManager, JournalNode, QuorumPeerMain
dnzk2  172.17.0.16   jdk, hadoop, zookeeper   DataNode, NodeManager, JournalNode, QuorumPeerMain
dnzk3  172.17.0.17   jdk, hadoop, zookeeper   DataNode, NodeManager, JournalNode, QuorumPeerMain
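Running jps inside each container should show the processes listed above, plus the Spark Master (on nn1) and Worker processes, since the spark slaves file lists all seven nodes. For example, on dnzk1:
jps   # expect DataNode, NodeManager, JournalNode, QuorumPeerMain and Worker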