Setting Up Spark on a Docker-Based Hadoop Cluster
Install a Hadoop Cluster
1. Run a docker container within boot2docker vm
docker@default:~$ docker run -it -v /c/Users/liming.zhu/boot2dockerShareFolder:/hostFolder --name hadoopYarn ubuntu:14.04
2. Install java within the docker container
mkdir -p /usr/local/JDK
tar zxvf /hostFolder/jdk-8u20-linux-x64.tar.gz -C /usr/local/JDK
3. Export java environment variable
vi /root/.bashrc
export JAVA_HOME=/usr/local/JDK/jdk1.8.0_20/
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
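Appending exports to .bashrc by hand duplicates lines if the setup is re-run. A minimal sketch of an idempotent append (the `append_once` helper name is mine, not part of the article):

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
# append_once <line> <file>  -- helper name is illustrative, not from the article.
append_once() {
    line=$1
    file=$2
    touch "$file"
    grep -qxF "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example: the exports from step 3 (the real target would be /root/.bashrc):
# append_once 'export JAVA_HOME=/usr/local/JDK/jdk1.8.0_20/' /root/.bashrc
# append_once 'PATH=$PATH:$HOME/bin:$JAVA_HOME/bin' /root/.bashrc
```

The same pattern fits the Hadoop exports in step 7.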
4. Install Hadoop
mkdir -p /usr/local/hadoop
tar zxvf /hostFolder/hadoop-2.7.1.tar.gz -C /usr/local/hadoop
5. Install ssh
apt-get install ssh
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cd ~/.ssh/
cat id_rsa.pub >> authorized_keys
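The key-generation and append steps above are not idempotent: each run adds another copy of the public key. A hedged sketch that only generates and authorizes the key when needed (the `setup_ssh_key` function name is an assumption for illustration):

```shell
#!/bin/sh
# Generate an RSA key pair if absent and authorize it for passwordless ssh.
# setup_ssh_key [keyfile] [authorized_keys]  -- names are illustrative.
setup_ssh_key() {
    keyfile=${1:-$HOME/.ssh/id_rsa}
    auth=${2:-$HOME/.ssh/authorized_keys}
    mkdir -p "$(dirname "$keyfile")"
    # Only generate a key pair when one does not exist yet.
    [ -f "$keyfile" ] || ssh-keygen -t rsa -P '' -f "$keyfile"
    # Append the public key only if it is not already authorized.
    touch "$auth"
    grep -qxF "$(cat "$keyfile.pub")" "$auth" || cat "$keyfile.pub" >> "$auth"
    chmod 600 "$auth"
}
```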
6. Auto start sshd
vi ~/.bashrc
#autorun
/usr/sbin/sshd
7. Export Hadoop environment variables
vi /root/.bashrc
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.7.1
export HADOOP_CONFIG_HOME=$HADOOP_HOME/etc/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
8. Create Hadoop work dirs
mkdir -p /usr/local/hadoop/work/tmp /usr/local/hadoop/work/namenode /usr/local/hadoop/work/datanode
9. Create core-site.xml
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/work/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master/</value>
<final>true</final>
</property>
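These property elements belong inside a <configuration> root element in $HADOOP_CONFIG_HOME/core-site.xml. A sketch that writes the complete file non-interactively; the target directory is a parameter (the `write_core_site` helper name is mine), and the same heredoc pattern works for hdfs-site.xml and yarn-site.xml:

```shell
#!/bin/sh
# Write a complete core-site.xml containing the two properties from step 9.
# write_core_site [conf-dir]  -- helper name is illustrative.
write_core_site() {
    conf=${1:-/usr/local/hadoop/hadoop-2.7.1/etc/hadoop}
    cat > "$conf/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/work/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master/</value>
    <final>true</final>
  </property>
</configuration>
EOF
}
```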
10. Create hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>2</value>
<final>true</final>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified at create time.
</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/hadoop/work/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hadoop/work/datanode</value>
</property>
11. Edit the yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/usr/local/hadoop/work/datanode</value>
</property>
12. Format name node
hadoop namenode -format
13. Commit the container
docker commit -m "hadoop install" 1cecd529ddc5 ubuntu:hadoop2
14. Run 3 containers
docker run -it -h master --name master3 -v /c/Users/liming.zhu/boot2dockerShareFolder/:/hostFolder -p 8088:8088 ubuntu:hadoop2
docker run -it -h salve3 --name salve3 -v /c/Users/liming.zhu/boot2dockerShareFolder/:/hostFolder ubuntu:hadoop2
docker run -it -h salve4 --name salve4 -v /c/Users/liming.zhu/boot2dockerShareFolder/:/hostFolder ubuntu:hadoop2
15. Modify slaves file
cd /usr/local/hadoop/hadoop-2.7.1/etc/hadoop
vi slaves
salve3
salve4
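The slaves file is just one worker hostname per line (the hostnames must match the -h values from step 14), so it can also be written non-interactively. A small sketch, with the `write_slaves` helper name being mine:

```shell
#!/bin/sh
# Write the Hadoop slaves file: one worker hostname per line.
# write_slaves [conf-dir]  -- helper name is illustrative.
write_slaves() {
    conf=${1:-/usr/local/hadoop/hadoop-2.7.1/etc/hadoop}
    printf '%s\n' salve3 salve4 > "$conf/slaves"
}
```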
16. Change hosts file in all the containers
172.17.0.19 salve4
172.17.0.18 master
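The container IPs above depend on how Docker assigned addresses; each container's IP can be read on the Docker host with `docker inspect`. Appending the entries can also be made idempotent; the `add_host_entry` helper below is my own sketch, not from the article:

```shell
#!/bin/sh
# Append "ip hostname" to a hosts file unless the hostname is already listed.
# add_host_entry <ip> <hostname> [hosts-file]  -- helper name is illustrative.
add_host_entry() {
    ip=$1; name=$2; hosts=${3:-/etc/hosts}
    grep -qw "$name" "$hosts" 2>/dev/null || printf '%s %s\n' "$ip" "$name" >> "$hosts"
}

# On the Docker host, look up each container's IP first, e.g.:
#   docker inspect -f '{{ .NetworkSettings.IPAddress }}' master3
# then inside every container:
# add_host_entry 172.17.0.18 master
# add_host_entry 172.17.0.19 salve4
```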
Install Spark
1. Copy spark tar to user home folder
mkdir -p ~/spark
cp /hostFolder/spark-1.4.1-bin-without-hadoop.tgz ~/spark
cd ~/spark
tar zxvf spark-1.4.1-bin-without-hadoop.tgz
mv spark-1.4.1-bin-without-hadoop spark1.4.1
cd spark1.4.1
2. Modify yarn-site.xml in order to increase the container memory
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>4.2</value>
</property>
3. Config spark-yarn variable
vi conf/spark-env.sh
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
export YARN_CONF_DIR=/usr/local/hadoop/hadoop-2.7.1/etc/hadoop
4. Launch spark-shell
bin/spark-shell --master yarn-client
5. Check the web ui
http://192.168.59.103:8088/