Building Docker images with a docker-compose.yml file --- pseudo-distributed Hadoop
I. Strictly speaking, a Hadoop cluster is not an ideal fit for Docker: Docker encourages a 1:1 relationship between containers and services, while Hadoop expects the DataNode and the NodeManager to run on the same node (i.e. in the same container). Still, once Docker Swarm is in play, it is worth considering putting each Hadoop service into its own container.
II. Building the base image for the Hadoop cluster
1. The following files are needed to build the base image:
├── build.sh
├── Dockerfile
├── entrypoint.sh
├── hadoop-2.7.3.tar.gz
└── jdk-8u92-linux-x64.tar.gz
2. Note: hadoop-2.7.3.tar.gz and jdk-8u92-linux-x64.tar.gz must be downloaded in advance and placed in this directory.
3. The Dockerfile:
FROM debian:jessie-backports

RUN apt-get update \
#    && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends openjdk-8-jdk \
    && rm -rf /var/lib/apt/lists/*

ENV JAVA_VERSION jdk1.8.0_92
COPY jdk-8u92-linux-x64.tar.gz /opt
RUN tar -zxvf /opt/jdk-8u92-linux-x64.tar.gz -C /opt
ENV JAVA_HOME=/opt/$JAVA_VERSION/

RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends net-tools curl

RUN gpg --keyserver pool.sks-keyservers.net --recv-keys \
    07617D4968B34D8F13D56E20BE5AAA0BA210C095 \
    2CAC83124870D88586166115220F69801F27E622 \
    4B96409A098DBD511DF2BC18DBAF69BEA7239D59 \
    9DD955653083EFED6171256408458C39E964B5FF \
    B6B3F7EDA5BA7D1E827DE5180DFF492D8EE2F25C \
    6A67379BEFC1AE4D5595770A34005598B8F47547 \
    47660BC98BC433F01E5C90581209E7F13D0C92B9 \
    CE83449FDC6DACF9D24174DCD1F99F6EE3CD2163 \
    A11DF05DEA40DA19CE4B43C01214CF3F852ADB85 \
    686E5EDF04A4830554160910DF0F5BBC30CD0996 \
    5BAE7CB144D05AD1BB1C47C75C6CC6EFABE49180 \
    AF7610D2E378B33AB026D7574FB955854318F669 \
    6AE70A2A38F466A5D683F939255ADF56C36C5F0F \
    70F7AB3B62257ABFBD0618D79FDB12767CC7352A \
    842AAB2D0BC5415B4E19D429A342433A56D8D31A \
    1B5D384B734F368052862EB55E43CAB9AEC77EAF \
    785436A782586B71829C67A04169AA27ECB31663 \
    5E49DA09E2EC9950733A4FF48F1895E97869A2FB \
    A13B3869454536F1852C17D0477E02D33DD51430 \
    A6220FFCC86FE81CE5AAC880E3814B59E4E11856 \
    EFE2E7C571309FE00BEBA78D5E314EEF7340E1CB \
    EB34498A9261F343F09F60E0A9510905F0B000F0 \
    3442A6594268AC7B88F5C1D25104A731B021B57F \
    6E83C32562C909D289E6C3D98B25B9B71EFF7770 \
    E9216532BF11728C86A11E3132CF4BF4E72E74D3 \
    E8966520DA24E9642E119A5F13971DA39475BD5D \
    1D369094D4CFAC140E0EF05E992230B1EB8C6EFA \
    A312CE6A1FA98892CB2C44EBA79AB712DE5868E6 \
    0445B7BFC4515847C157ECD16BA72FF1C99785DE \
    B74F188889D159F3D7E64A7F348C6D7A0DCED714 \
    4A6AC5C675B6155682729C9E08D51A0A7501105C \
    8B44A05C308955D191956559A5CEE20A90348D47

ENV HADOOP_VERSION 2.7.3
COPY hadoop-$HADOOP_VERSION.tar.gz /opt
RUN tar -zxvf /opt/hadoop-$HADOOP_VERSION.tar.gz -C /opt
RUN rm /opt/hadoop-$HADOOP_VERSION.tar.gz
RUN ln -s /opt/hadoop-$HADOOP_VERSION/etc/hadoop /etc/hadoop
RUN cp /etc/hadoop/mapred-site.xml.template /etc/hadoop/mapred-site.xml
RUN mkdir /opt/hadoop-$HADOOP_VERSION/logs
RUN mkdir /hadoop-data

ENV HADOOP_PREFIX=/opt/hadoop-$HADOOP_VERSION
ENV HADOOP_CONF_DIR=/etc/hadoop
ENV MULTIHOMED_NETWORK=1
ENV USER=root
ENV PATH $HADOOP_PREFIX/bin/:$PATH

ADD entrypoint.sh /entrypoint.sh
RUN chmod a+x /entrypoint.sh

ENTRYPOINT ["/entrypoint.sh"]
4. The entrypoint.sh script:
#!/bin/bash

# Set some sensible defaults
export CORE_CONF_fs_defaultFS=${CORE_CONF_fs_defaultFS:-hdfs://`hostname -f`:9000}

function addProperty() {
  local path=$1
  local name=$2
  local value=$3

  local entry="<property><name>$name</name><value>${value}</value></property>"
  local escapedEntry=$(echo $entry | sed 's/\//\\\//g')
  sed -i "/<\/configuration>/ s/.*/${escapedEntry}\n&/" $path
}

function configure() {
    local path=$1
    local module=$2
    local envPrefix=$3

    local var
    local value

    echo "Configuring $module"
    for c in `printenv | perl -sne 'print "$1 " if m/^${envPrefix}_(.+?)=.*/' -- -envPrefix=$envPrefix`; do
        name=`echo ${c} | perl -pe 's/___/-/g; s/__/_/g; s/_/./g'`
        var="${envPrefix}_${c}"
        value=${!var}
        echo " - Setting $name=$value"
        addProperty /etc/hadoop/$module-site.xml $name "$value"
    done
}

configure /etc/hadoop/core-site.xml core CORE_CONF
configure /etc/hadoop/hdfs-site.xml hdfs HDFS_CONF
configure /etc/hadoop/yarn-site.xml yarn YARN_CONF
configure /etc/hadoop/httpfs-site.xml httpfs HTTPFS_CONF
configure /etc/hadoop/kms-site.xml kms KMS_CONF

if [ "$MULTIHOMED_NETWORK" = "1" ]; then
    echo "Configuring for multihomed network"

    # HDFS
    addProperty /etc/hadoop/hdfs-site.xml dfs.namenode.rpc-bind-host 0.0.0.0
    addProperty /etc/hadoop/hdfs-site.xml dfs.namenode.servicerpc-bind-host 0.0.0.0
    addProperty /etc/hadoop/hdfs-site.xml dfs.namenode.http-bind-host 0.0.0.0
    addProperty /etc/hadoop/hdfs-site.xml dfs.namenode.https-bind-host 0.0.0.0
    addProperty /etc/hadoop/hdfs-site.xml dfs.client.use.datanode.hostname true
    addProperty /etc/hadoop/hdfs-site.xml dfs.datanode.use.datanode.hostname true

    # YARN
    addProperty /etc/hadoop/yarn-site.xml yarn.resourcemanager.bind-host 0.0.0.0
    addProperty /etc/hadoop/yarn-site.xml yarn.nodemanager.bind-host 0.0.0.0
    addProperty /etc/hadoop/yarn-site.xml yarn.timeline-service.bind-host 0.0.0.0

    # MAPRED
    addProperty /etc/hadoop/mapred-site.xml yarn.nodemanager.bind-host 0.0.0.0
fi

if [ -n "$GANGLIA_HOST" ]; then
    mv /etc/hadoop/hadoop-metrics.properties /etc/hadoop/hadoop-metrics.properties.orig
    mv /etc/hadoop/hadoop-metrics2.properties /etc/hadoop/hadoop-metrics2.properties.orig

    for module in mapred jvm rpc ugi; do
        echo "$module.class=org.apache.hadoop.metrics.ganglia.GangliaContext31"
        echo "$module.period=10"
        echo "$module.servers=$GANGLIA_HOST:8649"
    done > /etc/hadoop/hadoop-metrics.properties

    for module in namenode datanode resourcemanager nodemanager mrappmaster jobhistoryserver; do
        echo "$module.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31"
        echo "$module.sink.ganglia.period=10"
        echo "$module.sink.ganglia.supportsparse=true"
        echo "$module.sink.ganglia.slope=jvm.metrics.gcCount=zero,jvm.metrics.memHeapUsedM=both"
        echo "$module.sink.ganglia.dmax=jvm.metrics.threadsBlocked=70,jvm.metrics.memHeapUsedM=40"
    done > /etc/hadoop/hadoop-metrics2.properties
fi

exec "$@"
5. The build.sh script:
#!/bin/sh
docker build -t hadoop/base .
6. Grant build.sh execute permission:
chmod +x build.sh
7. Run build.sh to create the base image for the Hadoop cluster:
sh build.sh
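Before moving on, it is worth seeing how the entrypoint turns environment variables into Hadoop *-site.xml properties. The sketch below is a hypothetical standalone demo (not part of the image build) of the two tricks entrypoint.sh relies on: the naming scheme, where `___` becomes `-`, `__` becomes `_`, and a single `_` becomes `.`, and the sed splice that inserts a property in front of `</configuration>`. It assumes only perl and GNU sed and writes to a throwaway file under /tmp:

```shell
# Translate an env-var suffix into a Hadoop property name,
# exactly as the configure() function in entrypoint.sh does.
name=$(echo "yarn_nodemanager_aux___services" | perl -pe 's/___/-/g; s/__/_/g; s/_/./g')
echo "$name"    # yarn.nodemanager.aux-services

# Reproduce addProperty(): splice the property before </configuration>.
cat > /tmp/demo-site.xml <<'EOF'
<configuration>
</configuration>
EOF
entry="<property><name>$name</name><value>mapreduce_shuffle</value></property>"
escaped=$(echo "$entry" | sed 's/\//\\\//g')
sed -i "/<\/configuration>/ s/.*/${escaped}\n&/" /tmp/demo-site.xml
cat /tmp/demo-site.xml
```

This is why hadoop.env (section VIII) can drive every *-site.xml file with nothing but variable names.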
III. Building the namenode image for the Hadoop cluster
1. The following files are needed to build the namenode image:
├── build.sh
├── Dockerfile
└── run.sh
2. The Dockerfile:
FROM hadoop/base

ENV HDFS_CONF_dfs_namenode_name_dir=file:///hadoop/dfs/name
RUN mkdir -p /hadoop/dfs/name
VOLUME /hadoop/dfs/name

ADD run.sh /run.sh
RUN chmod a+x /run.sh

CMD ["/run.sh"]
3. The run.sh script:
#!/bin/bash

namedir=`echo $HDFS_CONF_dfs_namenode_name_dir | perl -pe 's#file://##'`
if [ ! -d $namedir ]; then
  echo "Namenode name directory not found: $namedir"
  exit 2
fi

if [ -z "$CLUSTER_NAME" ]; then
  echo "Cluster name not specified"
  exit 2
fi

if [ "`ls -A $namedir`" == "" ]; then
  echo "Formatting namenode name directory: $namedir"
  $HADOOP_PREFIX/bin/hdfs --config $HADOOP_CONF_DIR namenode -format $CLUSTER_NAME
fi

$HADOOP_PREFIX/bin/hdfs --config $HADOOP_CONF_DIR namenode
4. The build.sh script:
#!/bin/sh
docker build -t hadoop/namenode .
5. Grant build.sh execute permission:
chmod +x build.sh
6. Create the image:
sh build.sh
IV. Building the resourcemanager image for the Hadoop cluster
1. The following files are needed to build the resourcemanager image:
.
├── build.sh
├── Dockerfile
└── run.sh
2. The Dockerfile:
FROM hadoop/base

ADD run.sh /run.sh
RUN chmod a+x /run.sh

CMD ["/run.sh"]
3. The run.sh script:
#!/bin/bash

$HADOOP_PREFIX/bin/yarn --config $HADOOP_CONF_DIR resourcemanager
4. The build.sh script:
#!/bin/sh
docker build -t hadoop/resourcemanager .
5. Grant build.sh execute permission:
chmod +x build.sh
6. Create the image:
sh build.sh
V. Building the datanode image for the Hadoop cluster
1. The following files are needed to build the datanode image:
.
├── build.sh
├── Dockerfile
└── run.sh
2. The Dockerfile:
FROM hadoop/base

ENV HDFS_CONF_dfs_datanode_data_dir=file:///hadoop/dfs/data
RUN mkdir -p /hadoop/dfs/data
VOLUME /hadoop/dfs/data

ADD run.sh /run.sh
RUN chmod a+x /run.sh

CMD ["/run.sh"]
3. The run.sh script:
#!/bin/bash

datadir=`echo $HDFS_CONF_dfs_datanode_data_dir | perl -pe 's#file://##'`
if [ ! -d $datadir ]; then
  echo "Datanode data directory not found: $datadir"
  exit 2
fi

$HADOOP_PREFIX/bin/hdfs --config $HADOOP_CONF_DIR datanode
4. The build.sh script:
#!/bin/sh
docker build -t hadoop/datanode .
5. Grant build.sh execute permission:
chmod +x build.sh
6. Create the image:
sh build.sh
VI. Building the nodemanager image for the Hadoop cluster
1. The following files are needed to build the nodemanager image:
.
├── build.sh
├── Dockerfile
└── run.sh
2. The Dockerfile:
FROM hadoop/base

ADD run.sh /run.sh
RUN chmod a+x /run.sh

CMD ["/run.sh"]
3. The run.sh script:
#!/bin/bash

$HADOOP_PREFIX/bin/yarn --config $HADOOP_CONF_DIR nodemanager
4. The build.sh script:
#!/bin/sh
docker build -t hadoop/nodemanager .
5. Grant build.sh execute permission:
chmod +x build.sh
6. Create the image:
sh build.sh
VII. Building the historyserver image for the Hadoop cluster
1. The following files are needed to build the historyserver image:
.
├── build.sh
├── Dockerfile
└── run.sh
2. The Dockerfile:
FROM hadoop/base

ENV YARN_CONF_yarn_timeline___service_leveldb___timeline___store_path=/hadoop/yarn/timeline
RUN mkdir -p /hadoop/yarn/timeline
VOLUME /hadoop/yarn/timeline

ADD run.sh /run.sh
RUN chmod a+x /run.sh

CMD ["/run.sh"]
3. The run.sh script:
#!/bin/bash

$HADOOP_PREFIX/bin/yarn --config $HADOOP_CONF_DIR historyserver
4. The build.sh script:
#!/bin/sh
docker build -t hadoop/historyserver .
5. Grant build.sh execute permission:
chmod +x build.sh
6. Create the image:
sh build.sh
VIII. Orchestrating the Hadoop cluster with docker-compose
1. The following files are needed to orchestrate the cluster:
.
├── docker-compose.yml
└── hadoop.env
2. The docker-compose.yml file:
version: '2'

services:
  namenode:
    image: hadoop/namenode
    container_name: namenode
    hostname: namenode
    networks:
      - hadoop
    volumes:
      - hadoop_namenode:/hadoop/dfs/name
    environment:
      CLUSTER_NAME: my-cluster
    env_file:
      - ./hadoop.env

  resourcemanager:
    image: hadoop/resourcemanager
    container_name: resourcemanager
    hostname: resourcemanager
    depends_on:
      - namenode
    networks:
      - hadoop
    env_file:
      - ./hadoop.env

  historyserver:
    image: hadoop/historyserver
    container_name: historyserver
    hostname: historyserver
    depends_on:
      - namenode
    networks:
      - hadoop
    volumes:
      - hadoop_historyserver:/hadoop/yarn/timeline
    env_file:
      - ./hadoop.env

  nodemanager1:
    image: hadoop/nodemanager
    container_name: nodemanager1
    hostname: nodemanager1
    depends_on:
      - namenode
      - resourcemanager
    networks:
      - hadoop
    env_file:
      - ./hadoop.env

  datanode1:
    image: hadoop/datanode
    container_name: datanode1
    hostname: datanode1
    depends_on:
      - namenode
    networks:
      - hadoop
    volumes:
      - hadoop_datanode1:/hadoop/dfs/data
    env_file:
      - ./hadoop.env

networks:
  hadoop:
    external: true

volumes:
  hadoop_namenode:
    external: true
  hadoop_datanode1:
    external: true
  hadoop_historyserver:
    external: true
3. The hadoop.env file:
#GANGLIA_HOST=ganglia.hadoop
CORE_CONF_fs_defaultFS=hdfs://namenode.hadoop:8020
#CORE_CONF_hadoop_http_staticuser_user=root
YARN_CONF_yarn_resourcemanager_hostname=resourcemanager.hadoop
YARN_CONF_yarn_nodemanager_aux___services=mapreduce_shuffle
#YARN_CONF_yarn_log___aggregation___enable=true
#YARN_CONF_yarn_resourcemanager_recovery_enabled=true
#YARN_CONF_yarn_resourcemanager_store_class=org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore
#YARN_CONF_yarn_resourcemanager_fs_state___store_uri=/rmstate
#YARN_CONF_yarn_nodemanager_remote___app___log___dir=/app-logs
#YARN_CONF_yarn_log_server_url=http://historyserver.hadoop:8188/applicationhistory/logs/
#YARN_CONF_yarn_timeline___service_enabled=true
#YARN_CONF_yarn_timeline___service_generic___application___history_enabled=true
#YARN_CONF_yarn_resourcemanager_system___metrics___publisher_enabled=true
YARN_CONF_yarn_timeline___service_hostname=historyserver.hadoop
HDFS_CONF_dfs_namenode_secondary_http___address=namenode.hadoop:50090
HDFS_CONF_dfs_replication=2
4. Run the following command:
docker-compose up
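Note that docker-compose.yml declares the hadoop network and all three volumes as external, so docker-compose up will fail unless they already exist. A minimal provisioning sketch, assuming a local Docker daemon (the names must match the compose file exactly):

```shell
# Create the external network and volumes referenced by docker-compose.yml.
docker network create hadoop
docker volume create hadoop_namenode
docker volume create hadoop_datanode1
docker volume create hadoop_historyserver
```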
5. Check with docker ps that the cluster containers have started:
docker@dockertest2:~/nopublicimage/hadoop_compose$ docker ps
CONTAINER ID   IMAGE                    COMMAND                  CREATED             STATUS             PORTS   NAMES
e1eba51e4459   hadoop/nodemanager       "/entrypoint.sh /run."   43 minutes ago      Up 43 minutes              nodemanager1
e7c7ece55ee6   hadoop/resourcemanager   "/entrypoint.sh /run."   43 minutes ago      Up 43 minutes              resourcemanager
eb13c1a60c81   hadoop/datanode          "/entrypoint.sh /run."   43 minutes ago      Up 43 minutes              datanode1
583f5fe27119   hadoop/historyserver     "/entrypoint.sh /run."   43 minutes ago      Up 43 minutes              historyserver
3dfc276dafdf   hadoop/namenode          "/entrypoint.sh /run."   43 minutes ago      Up 43 minutes              namenode
831b00a867f7   aaef3f5ef5d4             "/bin/sh -c 'apt-get "   About an hour ago   Up About an hour           gigantic_minsky
6. Enter the namenode container with docker exec -ti namenode /bin/bash and verify that the cluster started correctly:
docker@dockertest2:~/nopublicimage/hadoop_compose$ docker exec -ti namenode /bin/bash
root@namenode:/# hdfs dfsadmin -report
Configured Capacity: 29458821120 (27.44 GB)
Present Capacity: 5402902528 (5.03 GB)
DFS Remaining: 5402873856 (5.03 GB)
DFS Used: 28672 (28 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
Live datanodes (1):

Name: 172.19.0.5:50010 (datanode1.hadoop)
Hostname: datanode1
Decommission Status : Normal
Configured Capacity: 29458821120 (27.44 GB)
DFS Used: 28672 (28 KB)
Non DFS Used: 24055918592 (22.40 GB)
DFS Remaining: 5402873856 (5.03 GB)
DFS Used%: 0.00%
DFS Remaining%: 18.34%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Oct 29 01:50:53 UTC 2016
IX. End of the walkthrough