Start hadoop, hive and hue servers in a virtual hadoop cluster based on docker

######start hadoop###### (refer to "http://tashan10.com/yong-dockerda-jian-hadoopwei-fen-bu-shi-ji-qun/")


1.Open three terminals and, for the first run, execute one of the following three commands in each terminal to start the three containers from the image:
1)docker run -ti -h master --name master centos:hadoop_hive
2)docker run -ti -h slave1 --name slave1 centos:hadoop_hive
3)docker run -ti -h slave2 --name slave2 centos:hadoop_hive


If it is not the first time starting the three node containers, we can execute "docker ps -a" to see the names of those three containers:
#####################################################################################################################################
datahub@datahub:~$ docker ps -a
CONTAINER ID        IMAGE                COMMAND             CREATED             STATUS              PORTS               NAMES
beefe2d6832c        centos:hadoop_hive   "/bin/bash"         2 days ago          Up 2 days                               slave2
ff3aa0faed1f        centos:hadoop_hive   "/bin/bash"         2 days ago          Up 2 days                               slave1
5738358d2048        centos:hadoop_hive   "/bin/bash"         2 days ago          Up 2 days                               master
5166f7fdead5        centos:Hue           "/bin/bash"         2 days ago          Up 2 days                               hue_server
#####################################################################################################################################


1)docker start master && docker attach master
2)docker start slave1 && docker attach slave1
3)docker start slave2 && docker attach slave2
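For subsequent runs the three containers can also be started in one go from the host and then attached to individually; a small sketch:
# start all three hadoop containers from the host
for c in master slave1 slave2; do docker start "$c"; done
# then attach to each one from its own terminal, e.g.
docker attach master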




notice: there are a few issues here:


    (1)The IP address of a Docker container is assigned automatically at startup and cannot be changed manually.
    (2)The hostname and hosts settings modified inside a container are only valid for the current container lifecycle. If the container exits and is restarted, both settings are reverted, and neither of them can be written into the image with the commit command.




2.Configure hosts


    1.Get the IP of each node with the ifconfig command. The IPs may differ between environments; for example, the IPs obtained on my machine are:
        master:172.17.0.2
        slave1:172.17.0.6
        slave2:172.17.0.7


    Use "nano /etc/hosts" to write the following entries into the hosts file of every node, adjusting the IP addresses to your own (a sketch for regenerating these entries follows the list):


    172.17.0.2        master
    172.17.0.6        slave1
    172.17.0.7        slave2
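    Because the container IPs change and /etc/hosts is reverted on every restart (see the notice above), these entries can also be regenerated from the host each time; a minimal sketch, run on the host rather than inside a container:

    # print a hosts snippet with the current IP of each container
    for h in master slave1 slave2; do
      echo "$(docker inspect -f '{{.NetworkSettings.IPAddress}}' $h)    $h"
    done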


3.Configure slaves


(1)Execute the following commands in the master node container:
root@master:~# cd $HADOOP_CONFIG_HOME/
root@master:~/soft/apache/hadoop/hadoop-2.6.0/etc/hadoop# nano slaves 


(2)Write the hostnames of the following slave nodes into that file:
slave1
slave2


(3)Start Hadoop
Execute the start-all.sh command on the master node to start Hadoop.
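For reference, start-all.sh in Hadoop 2.x is essentially a wrapper that starts HDFS and YARN separately, so the non-deprecated equivalent is roughly the following sketch (assuming $HADOOP_HOME points at the hadoop-2.6.0 install used above):
$HADOOP_HOME/sbin/start-dfs.sh     # starts the NameNode, SecondaryNameNode and the DataNodes listed in slaves
$HADOOP_HOME/sbin/start-yarn.sh    # starts the ResourceManager and the NodeManagers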




4.Check the Hadoop running state
Execute the jps command on each node; the results are as follows:
master node:


root@master:~/soft/apache/hadoop/hadoop-2.6.0/etc/hadoop# jps
1223 Jps
992 SecondaryNameNode
813 NameNode
1140 ResourceManager


slave1 node:


root@slave1:~/soft/apache/hadoop/hadoop-2.6.0/etc/hadoop# jps
258 NodeManager
352 Jps
159 DataNode


slave2 node:


root@slave2:~/soft/apache/hadoop/hadoop-2.6.0/etc/hadoop# jps
371 Jps
277 NodeManager
178 DataNode


Run "hdfs dfsadmin -report" on the master node to check whether the DataNodes started correctly.
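A quick way to check only the live node count (a minimal sketch; the exact wording of the report may vary between Hadoop versions):
hdfs dfsadmin -report | grep -i "live datanodes"    # should show two live datanodes (slave1 and slave2)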






######start hive###### (all of the steps below are done on the master node)
We use MySQL as the metastore database of Hive (username: hive1, password: hive1).
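For reference, the metastore connection is configured in hive-site.xml (the file copied into the container in the "some records" section below). A minimal sketch of the relevant properties, assuming a local MySQL metastore database named "hive" and the hive1/hive1 credentials above:
<!-- sketch: adjust the database name/host if yours differ -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive1</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive1</value>
</property>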


After Hadoop has been started:
1.[root@master hadoop]# service mysqld start


2.source /etc/profile
(we have added the hive path to /etc/profile, but we still need to execute "source /etc/profile" so the current shell picks up the change before we can use the "hive" command; otherwise it fails with "hive: command not found")
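The lines added to /etc/profile look roughly like the following sketch (the Hive path matches the install directory that appears in the logs below):
export HIVE_HOME=/usr/local/apache-hive-2.0.1-bin
export PATH=$PATH:$HIVE_HOME/bin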








3.Start the hive metastore service in the background:
[root@master hadoop]# hive --service metastore 2>&1 >> /var/log.log &


###Because we run the metastore service and the hive client both on master, we should start the metastore service in the background first, and then we can start the hive client in the same shell. Use this command: hive --service metastore 2>&1 >> /var/log.log &###
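To check that the metastore is actually up before starting the client, the following sketch can be used (9083 is the default metastore thrift port; netstat requires the net-tools package in a bare centos image):
jps | grep RunJar            # hive services started via "hive --service" show up as RunJar
netstat -nltp | grep 9083    # the metastore should be listening on port 9083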


4.Start the hive client:
[root@master hadoop]# hive




5.Press Ctrl+C to quit the hive client, then start hiveserver2 for the hue server to connect to:
[root@master hadoop]# hive --service hiveserver2 2>&1 >> /var/log.log &
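HiveServer2 can then be verified from the master node with beeline, which ships with Hive (a sketch; 10000 is the default hiveserver2 port and an anonymous login is assumed here):
beeline -u jdbc:hive2://master:10000 -e "show databases;"    # should list at least the "default" database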






#notice#: for steps 3 and 4 there may be some messages like "no hbase" or "SLF4J", similar to the following details:
####################################################
[root@master hadoop]# hive --service metastore 2>&1 >> /var/log.log &
[1] 1032
[root@master hadoop]# which: no hbase in (/opt/jdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/root/soft/apache/hadoop/hadoop-2.7.2/bin:/root/soft/apache/hadoop/hadoop-2.7.2/sbin:/root/soft/apache/hadoop/hadoop-2.7.2/bin:/root/soft/apache/hadoop/hadoop-2.7.2/sbin:/usr/local/apache-hive-2.0.1-bin/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.0.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/soft/apache/hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]


[root@master hadoop]# hive
which: no hbase in (/opt/jdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/root/soft/apache/hadoop/hadoop-2.7.2/bin:/root/soft/apache/hadoop/hadoop-2.7.2/sbin:/root/soft/apache/hadoop/hadoop-2.7.2/bin:/root/soft/apache/hadoop/hadoop-2.7.2/sbin:/usr/local/apache-hive-2.0.1-bin/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.0.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/soft/apache/hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]


Logging initialized using configuration in file:/usr/local/apache-hive-2.0.1-bin/conf/hive-log4j2.properties
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> 
####################################################
#It is OK, so do not be worried! What we want is the "hive>" prompt.


If we visit "http://[ip of master]:50070" in the browser, we can see some HDFS information on the web UI.
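HDFS can also be checked from the command line on the master node, e.g. by listing the root directory:
hdfs dfs -ls /    # lists the top-level HDFS directories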






# some records #:
commonly used commands
1.docker cp /home/datahub/hive-site.xml 9168dfad0358:/usr/local/apache-hive-2.0.1-bin/conf
2.docker commit -m "modify hive-site.xml" 9168dfad0358  centos:hadoop_hive
3.docker ps                #list the running containers and their details
4.docker run -ti -v /home/datahub/Hue:/root/soft/hue_project/Hue --name hue_server centos:Hue /bin/bash       
####important#### 
run a new container, setting the container name (--name) and mounting a host directory into the container (-v)
####important####
## /home/datahub/Hue is the host dir, /root/soft/hue_project/Hue is the container dir
5.sudo docker run -itd --name hue_server centos:Hue /bin/bash   #create a detached (daemonized) Docker container
docker attach hue_server    #connect to a running container; "docker exec -ti hue_server /bin/bash" works, too.
6.sudo docker start hue_server   #start a stopped but existing container by name/container_id
7.Remove all stopped containers:  docker rm $(docker ps -a -q)
8."ctrl+p" and then "ctrl+q" to quit container with the container still running. 
9.docker exec -ti hue_server /bin/bash  #open another terminal session in a running container.






#######start hue#########
1.(1)Start from a newly committed image, mounting the host dir: docker run -ti -v /home/datahub/Hue:/root/soft/hue_project/Hue --name hue_server centos:Hue /bin/bash
  (2)Start from an existing container by name/ID:
     $sudo docker start hue_server
     $docker attach hue_server 
2.If mysql is stopped: service mysqld start
3.Go to ${HUE_HOME}:
  cd /root/soft/hue_project/Hue

4.build/env/bin/hue runserver 0.0.0.0:80

5.Visit http://172.17.0.5/ (the IP of the hue_server container) in the browser.
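For the Hive editor in Hue to reach the hiveserver2 started on master, the [beeswax] section of the Hue configuration (e.g. desktop/conf/hue.ini under ${HUE_HOME}; the exact file depends on how Hue was built) should point at it. A minimal sketch, assuming the default hiveserver2 port:
[beeswax]
  # hostname and port of the hiveserver2 started on the master node
  hive_server_host=master
  hive_server_port=10000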

  






