Start hadoop, hive and hue servers in a virtual hadoop cluster based on docker

######start hadoop###### (refer to "http://tashan10.com/yong-dockerda-jian-hadoopwei-fen-bu-shi-ji-qun/")


1.Open three terminals and, for the first run, execute one of the following three commands in each terminal to start the three containers from the image:
1)docker run -ti -h master --name master centos:hadoop_hive
2)docker run -ti -h slave1 --name slave1 centos:hadoop_hive
3)docker run -ti -h slave2 --name slave2 centos:hadoop_hive


If it is not the first time starting the three node containers, we can execute "docker ps -a" to see the names of those three containers:
#####################################################################################################################################
datahub@datahub:~$ docker ps -a
CONTAINER ID        IMAGE                COMMAND             CREATED             STATUS              PORTS               NAMES
beefe2d6832c        centos:hadoop_hive   "/bin/bash"         2 days ago          Up 2 days                               slave2
ff3aa0faed1f        centos:hadoop_hive   "/bin/bash"         2 days ago          Up 2 days                               slave1
5738358d2048        centos:hadoop_hive   "/bin/bash"         2 days ago          Up 2 days                               master
5166f7fdead5        centos:Hue           "/bin/bash"         2 days ago          Up 2 days                               hue_server
#####################################################################################################################################


1)docker start master && docker attach master
2)docker start slave1 && docker attach slave1
3)docker start slave2 && docker attach slave2
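For subsequent runs the three containers can also be started in one go from the host and then attached to individually; a small sketch:
# start all three hadoop containers from the host
for c in master slave1 slave2; do docker start "$c"; done
# then attach to each one from its own terminal, e.g.
docker attach master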




notice: there are a few issues here:


    (1)The IP address of a Docker container is assigned automatically at startup and cannot be changed manually.
    (2)The hostname and hosts settings modified inside a container are only valid for the current container lifecycle. If the container exits and is restarted, both settings are reverted, and neither of them can be written into the image with the commit command.




2.Configure hosts


    1.Get the IP of each node with the ifconfig command. The IPs may differ between environments; for example, the IPs obtained on my machine are:
        master:172.17.0.2
        slave1:172.17.0.6
        slave2:172.17.0.7


    Use "nano /etc/hosts" to write the following entries into the hosts file of every node, adjusting the IP addresses to your own (a sketch for regenerating these entries follows the list):


    172.17.0.2        master
    172.17.0.6        slave1
    172.17.0.7        slave2
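    Because the container IPs change and /etc/hosts is reverted on every restart (see the notice above), these entries can also be regenerated from the host each time; a minimal sketch, run on the host rather than inside a container:

    # print a hosts snippet with the current IP of each container
    for h in master slave1 slave2; do
      echo "$(docker inspect -f '{{.NetworkSettings.IPAddress}}' $h)    $h"
    done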


3.Configure slaves


(1)Execute the following commands in the master node container:
root@master:~# cd $HADOOP_CONFIG_HOME/
root@master:~/soft/apache/hadoop/hadoop-2.6.0/etc/hadoop# nano slaves 


(2)Write the hostnames of the following slave nodes into that file:
slave1
slave2


(3)Start Hadoop
Execute the start-all.sh command on the master node to start Hadoop.
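For reference, start-all.sh in Hadoop 2.x is essentially a wrapper that starts HDFS and YARN separately, so the non-deprecated equivalent is roughly the following sketch (assuming $HADOOP_HOME points at the hadoop-2.6.0 install used above):
$HADOOP_HOME/sbin/start-dfs.sh     # starts the NameNode, SecondaryNameNode and the DataNodes listed in slaves
$HADOOP_HOME/sbin/start-yarn.sh    # starts the ResourceManager and the NodeManagers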




4.Check the Hadoop running state
Execute the jps command on each node; the results are as follows:
master node:


root@master:~/soft/apache/hadoop/hadoop-2.6.0/etc/hadoop# jps
1223 Jps
992 SecondaryNameNode
813 NameNode
1140 ResourceManager


slave1 node:


root@slave1:~/soft/apache/hadoop/hadoop-2.6.0/etc/hadoop# jps
258 NodeManager
352 Jps
159 DataNode


slave2 node:


root@slave2:~/soft/apache/hadoop/hadoop-2.6.0/etc/hadoop# jps
371 Jps
277 NodeManager
178 DataNode


Run "hdfs dfsadmin -report" on the master node to check whether the DataNodes started correctly.
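A quick way to check only the live node count (a minimal sketch; the exact wording of the report may vary between Hadoop versions):
hdfs dfsadmin -report | grep -i "live datanodes"    # should show two live datanodes (slave1 and slave2)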






######start hive###### (all of the steps below are done on the master node)
We use MySQL as the metastore database of Hive (username: hive1, password: hive1).
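For reference, the metastore connection is configured in hive-site.xml (the file copied into the container in the "some records" section below). A minimal sketch of the relevant properties, assuming a local MySQL metastore database named "hive" and the hive1/hive1 credentials above:
<!-- sketch: adjust the database name/host if yours differ -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hive1</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hive1</value>
</property>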


After Hadoop has been started:
1.[root@master hadoop]# service mysqld start


2.source /etc/profile
(we have added the hive path to /etc/profile, but we still need to execute "source /etc/profile" so the current shell picks up the change before we can use the "hive" command; otherwise it fails with "hive: command not found")
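The lines added to /etc/profile look roughly like the following sketch (the Hive path matches the install directory that appears in the logs below):
export HIVE_HOME=/usr/local/apache-hive-2.0.1-bin
export PATH=$PATH:$HIVE_HOME/bin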








3.Start the hive metastore service in the background:
[root@master hadoop]# hive --service metastore 2>&1 >> /var/log.log &


###Because we run the metastore service and the hive client both on master, we should start the metastore service in the background first, and then we can start the hive client in the same shell. Use this command: hive --service metastore 2>&1 >> /var/log.log &###
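To check that the metastore is actually up before starting the client, the following sketch can be used (9083 is the default metastore thrift port; netstat requires the net-tools package in a bare centos image):
jps | grep RunJar            # hive services started via "hive --service" show up as RunJar
netstat -nltp | grep 9083    # the metastore should be listening on port 9083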


4.Start the hive client:
[root@master hadoop]# hive




5.Press Ctrl+C to quit the hive client, then start hiveserver2 for the hue server to connect to:
[root@master hadoop]# hive --service hiveserver2 2>&1 >> /var/log.log &
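HiveServer2 can then be verified from the master node with beeline, which ships with Hive (a sketch; 10000 is the default hiveserver2 port and an anonymous login is assumed here):
beeline -u jdbc:hive2://master:10000 -e "show databases;"    # should list at least the "default" database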






#notice#: for steps 3 and 4 there may be some messages like "no hbase" or "SLF4J", similar to the following details:
####################################################
[root@master hadoop]# hive --service metastore 2>&1 >> /var/log.log &
[1] 1032
[root@master hadoop]# which: no hbase in (/opt/jdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/root/soft/apache/hadoop/hadoop-2.7.2/bin:/root/soft/apache/hadoop/hadoop-2.7.2/sbin:/root/soft/apache/hadoop/hadoop-2.7.2/bin:/root/soft/apache/hadoop/hadoop-2.7.2/sbin:/usr/local/apache-hive-2.0.1-bin/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.0.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/soft/apache/hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]


[root@master hadoop]# hive
which: no hbase in (/opt/jdk/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/root/soft/apache/hadoop/hadoop-2.7.2/bin:/root/soft/apache/hadoop/hadoop-2.7.2/sbin:/root/soft/apache/hadoop/hadoop-2.7.2/bin:/root/soft/apache/hadoop/hadoop-2.7.2/sbin:/usr/local/apache-hive-2.0.1-bin/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/apache-hive-2.0.1-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/root/soft/apache/hadoop/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]


Logging initialized using configuration in file:/usr/local/apache-hive-2.0.1-bin/conf/hive-log4j2.properties
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
hive> 
####################################################
#It is OK, so do not be worried! What we want is the "hive>" prompt.


If we visit "http://[ip of master]:50070" in the browser, we can see some HDFS information on the web UI.
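HDFS can also be checked from the command line on the master node, e.g. by listing the root directory:
hdfs dfs -ls /    # lists the top-level HDFS directories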






# some records #:
commonly used commands
1.docker cp /home/datahub/hive-site.xml 9168dfad0358:/usr/local/apache-hive-2.0.1-bin/conf
2.docker commit -m "modify hive-site.xml" 9168dfad0358  centos:hadoop_hive
3.docker ps                #list the running containers and their details
4.docker run -ti -v /home/datahub/Hue:/root/soft/hue_project/Hue --name hue_server centos:Hue /bin/bash       
####important#### 
run a new container, setting the container name (--name) and mounting a host directory into the container (-v)
####important####
## /home/datahub/Hue is the host dir, /root/soft/hue_project/Hue is the container dir
5.sudo docker run -itd --name hue_server centos:Hue /bin/bash   #create a detached (daemonized) Docker container
docker attach hue_server    #connect to a running container; "docker exec -ti hue_server /bin/bash" works, too.
6.sudo docker start hue_server   #start a stopped but existing container by name/container_id
7.Remove all stopped containers:  docker rm $(docker ps -a -q)
8."ctrl+p" and then "ctrl+q" to quit container with the container still running. 
9.docker exec -ti hue_server /bin/bash  #open another terminal session in a running container.






#######start hue#########
1.(1)Start from a newly committed image, mounting the host dir: docker run -ti -v /home/datahub/Hue:/root/soft/hue_project/Hue --name hue_server centos:Hue /bin/bash
  (2)Start from an existing container by name/ID:
     $sudo docker start hue_server
     $docker attach hue_server 
2.If mysql is stopped: service mysqld start
3.Go to ${HUE_HOME}:
  cd /root/soft/hue_project/Hue

4.build/env/bin/hue runserver 0.0.0.0:80

5.Visit http://172.17.0.5/ (the IP of the hue_server container) in the browser.
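For the Hive editor in Hue to reach the hiveserver2 started on master, the [beeswax] section of the Hue configuration (e.g. desktop/conf/hue.ini under ${HUE_HOME}; the exact file depends on how Hue was built) should point at it. A minimal sketch, assuming the default hiveserver2 port:
[beeswax]
  # hostname and port of the hiveserver2 started on the master node
  hive_server_host=master
  hive_server_port=10000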

  






