Setting Up a Hadoop VM Cluster


I had been running Hadoop in single-node pseudo-distributed mode and felt it was time to try building a truly distributed environment. I consulted quite a few articles along the way; these are my notes.

Creating the VMs

Here I use VMware to create one VM as the master, then make a linked clone of it to serve as the slave. The main problems were: 1. how to set up the VM network connection; 2. how to repair the network configuration of the slave VM after linked cloning. I have already covered both in another post.
master    192.168.15.131
slave     192.168.15.132


Configuring Passwordless SSH Login

Every time the master starts up (via bin/start-all.sh), launching each daemon process prompts for a password, which gets annoying. It is better to configure SSH for passwordless login:
a. For the namenode and jobtracker, configure SSH on the master itself:
yum install openssh-server.x86_64
ssh-keygen
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
b. For the datanode and tasktracker, configure SSH on the slave:
scp root@192.168.15.131:~/.ssh/authorized_keys .
cat ./authorized_keys >> ~/.ssh/authorized_keys
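With both sides set up, a quick test from the master should get into the slave without a password prompt (hostname here is just an arbitrary probe command):
ssh root@192.168.15.132 hostname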


Hadoop Node Configuration

The configuration must be kept identical on the master and the slave.
conf/core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.15.131:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop-1.0.4/cache/hadoop</value>
  </property>
</configuration>
conf/hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/hadoop-1.0.4/cache/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/hadoop-1.0.4/cache/hadoop/dfs/data</value>
  </property>
</configuration>
conf/mapred-site.xml:
<configuration>
  <property>
    <name>mapred.local.dir</name>
    <value>/home/hadoop-1.0.4/cache/root</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.15.131:9001</value>
  </property>
</configuration>
conf/masters:
192.168.15.131
conf/slaves:
192.168.15.132
conf/hadoop-env.sh:
export JAVA_HOME=/usr/lib/jvm/jre-1.6.0-openjdk.x86_64/
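Since the two nodes must keep identical configs, the simplest way to stay in sync (a sketch, assuming Hadoop is unpacked at /home/hadoop-1.0.4 on both machines, consistent with the paths above) is to push the whole conf directory from the master to the slave:
scp -r /home/hadoop-1.0.4/conf root@192.168.15.132:/home/hadoop-1.0.4/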

Once everything is configured, format the namenode:
hadoop namenode -format
Next, start the cluster with bin/start-all.sh, then use jps on the master and the slave to check the Java processes and make sure all five daemon processes are up:
yum install java-1.6.0-openjdk-devel.x86_64
jps
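For reference, with this two-node layout jps should list roughly the following (the PIDs are made up and will differ):
On the master:
  1234 NameNode
  1250 SecondaryNameNode
  1302 JobTracker
On the slave:
  2101 DataNode
  2188 TaskTracker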


Common Problems

Of course, errors are bound to come up while bringing up the nodes; here I record the ones I hit.

Problem 1: Checking with jps on the slave, I found that the datanode and tasktracker came up but went down again shortly afterwards. The logs showed the following exceptions:

2013-09-21 12:19:05,240 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.io.IOException: Call to 192.168.15.131/192.168.15.131:8021 failed on local exception: java.net.NoRouteToHostException: No route to host
2013-09-21 12:18:56,940 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to 192.168.15.131/192.168.15.131:8020 failed on local exception: java.net.NoRouteToHostException: No route to host

I suspected the master's firewall was blocking the namenode and jobtracker ports, which I then confirmed:
yum install nmap.x86_64
nmap -p 9000 -sT 192.168.15.131
Starting Nmap 5.51 ( http://nmap.org ) at 2013-09-21 12:50 EDT
Nmap scan report for 192.168.15.131
Host is up (0.00047s latency).
PORT     STATE    SERVICE
9000/tcp filtered cslistener
So the master's port 9000 was indeed being filtered by the firewall. The fix is to open the ports on the master:
iptables -I INPUT -p tcp --dport 9000 --syn -j ACCEPT
iptables -I INPUT -p tcp --dport 9001 --syn -j ACCEPT
service iptables save
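Re-running the earlier nmap probe should now report the 9000/tcp state as open rather than filtered:
nmap -p 9000 -sT 192.168.15.131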
After that, the datanode and tasktracker started normally.

Problem 2: Although the datanode now started normally, hadoop fs -put failed with another error; checking the logs revealed the following exceptions:
13/09/21 13:08:34 INFO hdfs.DFSClient: Exception in createBlockOutputStream 192.168.15.132:50010 java.net.NoRouteToHostException: No route to host
13/09/21 13:08:34 INFO hdfs.DFSClient: Abandoning block blk_-4068253165924827530_1001
13/09/21 13:08:34 INFO hdfs.DFSClient: Excluding datanode 192.168.15.132:50010
13/09/21 13:08:34 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /junz/in/movie1 could only be replicated to 0 nodes, instead of 1
Clearly, this time the slave's firewall was blocking port 50010. I could allow that port in the same way as above, but since Hadoop uses several other ports as well (50030, 50070, 50060, and so on), I simply disabled the firewall on both the master and the slave:
service iptables stop
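Note that service iptables stop only lasts until the next reboot; on a SysV-init distro like CentOS 6 (which these service/iptables commands assume), the firewall can also be kept from coming back at boot:
chkconfig iptables off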



