Setting Up a Hadoop VM Cluster
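Until now I had only run Hadoop on a single node in pseudo-distributed mode, so I felt it was time to try building a truly distributed environment. I consulted quite a few articles along the way; these are my notes.
Creating the VMs
I used VMware to create one VM as the master, then made a linked clone of it to serve as the slave. The main problems I ran into were: 1. how to configure the VM network connection; 2. how to repair the network configuration of the slave VM after the linked clone. Both are covered in a separate note.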
master  192.168.15.131
slave   192.168.15.132
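Configuring passwordless ssh login
Every time the master is started (by running bin/start-all.sh), each daemon process being launched prompts for a password, which gets tedious. It is best to configure ssh for passwordless login: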
a. For the namenode and jobtracker, configure ssh on the master itself:
    yum install openssh-server.x86_64
    ssh-keygen
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
b. For the datanode and tasktracker, configure ssh on the slave (run these on the slave):
    scp root@192.168.15.131:~/.ssh/authorized_keys .
    cat ./authorized_keys >> ~/.ssh/authorized_keys
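Running the cat step more than once appends the same key again, and sshd (with its default StrictModes setting) ignores the file if the permissions are too open. Here is a small sketch of an idempotent variant. The function name add_authorized_key is my own invention, and it assumes the file passed in holds a single public key, such as id_rsa.pub copied over from the master.

```shell
#!/bin/sh
# add_authorized_key PUBKEY_FILE SSH_DIR   (hypothetical helper, not part of OpenSSH)
# Appends the key in PUBKEY_FILE to SSH_DIR/authorized_keys unless an
# identical line is already present, then tightens permissions.
add_authorized_key() (
    pubkey_file=$1
    ssh_dir=$2
    auth_file=$ssh_dir/authorized_keys

    mkdir -p "$ssh_dir" || exit 1
    touch "$auth_file" || exit 1

    # -F fixed strings, -x whole-line match, -q quiet, -f read patterns from file
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi

    # Only the owner may access the directory and the key file.
    chmod 700 "$ssh_dir"
    chmod 600 "$auth_file"
)
```

On the slave this would be called as add_authorized_key ./id_rsa.pub ~/.ssh after copying the master's public key over with scp.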
Hadoop node configuration
The configuration must be kept identical on the master and the slave.
conf/core-site.xml:
    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://192.168.15.131:9000</value>
      </property>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop-1.0.4/cache/hadoop</value>
      </property>
    </configuration>
conf/hdfs-site.xml:
    <configuration>
      <property>
        <name>dfs.replication</name>
        <value>1</value>
      </property>
      <property>
        <name>dfs.permissions</name>
        <value>false</value>
      </property>
      <property>
        <name>dfs.name.dir</name>
        <value>/home/hadoop-1.0.4/cache/hadoop/dfs/name</value>
      </property>
      <property>
        <name>dfs.data.dir</name>
        <value>/home/hadoop-1.0.4/cache/hadoop/dfs/data</value>
      </property>
    </configuration>
conf/mapred-site.xml:
    <configuration>
      <property>
        <name>mapred.local.dir</name>
        <value>/home/hadoop-1.0.4/cache/root</value>
      </property>
      <property>
        <name>mapred.job.tracker</name>
        <value>192.168.15.131:9001</value>
      </property>
    </configuration>
conf/masters:
    192.168.15.131
conf/slaves:
    192.168.15.132
conf/hadoop-env.sh:
    export JAVA_HOME=/usr/lib/jvm/jre-1.6.0-openjdk.x86_64/
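Since the conf directory has to match on every node, it is easiest to edit it on the master only and push it out. The sketch below just prints the scp commands so they can be reviewed first. The function name is made up, root is the user from the ssh step, and /home/hadoop-1.0.4 as the install directory is my assumption based on the paths in the configs above.

```shell
#!/bin/sh
# print_conf_sync_commands SLAVES_FILE CONF_DIR   (hypothetical helper)
# Prints one scp command per host listed in SLAVES_FILE (one host per line).
# Blank lines and lines starting with # are skipped. Nothing is executed.
print_conf_sync_commands() (
    slaves_file=$1
    conf_dir=$2
    # "|| [ -n "$host" ]" also handles a last line without a trailing newline.
    while IFS= read -r host || [ -n "$host" ]; do
        case $host in
            ''|'#'*) continue ;;
        esac
        # Copy the conf directory into its parent directory on the remote host.
        printf 'scp -r %s root@%s:%s\n' "$conf_dir" "$host" "$(dirname "$conf_dir")"
    done < "$slaves_file"
)
```

For example, print_conf_sync_commands conf/slaves /home/hadoop-1.0.4/conf lists the commands, and piping that output to sh runs them.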
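Format the namenode on the master before the first start:
    hadoop namenode -format
Now bin/start-all.sh can be run. Afterwards, use jps to check the Java processes on the master and the slave and make sure all 5 daemon processes started properly (jps comes with the JDK devel package, so install that first if it is missing):
    yum install java-1.6.0-openjdk-devel.x86_64
    jps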
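With this layout the five daemons should be NameNode, SecondaryNameNode and JobTracker on the master, and DataNode and TaskTracker on the slave. Rather than reading the jps output by eye, a check along these lines can report what is missing. The function name missing_daemons is hypothetical, and it assumes the usual jps output of one "pid name" pair per line.

```shell
#!/bin/sh
# missing_daemons EXPECTED_NAME...   (hypothetical helper)
# Reads jps output on stdin and prints every expected daemon name that does
# not appear as the second field. Exit status is 1 if anything is missing.
missing_daemons() (
    jps_output=$(cat)
    status=0
    for name in "$@"; do
        # Exact match on the name column, so "NameNode" does not match
        # "SecondaryNameNode".
        if ! printf '%s\n' "$jps_output" |
            awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'; then
            printf '%s\n' "$name"
            status=1
        fi
    done
    exit "$status"
)
```

On the master: jps | missing_daemons NameNode SecondaryNameNode JobTracker. On the slave: jps | missing_daemons DataNode TaskTracker. No output means everything expected is running.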
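Common problems
Naturally, errors are hard to avoid while bringing the nodes up. Here are the ones I ran into.
Problem 1: Checking with jps on the slave showed that the datanode and tasktracker came up and then quickly went down again. The logs contained the following exceptions: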
    2013-09-21 12:19:05,240 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.io.IOException: Call to 192.168.15.131/192.168.15.131:8021 failed on local exception: java.net.NoRouteToHostException: No route to host
    2013-09-21 12:18:56,940 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to 192.168.15.131/192.168.15.131:8020 failed on local exception: java.net.NoRouteToHostException: No route to host
I suspected that the master's firewall was blocking the namenode and jobtracker ports. This was confirmed as follows:
    yum install nmap.x86_64
    nmap -p 9000 -sT 192.168.15.131
    Starting Nmap 5.51 ( http://nmap.org ) at 2013-09-21 12:50 EDT
    Nmap scan report for 192.168.15.131
    Host is up (0.00047s latency).
    PORT     STATE    SERVICE
    9000/tcp filtered cslistener
Port 9000 on the master is indeed being filtered by the firewall. The fix is to open the ports on the master:
    iptables -I INPUT -p tcp --dport 9000 --syn -j ACCEPT
    iptables -I INPUT -p tcp --dport 9001 --syn -j ACCEPT
    service iptables save
After that, the datanode and tasktracker started normally.
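When several ports are scanned at once, picking the blocked ones out of the nmap report can be scripted. This is a rough sketch, assuming nmap's normal output format as shown in the scan above; the function name blocked_ports is my own.

```shell
#!/bin/sh
# blocked_ports   (hypothetical helper)
# Reads nmap's normal output on stdin and prints the number of every TCP
# port whose STATE column is "filtered", one per line.
blocked_ports() {
    awk '$1 ~ /^[0-9]+\/tcp$/ && $2 == "filtered" {
        split($1, parts, "/")   # "9000/tcp" -> parts[1] = "9000"
        print parts[1]
    }'
}
```

For example, nmap -p 9000,9001 -sT 192.168.15.131 | blocked_ports prints only the ports the firewall is still filtering.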
Problem 2: Although the datanode now started normally, running hadoop fs -put failed with another error. The namenode log showed the following exceptions:
    13/09/21 13:08:34 INFO hdfs.DFSClient: Exception in createBlockOutputStream 192.168.15.132:50010 java.net.NoRouteToHostException: No route to host
    13/09/21 13:08:34 INFO hdfs.DFSClient: Abandoning block blk_-4068253165924827530_1001
    13/09/21 13:08:34 INFO hdfs.DFSClient: Excluding datanode 192.168.15.132:50010
    13/09/21 13:08:34 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /junz/in/movie1 could only be replicated to 0 nodes, instead of 1
Clearly, this time the slave's firewall was blocking port 50010, which could be allowed in the same way as above. However, since Hadoop uses a number of other ports as well, such as 50030, 50070 and 50060, I simply disabled the firewall on both the master and the slave:
    service iptables stop
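If turning the firewall off entirely is not acceptable, the alternative is to open each port the same way as 9000 and 9001. The sketch below only prints the commands for a given list of ports so they can be reviewed before running. The function name is made up, and the port list is just the ones mentioned above, which is probably not every port Hadoop uses.

```shell
#!/bin/sh
# print_open_port_commands PORT...   (hypothetical helper)
# Prints (does not run) one iptables command per port, in the same form used
# for ports 9000 and 9001, followed by the command that saves the rules.
print_open_port_commands() {
    for port in "$@"; do
        case $port in
            # Reject empty arguments and anything that is not all digits.
            ''|*[!0-9]*) echo "skipping invalid port: $port" >&2; continue ;;
        esac
        printf 'iptables -I INPUT -p tcp --dport %s --syn -j ACCEPT\n' "$port"
    done
    printf 'service iptables save\n'
}
```

For example, print_open_port_commands 50010 50060 on the slave and print_open_port_commands 50030 50070 on the master; pipe the output to sh as root to apply it.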