在Linux单机上运行Hadoop-0.20.0实例
来源:互联网 发布:吉他记谱软件 编辑:程序博客网 时间:2024/05/11 15:53
其实,Hadoop-0.20.0与Hadoop-0.19.0的入门运行非常相似,基本步骤都是相同的。不同的是:Hadoop-0.19.0的配置文件hadoop-site.xml中内容,在Hadoop-0.20.0的配置中进行了拆分,分别放在三个配置文件中,如下:
1、core-site.xml配置文件
内容配置如下所示:
2、hdfs-site.xml配置文件
内容配置如下所示:
3、mapred-site.xml配置文件
配置内容如下所示:
除了这一处不同之外,运行的基本过程与Hadoop-0.19.0相同,可以非常容易地运行wordcount实例。
这里,主要是讨论几点,在运行Hadoop-0.20.0例子的过程,出现的几个异常,及其问题出现的原因和解决方法。
异常分析
1、“could only be replicated to 0 nodes, instead of 1”异常
(1)异常描述
上面配置都正确无误,并且,已经完成了如下运行步骤:
[root@localhost hadoop-0.20.0]# bin/hadoop namenode -format
[root@localhost hadoop-0.20.0]# bin/start-all.sh
这时,看到5个进程jobtracker、tasktracker、namenode、datanode、secondarynamenode已经给出了启动成功信息,但是运行jps命令查看进程的时候,发现并不是那样,如下所示:
4281 Jps
4007 SecondaryNameNode
3771 NameNode
可见,只有两个进程启动成功了,其它的并没有成功,如果你再继续向下执行,准备运行wordcount实例之前执行上传文件的命令:
[root@localhost hadoop-0.20.0]# bin/hadoop fs -put input in
现在就会抛出一堆异常了,如下所示:
10/08/02 15:36:04 INFO hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/root/in/LICENSE.txt could only be replicated to 0 nodes, instead of 1 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256) at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953) at org.apache.hadoop.ipc.Client.call(Client.java:739) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220) at $Proxy0.addBlock(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) at $Proxy0.addBlock(Unknown Source) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2873) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2755) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2046) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2232)10/08/02 15:36:04 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /user/root/in/LICENSE.txt retries left 410/08/02 15:36:04 INFO hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/root/in/LICENSE.txt could only be replicated to 0 nodes, instead of 1 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256) at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953) at org.apache.hadoop.ipc.Client.call(Client.java:739) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220) at $Proxy0.addBlock(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) at $Proxy0.addBlock(Unknown Source) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2873) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2755) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2046) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2232)10/08/02 15:36:04 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /user/root/in/LICENSE.txt retries left 310/08/02 15:36:05 INFO hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/root/in/LICENSE.txt could only be replicated to 0 nodes, instead of 1 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256) at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953) at org.apache.hadoop.ipc.Client.call(Client.java:739) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220) at $Proxy0.addBlock(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) at $Proxy0.addBlock(Unknown Source) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2873) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2755) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2046) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2232)10/08/02 15:36:05 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /user/root/in/LICENSE.txt retries left 210/08/02 15:36:07 INFO hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/root/in/LICENSE.txt could only be replicated to 0 nodes, instead of 1 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256) at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953) at org.apache.hadoop.ipc.Client.call(Client.java:739) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220) at $Proxy0.addBlock(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) at $Proxy0.addBlock(Unknown Source) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2873) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2755) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2046) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2232)10/08/02 15:36:07 WARN hdfs.DFSClient: NotReplicatedYetException sleeping /user/root/in/LICENSE.txt retries left 110/08/02 15:36:10 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/root/in/LICENSE.txt could only be replicated to 0 nodes, instead of 1 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256) at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:396) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953) at org.apache.hadoop.ipc.Client.call(Client.java:739) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220) at $Proxy0.addBlock(Unknown Source) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59) at $Proxy0.addBlock(Unknown Source) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2873) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2755) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2046) at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2232)10/08/02 15:36:10 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null10/08/02 15:36:10 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/root/in/LICENSE.txt" - Aborting...put: java.io.IOException: File /user/root/in/LICENSE.txt could only be replicated to 0 nodes, instead of 1
看到“could only be replicated to 0 nodes, instead of 1”这句信息的时候,你可能会首先想到是否hdfs-site.xml配置文件中的属性dfs.replication配置错误,但事实上并不是这样。
这时,就要查看启动日志了,我的是位于/root/hadoop-0.20.0/logs下面,如下所示:
hadoop-root-datanode-localhost.log hadoop-root-namenode-localhost.log hadoop-root-tasktracker-localhost.log
hadoop-root-datanode-localhost.out hadoop-root-namenode-localhost.out hadoop-root-tasktracker-localhost.out
hadoop-root-jobtracker-localhost.log hadoop-root-secondarynamenode-localhost.log history
hadoop-root-jobtracker-localhost.out hadoop-root-secondarynamenode-localhost.out
查看hadoop-root-datanode-localhost.log日志文件,看到异常信息:
2010-08-02 15:38:34,642 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: /************************************************************STARTUP_MSG: Starting DataNodeSTARTUP_MSG: host = localhost/127.0.0.1STARTUP_MSG: args = []STARTUP_MSG: version = 0.20.0STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20 -r 763504; compiled by 'ndaley' on Thu Apr 9 05:18:40 UTC 2009************************************************************/2010-08-02 15:38:35,381 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /tmp/hadoop-root/dfs/data: namenode namespaceID = 409052671; datanode namespaceID = 769845957 at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:233) at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:148) at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298) at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:216) at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283) at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238) at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1246) at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1368)2010-08-02 15:38:35,382 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG: /************************************************************SHUTDOWN_MSG: Shutting down DataNode at localhost/127.0.0.1************************************************************/
通过上面的信息,大致可以了解到,通过“Incompatible namespaceIDs in /tmp/hadoop-root/dfs/data”可知,是由于 /tmp/hadoop-root/dfs/data中的namespaceIDs不兼容导致的,也就是说,很可能是由于上次运行其它版本的Hadoop在/tmp/hadoop-root/dfs/data目录下有残留的不兼容的数据。事实上在我运行过程中出现这个问题就是由于,刚刚尝试了Hadoop-0.19.0版本的运行,运行后并没有清理这些数据。
(2)解决方法
清理对应目录的数据以后,就可以正常运行了,这时执行启动各个进程之后,通过jps命令可以查看到结果如下所示:
5386 JobTracker
5253 DataNode
5529 Jps
4874 SecondaryNameNode
5489 TaskTracker
4649 NameNode
上面5个进程都启动起来了,可以上传文件到HDFS,并执行wordcount例子。
- 在Linux单机上运行Hadoop-0.20.0实例
- 在Linux单机上运行Hadoop-0.19.0实例
- 在Linux单机上运行Hadoop-0.19.0实例
- 在Linux单机上运行Hadoop-0.19.0实例http://blog.csdn.net/shirdrn/article/details/5781776
- 在Linux系统设置共享文件夹、Hadoop单机/伪分布部署,运行Hadoop Wordcount单词统计实例
- Hadoop 在linux 单机上伪分布式 的安装
- Hadoop 在Linux 单机上伪分布式 的安装过程
- linux单机上安装hadoop
- Hadoop + eclipse + linux 单机运行 WordCount
- Hadoop在Ubuntu9.10上以单机(非分布式)模式运行
- 用Linux Container在单机上部署完全分布式的Hadoop集群
- Ubuntu上搭建单机版Hadoop运行环境
- 在Linux系统上部署Hadoop运行环境
- 在linux上部署web项目并运行(单机测试版)
- spark学习1——配置hadoop 单机模式并运行WordCount实例(ubuntu14.04 & hadoop 2.6.0)
- python在hadoop上运行
- Hadoop 在Ubuntu下的单机配置及运行示例
- Mac上运行第一个Hadoop实例
- Challenges in Lean model
- MFC绘图
- EJB到底是什么,真的那么神秘吗??
- Java中的字符串比较相等与大小
- 8月2日
- 在Linux单机上运行Hadoop-0.20.0实例
- 为何要用_beginthreadex()而非CreateThread?
- 自动化测试之祸
- 施振荣:认输的态度很重要
- 面试官眼中的人才
- HDU 2842(数论,构造矩阵)
- 这两个月来没有坚持住,“坚持”是一个很大的问题
- Michael Peppler > DBD-Sybase-1.10 > DBD::Sybase
- 第一次用拓扑,记念一下~hdu 1811