Hadoop installation on a slow Ubuntu laptop: the problem of the inaccessible "http://localhost:50030/jobtracker.jsp"

To install Hadoop 0.20 on my Ubuntu laptop, I ran Hadoop in pseudo-distributed mode, using the quickstart settings for core-site, hdfs-site and mapred-site.

One strange issue kept bothering me. http://localhost:50030/jobtracker.jsp would show the JobTracker page on the first load, but as soon as I clicked any link on it, the site went down and the browser reported that the connection was refused. So I checked the JobTracker logs.

While the web page was still accessible, the whole log looked like this:

************************************************************/
2009-08-02 02:53:35,126 INFO org.apache.hadoop.mapred.JobTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting JobTracker
STARTUP_MSG:   host = serina/127.0.1.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.0
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/core/branches/branch-0.20 -r 763504; compiled by 'ndaley' on Thu Apr 9 05:18:40 UTC 2009
************************************************************/
2009-08-02 02:53:35,314 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=JobTracker, port=9001
2009-08-02 02:53:40,467 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2009-08-02 02:53:40,655 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50030
2009-08-02 02:53:40,656 INFO org.mortbay.log: jetty-6.1.14
2009-08-02 02:53:40,713 WARN org.mortbay.log: Can't reuse /tmp/Jetty_0_0_0_0_50030_job____yn7qmk, using /tmp/Jetty_0_0_0_0_50030_job____yn7qmk_357387917465878562
2009-08-02 02:53:42,770 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50030

Sometimes more lines appeared, like these TaskTracker entries:

handler 2 on 58412: starting
2009-08-02 02:29:46,588 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 58412: starting
2009-08-02 02:29:46,589 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:58412
2009-08-02 02:29:46,589 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_serina:localhost/127.0.0.1:58412
2009-08-02 02:29:52,308 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.io.IOException: Call to localhost/127.0.0.1:11111 failed on local exception: java.io.IOException: Connection reset by peer

These were followed by the entries logged when the connection-refused problem occurred:

2009-08-02 02:53:42,773 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
2009-08-02 02:53:42,773 INFO org.apache.hadoop.mapred.JobTracker: JobTracker up at: 9001
2009-08-02 02:53:42,773 INFO org.apache.hadoop.mapred.JobTracker: JobTracker webserver: 50030
2009-08-02 02:53:43,040 INFO org.apache.hadoop.mapred.JobTracker: Cleaning up the system directory
2009-08-02 02:53:43,110 INFO org.apache.hadoop.hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hadoop-heavy/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

    at org.apache.hadoop.ipc.Client.call(Client.java:739)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
    at $Proxy4.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy4.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2873)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2755)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2046)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2232)

2009-08-02 02:53:43,110 WARN org.apache.hadoop.hdfs.DFSClient: NotReplicatedYetException sleeping /tmp/hadoop-heavy/mapred/system/jobtracker.info retries left 4
2009-08-02 02:53:43,516 INFO org.apache.hadoop.hdfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /tmp/hadoop-heavy/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1256)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)

......

I searched the internet. The only plausible fix I found was to change the port of the web interface, because "Address already in use" is sometimes logged as well. However, the problem came back even after I changed the port, although the web page might come up correctly the first time.

Then I found the bug. Cloudera's page says that the "start-all" and "stop-all" commands are deprecated. The start-all script looks like this:

bin=`dirname "$0"`
bin=`cd "$bin"; pwd`

. "$bin"/hadoop-config.sh

# start dfs daemons
"$bin"/start-dfs.sh --config $HADOOP_CONF_DIR

# start mapred daemons
"$bin"/start-mapred.sh --config $HADOOP_CONF_DIR

This actually invokes start-dfs.sh and start-mapred.sh back to back, with no pause in between. However, some time is sometimes needed after starting DFS, because the file system may still be in safe mode. (By the way, this script works fine on my workstation. I think that is because of its far better configuration; the laptop I use for Hadoop is a slow five-year-old machine rescued from someone else's "rubbish bin".)

So what I did was start DFS first and wait a few minutes. Only then did I bring up the mapred daemons. This worked!
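
The manual wait could probably be scripted. What follows is only a sketch, not a tested replacement for start-all.sh. It assumes that HADOOP_HOME points at the Hadoop 0.20 directory and that "hadoop dfsadmin -safemode get" prints "Safe mode is OFF" once the NameNode is ready. The names wait_until, safemode_is_off, start_hadoop_staged and DATANODE_GRACE_SECS are made up for this example. Since the log above complains about replication to 0 nodes, leaving safe mode may not be enough on its own, so the sketch keeps a fixed extra pause for the DataNode as well.

```shell
#!/usr/bin/env bash
# Sketch: start DFS, wait until it looks usable, and only then start mapred.

# wait_until TIMEOUT_SECS INTERVAL_SECS CMD [ARGS...]
# Re-runs CMD until it exits 0 (returns 0) or TIMEOUT_SECS have passed (returns 1).
wait_until() {
  local timeout=$1
  local interval=$2
  shift 2
  local deadline=$(( $(date +%s) + timeout ))
  until "$@"; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval"
  done
  return 0
}

# Succeeds once the NameNode reports that safe mode is off.
# Assumption: the "get" output contains the text "Safe mode is OFF".
safemode_is_off() {
  "$HADOOP_HOME"/bin/hadoop dfsadmin -safemode get 2>/dev/null \
    | grep -q 'Safe mode is OFF'
}

# Hypothetical staged replacement for start-all.sh.
start_hadoop_staged() {
  local bin="${HADOOP_HOME:?set HADOOP_HOME first}/bin"

  # start dfs daemons
  "$bin"/start-dfs.sh || return 1

  # poll every 5 seconds, give up after 5 minutes
  if ! wait_until 300 5 safemode_is_off; then
    echo "DFS did not leave safe mode in time" >&2
    return 1
  fi

  # extra pause so the DataNode can register (a guess, tune for your machine)
  sleep "${DATANODE_GRACE_SECS:-30}"

  # start mapred daemons
  "$bin"/start-mapred.sh
}
```

After sourcing this file, running start_hadoop_staged would take the place of start-all.sh. On a fast machine the pause can be shortened, for example with DATANODE_GRACE_SECS=5.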