Fixing a Hadoop NameNode HA node that fails to start

Source: Internet · Editor: 程序博客网 · Time: 2024/05/03 14:23

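The machine that used to be my HA NameNode broke, so I added a new machine to take over as the HA NameNode, changed some configuration, and set up passwordless SSH login. After restarting the cluster I found that the HA NameNode had not come up, so I checked its log and found the following error: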
2017-07-15 19:34:56,231 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /usr/hadoop/hadoop-2.4.1/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:298)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:202)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:891)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:638)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:503)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:559)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:724)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:708)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1358)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1424)
2017-07-15 19:34:56,243 INFO org.mortbay.log: Stopped SelectChannelConnector@liu2:50070
2017-07-15 19:34:56,243 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Stopping NameNode metrics system...
2017-07-15 19:34:56,244 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system stopped.
2017-07-15 19:34:56,244 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system shutdown complete.
2017-07-15 19:34:56,244 FATAL org.apache.hadoop.hdfs.server.namenode.NameNode: Exception in namenode join
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /usr/hadoop/hadoop-2.4.1/tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverStorageDirs(FSImage.java:298)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:202)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:891)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:638)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:503)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:559)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:724)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:708)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1358)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1424)
2017-07-15 19:34:56,246 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2017-07-15 19:34:56,249 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
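
The exception names the NameNode storage directory as missing or inaccessible. A minimal sketch of how that could be checked on the affected node follows. `check_name_dir` is a hypothetical helper written for this post, not a Hadoop command, and the default path is simply the one printed in the log; set `NAME_DIR` if `dfs.namenode.name.dir` points elsewhere.

```shell
#!/bin/sh
# Sketch: report why a NameNode storage directory might be rejected at startup.
# check_name_dir is a hypothetical helper, not part of Hadoop.
check_name_dir() {
  dir="$1"
  # The directory must exist ...
  if [ ! -d "$dir" ]; then
    echo "missing"
    return 1
  fi
  # ... and be readable and searchable by the user running the NameNode.
  if [ ! -r "$dir" ] || [ ! -x "$dir" ]; then
    echo "not accessible"
    return 1
  fi
  # A formatted (or bootstrapped) name directory contains current/VERSION.
  if [ ! -f "$dir/current/VERSION" ]; then
    echo "no current/VERSION (looks unformatted)"
    return 1
  fi
  echo "ok"
}

# Default path taken from the exception in the log; override with NAME_DIR.
NAME_DIR="${NAME_DIR:-/usr/hadoop/hadoop-2.4.1/tmp/dfs/name}"
echo "$NAME_DIR: $(check_name_dir "$NAME_DIR")"
```

On a node with the Hadoop client configured, `hdfs getconf -confKey dfs.namenode.name.dir` prints the configured location; when the property is unset it defaults to `dfs/name` under `hadoop.tmp.dir`, which matches the `tmp/dfs/name` path in the log.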

According to the message, the directory /usr/hadoop/hadoop-2.4.1/tmp/dfs/name cannot be found. I looked, and it really was not there. I then found the following solution on Stack Overflow:

If you have already formatted the NameNode, or are converting a non-HA-enabled cluster to be HA-enabled, you should now copy over the contents of your NameNode metadata directories to the other, unformatted NameNode by running the command “hdfs namenode -bootstrapStandby” on the unformatted NameNode. Running this command will also ensure that the JournalNodes (as configured by dfs.namenode.shared.edits.dir) contain sufficient edits transactions to be able to start both NameNodes.

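Roughly, this means that when adding a new HA NameNode, besides copying the NameNode metadata (the files under the tmp directory) over with scp, you also need to run
hdfs namenode -bootstrapStandby on the new, unformatted NameNode. Going by the quoted text, this command itself copies the metadata directories over from the existing NameNode. Once it finished, I restarted the cluster and HA came up fine.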
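
The sequence described above can be sketched as a small script, under some assumptions that are not in the original post: `recover_standby` and `run` are hypothetical helpers, `nn2` is a placeholder for the new node's NameNode ID (whatever is listed under `dfs.ha.namenodes.<nameservice>`), the Hadoop `bin` and `sbin` directories are on `PATH`, and the other NameNode is running when the bootstrap step executes. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch of the recovery steps; run on the new, unformatted NameNode host.
# DRY_RUN=1 (the default here) prints each command instead of executing it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# recover_standby is a hypothetical helper; $1 is the NameNode ID of the
# new node (an assumption, e.g. nn2), used only for the final state check.
recover_standby() {
  nn_id="$1"
  # Copy the metadata directories over from the already formatted NameNode.
  run hdfs namenode -bootstrapStandby
  # Restart HDFS, as the post does (the author restarted the whole cluster).
  run stop-dfs.sh
  run start-dfs.sh
  # Ask which HA state the new NameNode ended up in.
  run hdfs haadmin -getServiceState "$nn_id"
}

recover_standby "${NN_ID:-nn2}"
```

With the default dry run this prints the four commands in order, which makes it easy to review them before running anything against a real cluster.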
