nodemanager节点报错Unexpected error starting NodeStatusUpdater

来源:互联网 发布:android软件下载 编辑:程序博客网 时间:2024/05/16 17:10
问题描述:
一台nodemanager节点,出现无法正常启动情况,jps查看,发现nodemanager可以出现一会,过几秒就消失了,
查看日志发现如下信息:
2015-09-10 14:03:53,295 ERROR nodemanager.NodeStatusUpdaterImpl (NodeStatusUpdaterImpl.java:serviceStart(195)) - Unexpected error starting NodeStatusUpdater
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN signal from Resourcemanager ,Registration of NodeManager failed, Message from ResourceManager: Disallowed NodeManager from  dn5, Sending SHUTDOWN signal to the NodeManager.
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:265)
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:190)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:197)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:358)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:404)
2015-09-10 14:03:53,296 INFO  service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl failed in state STARTED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN signal from Resourcemanager ,Registration of NodeManager failed, Message from ResourceManager: Disallowed NodeManager from  dn51.20.bjlt, Sending SHUTDOWN signal to the NodeManager.
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN signal from Resourcemanager ,Registration of NodeManager failed, Message from ResourceManager: Disallowed NodeManager from  dn5, Sending SHUTDOWN signal to the NodeManager.
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:196)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceStart(NodeManager.java:197)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:358)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:404)
Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Recieved SHUTDOWN signal from Resourcemanager ,Registration of NodeManager failed, Message from ResourceManager: Disallowed NodeManager from  dn51.20.bjlt, Sending SHUTDOWN signal to the NodeManager.
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:265)
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:190)

大概的意思是无法向Resourcemanager注册,无法连接。


问题解决:
最后发现yarn.exclude的文件中存在这个节点的hostname,将这个主机从文件中删除,再启动就正常了。
yarn.exclude是yarn节点排除文件,一般在机器有问题下架的时候使用。
0 0