HDFS Java API Append Operation Exception


java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try.

Background

An introduction to this exception can be found in the post hadoop exceptions.
That post attributes the problem to two configuration properties. The official documentation explains them as follows:

  • dfs.client.block.write.replace-datanode-on-failure.enable

If there is a datanode/network failure in the write pipeline, DFSClient will try to remove the failed datanode from the pipeline and then continue writing with the remaining datanodes. As a result, the number of datanodes in the pipeline is decreased. The feature is to add new datanodes to the pipeline. This is a site-wide property to enable/disable the feature. When the cluster size is extremely small, e.g. 3 nodes or less, cluster administrators may want to set the policy to NEVER in the default configuration file or disable this feature. Otherwise, users may experience an unusually high rate of pipeline failures since it is impossible to find new datanodes for replacement. See also dfs.client.block.write.replace-datanode-on-failure.policy


  • dfs.client.block.write.replace-datanode-on-failure.policy

This property is used only if the value of dfs.client.block.write.replace-datanode-on-failure.enable is true. ALWAYS: always add a new datanode when an existing datanode is removed. NEVER: never add a new datanode. DEFAULT: Let r be the replication number. Let n be the number of existing datanodes. Add a new datanode only if r is greater than or equal to 3 and either (1) floor(r/2) is greater than or equal to n; or (2) r is greater than n and the block is hflushed/appended.

This property takes effect only when dfs.client.block.write.replace-datanode-on-failure.enable is true.
ALWAYS: always add a new datanode when an existing one is removed.
NEVER: never add a new datanode.
DEFAULT: let r be the replication factor and n the number of datanodes remaining in the pipeline; a new datanode is added only when:
r >= 3 && ( floor(r/2) >= n || (r > n && the block is hflushed/appended) )
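For example, on a 3-node cluster with r = 3: if one datanode in an append pipeline fails, n drops to 2. Then r > n and the block is being appended, so the DEFAULT policy demands a replacement datanode; but the only node not already in the pipeline is the one that just failed, so no replacement exists and the client throws the IOException above.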

See hdfs-default.xml for the full property reference.

Solution

The official documentation already points to the fix: change the configuration. Add or modify the following in hdfs-site.xml:

<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>NEVER</value>
</property>

Note that the configuration file must be on the classpath (e.g. in the resources folder) so that Hadoop's Configuration class can read it.
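As a quick sanity check (a sketch, assuming hdfs-site.xml sits at the root of the classpath), the file can be registered and its values inspected:

import org.apache.hadoop.conf.Configuration;

// Sketch: confirm that hdfs-site.xml on the classpath is actually picked up.
Configuration conf = new Configuration();
conf.addResource("hdfs-site.xml"); // the name is resolved against the classpath
// Should print "NEVER" if the file above was found and parsed.
System.out.println(conf.get("dfs.client.block.write.replace-datanode-on-failure.policy"));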

Alternatively, the client's Configuration can be modified directly in code:

import org.apache.hadoop.conf.Configuration;

Configuration config = new Configuration();
config.set("dfs.support.append", "true");
config.set("dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER");
config.set("dfs.client.block.write.replace-datanode-on-failure.enable", "true");
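For context, here is a minimal, self-contained append sketch built around this Configuration. The NameNode URI hdfs://localhost:9000 and the file path /tmp/append-demo.log are placeholders for illustration, not values from the original setup:

import java.net.URI;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendDemo {
    public static void main(String[] args) throws Exception {
        Configuration config = new Configuration();
        config.set("dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER");
        config.set("dfs.client.block.write.replace-datanode-on-failure.enable", "true");

        // Placeholder URI and path; substitute your own NameNode address and file.
        FileSystem fs = FileSystem.get(URI.create("hdfs://localhost:9000"), config);
        Path file = new Path("/tmp/append-demo.log");
        if (!fs.exists(file)) {
            fs.create(file).close(); // append() requires an existing file
        }
        try (FSDataOutputStream out = fs.append(file)) {
            out.write("one more line\n".getBytes(StandardCharsets.UTF_8));
            out.hflush(); // pipeline writes/flushes are where datanode replacement kicks in
        }
        fs.close();
    }
}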

The Puzzle

Puzzlingly, the configuration-file approach did not work for me; only setting the values in code did. A quick search shows others have run into the same thing (see this post).

What I did: put the configuration file in the project's resources folder, register it with Configuration.addDefaultResource("hdfs-site.xml"), and verify with config.get("xxx") that all three properties held the correct values. They did, yet the settings never took effect...
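Roughly, the path that did not work looked like this (a sketch mirroring the steps above):

import org.apache.hadoop.conf.Configuration;

// Register hdfs-site.xml from the classpath as a default resource
// for every Configuration created afterwards...
Configuration.addDefaultResource("hdfs-site.xml");
Configuration config = new Configuration();
// ...and the values do show up when queried, e.g. this prints "true":
System.out.println(config.get("dfs.client.block.write.replace-datanode-on-failure.enable"));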

As someone with a soft spot for configuration files, I reluctantly fell back on setting the values in code (which does work), even though the two approaches really should be equivalent... Since a real cluster will surely have more than 3 datanodes and never hit this scenario, I've set the question aside for now (read: forever). If you have any ideas or opinions, please leave a comment, thanks~~
