Hadoop Balancer Optimization

Source: Internet  Editor: 程序博客网  Time: 2024/05/20 21:43

Tune the copy bandwidth (this setting applies to the DataNodes; it caps the balancing bandwidth that each individual DataNode may use):

[-setBalancerBandwidth <bandwidth in bytes per second>]
[@rm.tv.hadoop.sohuno.com ~]$ hdfs dfsadmin -setBalancerBandwidth 50000000
Balancer bandwidth is set to 50000000 for nn.tv.hadoop.sohuno.com/10.10.34.89:8020
Balancer bandwidth is set to 50000000 for rm.tv.hadoop.sohuno.com/10.10.34.90:8020
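As a rough sanity check on a value like the one above, the following sketch converts a per-DataNode cap in bytes per second into MiB/s and estimates how long one DataNode would need to move a given amount of data at that cap. The class and method names are hypothetical, and the estimate assumes the bandwidth cap is the only bottleneck.

```java
/** Hypothetical helper: back-of-the-envelope numbers for a balancer bandwidth cap. */
final class BalancerBandwidthEstimate {

    /** Converts a rate in bytes per second to MiB per second. */
    static double toMiBPerSecond(long bytesPerSecond) {
        return bytesPerSecond / (1024.0 * 1024.0);
    }

    /** Seconds needed to move bytesToMove at the given cap, rounded up. */
    static long secondsToMove(long bytesToMove, long bytesPerSecond) {
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("bandwidth must be positive");
        }
        return (bytesToMove + bytesPerSecond - 1) / bytesPerSecond;
    }

    public static void main(String[] args) {
        long cap = 50_000_000L;                  // value passed to -setBalancerBandwidth
        long tenGiB = 10L * 1024 * 1024 * 1024;  // one MAX_SIZE_TO_MOVE worth of data
        System.out.println(toMiBPerSecond(cap) + " MiB/s");
        System.out.println(secondsToMove(tenGiB, cap) + " s to move 10 GiB");
    }
}
```

With a cap of 50000000 bytes per second this works out to roughly 47.7 MiB/s, or about 215 seconds to move 10 GiB from a single DataNode.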

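Optimize the rules that the chooseNodes function uses to select source and destination nodes:
If some machines have very high disk utilization, copy only from the over-utilized nodes.
If newly added machines exist, copy only to the under-utilized nodes.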
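The two rules above can be sketched as a filter over per-node utilization. This is a simplified model, not the balancer's actual chooseNodes code: the class name, method names, and sample numbers are hypothetical, and the average is taken as the simple mean of per-node utilization percentages.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: restrict sources to "over" nodes and targets to "under" nodes. */
final class RestrictedNodeSelection {

    /** Simple mean of per-node utilization percentages (a simplification). */
    static double average(Map<String, Double> utilization) {
        double sum = 0;
        for (double u : utilization.values()) {
            sum += u;
        }
        return utilization.isEmpty() ? 0 : sum / utilization.size();
    }

    /** Sources: only nodes whose utilization exceeds average + threshold. */
    static List<String> sources(Map<String, Double> utilization, double threshold) {
        double avg = average(utilization);
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Double> e : utilization.entrySet()) {
            if (e.getValue() > avg + threshold) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    /** Targets: only nodes whose utilization is below average - threshold. */
    static List<String> targets(Map<String, Double> utilization, double threshold) {
        double avg = average(utilization);
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Double> e : utilization.entrySet()) {
            if (e.getValue() < avg - threshold) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> utilization = new LinkedHashMap<>();
        utilization.put("dn1", 95.0); // nearly full disk
        utilization.put("dn2", 60.0); // slightly above average
        utilization.put("dn3", 50.0); // slightly below average
        utilization.put("dn4", 15.0); // newly added machine
        System.out.println("sources: " + sources(utilization, 10.0));
        System.out.println("targets: " + targets(utilization, 10.0));
    }
}
```

With these sample values the average is 55%, so only dn1 qualifies as a source and only dn4 as a target; the nodes close to the average are left alone.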

MAX_SIZE_TO_MOVE can be increased so that each DataNode copies more data per iteration. The default is 10 GB:

final private static long MAX_SIZE_TO_MOVE = 10*1024*1024*1024L; //10GB
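When changing this constant, note that the trailing L is what keeps the product in long arithmetic. The sketch below (hypothetical class and method names) shows how the same expression behaves without the suffix, and how a larger factor can overflow before the suffix takes effect.

```java
/** Hypothetical demo of int vs. long arithmetic for size constants. */
final class MaxSizeToMoveArithmetic {

    /** Computes a size in bytes with every multiplication done in long. */
    static long gigabytes(long gb) {
        return gb * 1024L * 1024L * 1024L;
    }

    public static void main(String[] args) {
        long original = 10*1024*1024*1024L;     // 10737418240: int part stays small
        int allInt = 10*1024*1024*1024;         // -2147483648: overflows int
        long lateSuffix = 4096*1024*1024*1024L; // 0: 4096*1024*1024 wraps before the L
        long safe = gigabytes(4096);            // 4398046511104

        System.out.println(original);
        System.out.println(allInt);
        System.out.println(lateSuffix);
        System.out.println(safe);
    }
}
```

Putting the long operand first, as in gigabytes(), avoids depending on how large the leading int factors happen to be.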

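Increase the number of parallel copies per DataNode:
In hdfs-site.xml on the machine that launches the balancer, change dfs.datanode.balance.max.concurrent.moves (the default is 5). The same property must also be changed on the source machines; otherwise an exception is reported and the new value does not take effect.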
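A minimal hdfs-site.xml fragment for this setting might look like the following. The value 20 is only an illustrative assumption, not a recommendation from the source, and the same property would need to be present on the source DataNodes as well.

```xml
<!-- hdfs-site.xml: illustrative value only -->
<property>
  <name>dfs.datanode.balance.max.concurrent.moves</name>
  <value>20</value>
</property>
```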

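Because of the concurrent-move limit described above (dfs.datanode.balance.max.concurrent.moves), a balancer thread cannot keep running once the DataNode's parallel copies reach the cap. There are two workarounds:

Option 1: do not exit when the shouldFetchMoreBlocks() check fails (comment out the entire else branch).

Option 2: increase MAX_NO_PENDING_MOVE_ITERATIONS (the snippet below uses the name MAX_NO_PENDING_BLOCK_ITERATIONS; the default is 5) so that the thread waits long enough for those 5 moves to complete.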

if (shouldFetchMoreBlocks()) {
  // fetch new blocks
  try {
    blocksToReceive -= getBlockList();
    continue;
  } catch (IOException e) {
    LOG.warn("Exception while getting block list", e);
    return;
  }
} else {
  // source node cannot find a pendingBlockToMove, iteration +1
  noPendingBlockIteration++;
  // in case no blocks can be moved for source node's task,
  // jump out of while-loop after 5 iterations.
  if (noPendingBlockIteration >= MAX_NO_PENDING_BLOCK_ITERATIONS) {
    setScheduledSize(0);
  }
}
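To see why raising the iteration limit helps, here is a deliberately simplified model of the else branch above. It is not the balancer's code: the class and method names are hypothetical, and it only models a run of consecutive iterations in which no block can be scheduled.

```java
/** Hypothetical model of the "no pending block" counter in the dispatch loop. */
final class PendingIterationModel {

    /**
     * Returns true if the source is still scheduled after a stall of
     * stallIterations consecutive iterations without a movable block.
     */
    static boolean survivesStall(int stallIterations, int maxNoPendingIterations) {
        int noPendingBlockIteration = 0;
        long scheduledSize = 1; // any positive value keeps the loop alive
        for (int i = 0; i < stallIterations && scheduledSize > 0; i++) {
            noPendingBlockIteration++;
            if (noPendingBlockIteration >= maxNoPendingIterations) {
                scheduledSize = 0; // mirrors setScheduledSize(0): the source gives up
            }
        }
        return scheduledSize > 0;
    }

    public static void main(String[] args) {
        System.out.println(survivesStall(4, 5));  // true: stall shorter than the limit
        System.out.println(survivesStall(5, 5));  // false: limit reached, source gives up
        System.out.println(survivesStall(5, 30)); // true: a larger limit rides out the stall
    }
}
```

In this model a larger limit simply buys more waiting iterations, which is the effect Option 2 relies on while the in-flight moves finish.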

0 0