Summary of Hadoop-related parameters (for production environments)


Everything here is based on hadoop-0.20.205.0; the version will not be mentioned again below.

 

The following parameters are excerpted from the configuration summary of a Hadoop book. Detailed steps and rationale are not given here; for the full discussion, see the book Pro Hadoop.

 

Storage Allocations (parameters affecting storage allocation)

The relevant parameter is dfs.balance.bandwidthPerSec.

The balancer is run with the start-balancer.sh command.
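For reference, dfs.balance.bandwidthPerSec caps the bandwidth, in bytes per second, that each DataNode may spend on rebalancing. A sketch of the hadoop-site.xml entry; the 1 MB/s value is illustrative, not a recommendation:

```xml
<!-- Cap on per-DataNode bandwidth used by the balancer (bytes/sec). -->
<!-- 1048576 bytes/sec = 1 MB/s, an illustrative value. -->
<property>
  <name>dfs.balance.bandwidthPerSec</name>
  <value>1048576</value>
</property>
```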

 

Reserved Disk Space (keeping enough free disk space)

Hadoop Core provides four parameters: two for MapReduce and two for HDFS.

mapred.local.dir.minspacestart

mapred.local.dir.minspacekill

dfs.datanode.du.reserved

dfs.datanode.du.pct
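A sketch of how these four might look in hadoop-site.xml; all values are illustrative assumptions, not tuned recommendations:

```xml
<!-- Free space (bytes) required in mapred.local.dir before the
     TaskTracker accepts new tasks... -->
<property>
  <name>mapred.local.dir.minspacestart</name>
  <value>1073741824</value>  <!-- 1 GB, illustrative -->
</property>
<!-- ...and the threshold below which running tasks are killed. -->
<property>
  <name>mapred.local.dir.minspacekill</name>
  <value>536870912</value>   <!-- 512 MB, illustrative -->
</property>
<!-- Bytes per DataNode volume reserved for non-HDFS use. -->
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>1073741824</value>  <!-- 1 GB, illustrative -->
</property>
<!-- Fraction of a volume's real free space HDFS may count as usable. -->
<property>
  <name>dfs.datanode.du.pct</name>
  <value>0.95</value>        <!-- illustrative fraction -->
</property>
```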

 

Server Pending Connections (size of the IPC connection listen queue)

parameter: ipc.server.listen.queue.size

 

NameNode Threads

parameter: dfs.namenode.handler.count

 

Block Service Threads

parameter: dfs.datanode.handler.count
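All three of the queue/handler parameters above go in hadoop-site.xml; a sketch with assumed values (the comments note the 0.20-era defaults):

```xml
<!-- Depth of the pending-connection queue on each IPC server socket. -->
<property>
  <name>ipc.server.listen.queue.size</name>
  <value>128</value>  <!-- default -->
</property>
<!-- Threads the NameNode uses to serve filesystem RPCs (default 10). -->
<property>
  <name>dfs.namenode.handler.count</name>
  <value>40</value>   <!-- assumed value for a larger cluster -->
</property>
<!-- Threads each DataNode uses to serve block requests (default 3). -->
<property>
  <name>dfs.datanode.handler.count</name>
  <value>8</value>    <!-- assumed value -->
</property>
```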

 

File Descriptors (increasing the file handle limit)

Add a line of the following form to /etc/security/limits.conf:

* hard  nofile 64000

This changes the per-user file descriptor limit to 64000 file descriptors. If you will use a much larger number of file descriptors, you may need to alter the system-wide limit via fs.file-max in /etc/sysctl.conf, as follows:

fs.file-max=64000

For these to take effect: the former applies at the next login; the latter requires a reboot (or reloading the settings with sysctl -p).
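Both limits can be checked from a shell before and after the change; a read-only check (Linux only):

```shell
# Current per-process file descriptor limit for this session:
ulimit -n
# Current system-wide file descriptor limit:
cat /proc/sys/fs/file-max
```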

 

Also mount the data filesystems with the noatime and nodiratime options; for detailed settings, see "Using noatime on Linux to improve filesystem performance". This change applies to the DataNodes.
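A sketch of an /etc/fstab entry for a DataNode data disk with these mount options; the device and mount point are hypothetical:

```
# device     mount point  type  options                      dump pass
/dev/sdb1    /data/dfs    ext3  defaults,noatime,nodiratime  0    2
```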

 

The Secondary NameNode's metadata is generally best placed on its own dedicated disk. (The safest arrangement is to put the NameNode and Secondary NameNode on two separate machines.)

The relevant parameters are:

fs.checkpoint.dir     fs.checkpoint.period     fs.checkpoint.size

The secondary NameNode periodically (every fs.checkpoint.period seconds) requests a checkpoint from the NameNode. At that point, the NameNode closes the current edit log and starts a new one.
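A sketch of the three checkpoint settings in hadoop-site.xml; the directory is a hypothetical path on a dedicated disk, and the other two values are the usual defaults:

```xml
<property>
  <name>fs.checkpoint.dir</name>
  <value>/data1/checkpoint</value>  <!-- hypothetical dedicated disk -->
</property>
<!-- Seconds between checkpoints. -->
<property>
  <name>fs.checkpoint.period</name>
  <value>3600</value>
</property>
<!-- Edit-log size (bytes) that triggers a checkpoint early. -->
<property>
  <name>fs.checkpoint.size</name>
  <value>67108864</value>  <!-- 64 MB -->
</property>
```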

 

NameNode Disk I/O Tuning

The relevant parameters are dfs.name.dir and dfs.name.edits.dir: one records the edit (operation) log, the other stores the metadata image.
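Both parameters take a comma-separated list of directories, and every listed directory receives a full copy, so spreading them over separate disks adds redundancy. A sketch with hypothetical mount points:

```xml
<property>
  <name>dfs.name.dir</name>
  <value>/disk1/name,/disk2/name</value>   <!-- hypothetical paths -->
</property>
<property>
  <name>dfs.name.edits.dir</name>
  <value>/disk1/edits,/disk2/edits</value> <!-- hypothetical paths -->
</property>
```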

 

DataNode Disk I/O Tuning

The relevant parameters are dfs.data.dir, dfs.replication, and dfs.block.size.
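A sketch of these settings; listing one directory per physical disk in dfs.data.dir lets the DataNode spread I/O across spindles (the paths and the larger block size are illustrative choices):

```xml
<property>
  <name>dfs.data.dir</name>
  <value>/disk1/data,/disk2/data,/disk3/data</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>3</value>           <!-- default -->
</property>
<property>
  <name>dfs.block.size</name>
  <value>134217728</value>   <!-- 128 MB; default is 64 MB -->
</property>
```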

 

Network I/O Tuning

dfs.datanode.dns.interface   dfs.datanode.dns.nameserver

 

Recovery from Failure (common procedures for recovering from errors)

For the NameNode's disks, RAID 1 is recommended; on DataNodes, RAID is not recommended, but if you must use it, RAID 5 is acceptable.

NameNode Recovery

1、Shut down the secondary NameNode.

2、Copy the contents of the secondary's fs.checkpoint.dir to the NameNode's dfs.name.dir.

3、Copy the contents of the secondary's fs.checkpoint.edits.dir to the NameNode's dfs.name.edits.dir.

4、When the copy completes, you may start the NameNode and restart the secondary NameNode.
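Steps 2 and 3 amount to directory copies. A sketch, using hypothetical stand-in paths (substitute your actual fs.checkpoint.dir and dfs.name.dir values); the mkdir and touch lines only simulate an existing checkpoint so the sketch is self-contained:

```shell
# Hypothetical stand-ins for the configured directories.
CHECKPOINT_DIR=/tmp/secondary/checkpoint   # secondary's fs.checkpoint.dir
NAME_DIR=/tmp/namenode/name                # NameNode's dfs.name.dir

# Simulate a checkpoint left behind by the secondary NameNode.
mkdir -p "$CHECKPOINT_DIR/current" "$NAME_DIR"
: > "$CHECKPOINT_DIR/current/fsimage"

# Step 2: copy the checkpoint contents into the NameNode's name directory.
cp -r "$CHECKPOINT_DIR/." "$NAME_DIR/"
ls "$NAME_DIR/current"
```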

 

Adding new nodes

After new nodes join the cluster, run the start-balancer.sh script to redistribute blocks onto them.

 

Decommissioning nodes

1、Create a file on the NameNode machine with the hostnames or IP addresses of the DataNodes you wish to decommission, say /tmp/nodes_to_decommission. This file should contain one hostname or IP address per line, with standard Unix line endings.

2、Modify the hadoop-site.xml file by adding or updating the following block:

       <property>

       <name>dfs.hosts.exclude</name>

       <value>/tmp/nodes_to_decommission</value>

       </property>

3、Run the following command to start the decommissioning process:

      hadoop dfsadmin -refreshNodes
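Step 1 is a one-liner; the hostnames below are hypothetical examples:

```shell
# One hostname or IP address per line, standard Unix line endings.
printf '%s\n' datanode07.example.com datanode12.example.com \
    > /tmp/nodes_to_decommission
cat /tmp/nodes_to_decommission
```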

 

 Deleted File Recovery

The relevant parameter is fs.trash.interval.
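fs.trash.interval sets how many minutes a deleted file stays in the user's .Trash directory before being purged; 0 (the default) disables the trash entirely. A sketch enabling a 24-hour window (the value is illustrative):

```xml
<property>
  <name>fs.trash.interval</name>
  <value>1440</value>  <!-- minutes; 24 hours, illustrative -->
</property>
```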

 

Data Loss or Corruption

This again concerns handling on the NameNode (even if RAID 0 or RAID 5 was used); the steps are as follows:

1、Archive the data if required.

2、Wipe all of the directories listed in dfs.name.dir.

3、Copy the contents of the fs.checkpoint.dir from the secondary NameNode to the fs.checkpoint.dir on the primary NameNode machine.

4、Run the following NameNode command:

      hadoop namenode -importCheckpoint

 

 

 

 

 

 
