Summary of Hadoop-related parameters (for production environments)


Everything here is based on hadoop-0.20.205.0; the version will not be mentioned again below.

 

The following parameters are excerpted from the configuration summary of a Hadoop book. Detailed steps and rationale are not given here; for the full discussion, see the book Pro Hadoop.

 

Storage Allocations (parameters affecting storage allocation)

The relevant parameter is dfs.balance.bandwidthPerSec.

The balancer is run with the start-balancer.sh command.
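For reference, dfs.balance.bandwidthPerSec caps the bandwidth, in bytes per second, that each DataNode may spend on rebalancing. A sketch of the hadoop-site.xml entry; the 1 MB/s value is illustrative, not a recommendation:

```xml
<!-- Cap on per-DataNode bandwidth used by the balancer (bytes/sec). -->
<!-- 1048576 bytes/sec = 1 MB/s, an illustrative value. -->
<property>
  <name>dfs.balance.bandwidthPerSec</name>
  <value>1048576</value>
</property>
```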

 

Reserved Disk Space (keeping enough free disk space)

Hadoop Core provides four parameters: two for MapReduce and two for HDFS.

mapred.local.dir.minspacestart

mapred.local.dir.minspacekill

dfs.datanode.du.reserved

dfs.datanode.du.pct
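A sketch of how these four might look in hadoop-site.xml; all values are illustrative assumptions, not tuned recommendations:

```xml
<!-- Free space (bytes) required in mapred.local.dir before the
     TaskTracker accepts new tasks... -->
<property>
  <name>mapred.local.dir.minspacestart</name>
  <value>1073741824</value>  <!-- 1 GB, illustrative -->
</property>
<!-- ...and the threshold below which running tasks are killed. -->
<property>
  <name>mapred.local.dir.minspacekill</name>
  <value>536870912</value>   <!-- 512 MB, illustrative -->
</property>
<!-- Bytes per DataNode volume reserved for non-HDFS use. -->
<property>
  <name>dfs.datanode.du.reserved</name>
  <value>1073741824</value>  <!-- 1 GB, illustrative -->
</property>
<!-- Fraction of a volume's real free space HDFS may count as usable. -->
<property>
  <name>dfs.datanode.du.pct</name>
  <value>0.95</value>        <!-- illustrative fraction -->
</property>
```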

 

Server Pending Connections (size of the IPC connection listen queue)

parameter: ipc.server.listen.queue.size

 

NameNode Threads

parameter: dfs.namenode.handler.count

 

Block Service Threads

parameter: dfs.datanode.handler.count
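All three of the queue/handler parameters above go in hadoop-site.xml; a sketch with assumed values (the comments note the 0.20-era defaults):

```xml
<!-- Depth of the pending-connection queue on each IPC server socket. -->
<property>
  <name>ipc.server.listen.queue.size</name>
  <value>128</value>  <!-- default -->
</property>
<!-- Threads the NameNode uses to serve filesystem RPCs (default 10). -->
<property>
  <name>dfs.namenode.handler.count</name>
  <value>40</value>   <!-- assumed value for a larger cluster -->
</property>
<!-- Threads each DataNode uses to serve block requests (default 3). -->
<property>
  <name>dfs.datanode.handler.count</name>
  <value>8</value>    <!-- assumed value -->
</property>
```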

 

File Descriptors (increasing the file handle limit)

Add a line of the following form to /etc/security/limits.conf:

* hard  nofile 64000

This changes the per-user file descriptor limit to 64000 file descriptors. If you will use a much larger number of file descriptors, you may need to alter the system-wide limit via fs.file-max in /etc/sysctl.conf, as follows:

fs.file-max=64000

For these to take effect: the former applies at the next login; the latter requires a reboot (or reloading the settings with sysctl -p).
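Both limits can be checked from a shell before and after the change; a read-only check (Linux only):

```shell
# Current per-process file descriptor limit for this session:
ulimit -n
# Current system-wide file descriptor limit:
cat /proc/sys/fs/file-max
```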

 

Also mount the data filesystems with the noatime and nodiratime options; for detailed settings, see "Using noatime on Linux to improve filesystem performance". This change applies to the DataNodes.
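A sketch of an /etc/fstab entry for a DataNode data disk with these mount options; the device and mount point are hypothetical:

```
# device     mount point  type  options                      dump pass
/dev/sdb1    /data/dfs    ext3  defaults,noatime,nodiratime  0    2
```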

 

The Secondary NameNode's metadata is generally best placed on its own dedicated disk. (The safest arrangement is to put the NameNode and Secondary NameNode on two separate machines.)

The relevant parameters are:

fs.checkpoint.dir     fs.checkpoint.period     fs.checkpoint.size

The secondary NameNode periodically (every fs.checkpoint.period seconds) requests a checkpoint from the NameNode. At that point, the NameNode closes the current edit log and starts a new one.
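A sketch of the three checkpoint settings in hadoop-site.xml; the directory is a hypothetical path on a dedicated disk, and the other two values are the usual defaults:

```xml
<property>
  <name>fs.checkpoint.dir</name>
  <value>/data1/checkpoint</value>  <!-- hypothetical dedicated disk -->
</property>
<!-- Seconds between checkpoints. -->
<property>
  <name>fs.checkpoint.period</name>
  <value>3600</value>
</property>
<!-- Edit-log size (bytes) that triggers a checkpoint early. -->
<property>
  <name>fs.checkpoint.size</name>
  <value>67108864</value>  <!-- 64 MB -->
</property>
```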

 

NameNode Disk I/O Tuning

The relevant parameters are dfs.name.dir and dfs.name.edits.dir: one records the edit (operation) log, the other stores the metadata image.
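Both parameters take a comma-separated list of directories, and every listed directory receives a full copy, so spreading them over separate disks adds redundancy. A sketch with hypothetical mount points:

```xml
<property>
  <name>dfs.name.dir</name>
  <value>/disk1/name,/disk2/name</value>   <!-- hypothetical paths -->
</property>
<property>
  <name>dfs.name.edits.dir</name>
  <value>/disk1/edits,/disk2/edits</value> <!-- hypothetical paths -->
</property>
```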

 

DataNode Disk I/O Tuning

The relevant parameters are dfs.data.dir, dfs.replication, and dfs.block.size.
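A sketch of these settings; listing one directory per physical disk in dfs.data.dir lets the DataNode spread I/O across spindles (the paths and the larger block size are illustrative choices):

```xml
<property>
  <name>dfs.data.dir</name>
  <value>/disk1/data,/disk2/data,/disk3/data</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>3</value>           <!-- default -->
</property>
<property>
  <name>dfs.block.size</name>
  <value>134217728</value>   <!-- 128 MB; default is 64 MB -->
</property>
```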

 

Network I/O Tuning

dfs.datanode.dns.interface   dfs.datanode.dns.nameserver

 

Recovery from Failure (common procedures for recovering from errors)

For the NameNode's disks, RAID 1 is recommended; on DataNodes, RAID is not recommended, but if you must use it, RAID 5 is acceptable.

NameNode Recovery

1、Shut down the secondary NameNode.

2、Copy the contents of the secondary's fs.checkpoint.dir to the NameNode's dfs.name.dir.

3、Copy the contents of the secondary's fs.checkpoint.edits.dir to the NameNode's dfs.name.edits.dir.

4、When the copy completes, you may start the NameNode and restart the secondary NameNode.
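Steps 2 and 3 amount to directory copies. A sketch, using hypothetical stand-in paths (substitute your actual fs.checkpoint.dir and dfs.name.dir values); the mkdir and touch lines only simulate an existing checkpoint so the sketch is self-contained:

```shell
# Hypothetical stand-ins for the configured directories.
CHECKPOINT_DIR=/tmp/secondary/checkpoint   # secondary's fs.checkpoint.dir
NAME_DIR=/tmp/namenode/name                # NameNode's dfs.name.dir

# Simulate a checkpoint left behind by the secondary NameNode.
mkdir -p "$CHECKPOINT_DIR/current" "$NAME_DIR"
: > "$CHECKPOINT_DIR/current/fsimage"

# Step 2: copy the checkpoint contents into the NameNode's name directory.
cp -r "$CHECKPOINT_DIR/." "$NAME_DIR/"
ls "$NAME_DIR/current"
```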

 

Adding new nodes

After new nodes join the cluster, run the start-balancer.sh script to redistribute blocks onto them.

 

Decommissioning nodes

1、Create a file on the NameNode machine with the hostnames or IP addresses of the DataNodes you wish to decommission, say /tmp/nodes_to_decommission. This file should contain one hostname or IP address per line, with standard Unix line endings.

2、Modify the hadoop-site.xml file by adding or updating the following block:

       <property>

       <name>dfs.hosts.exclude</name>

       <value>/tmp/nodes_to_decommission</value>

       </property>

3、Run the following command to start the decommissioning process:

      hadoop dfsadmin -refreshNodes
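Step 1 is a one-liner; the hostnames below are hypothetical examples:

```shell
# One hostname or IP address per line, standard Unix line endings.
printf '%s\n' datanode07.example.com datanode12.example.com \
    > /tmp/nodes_to_decommission
cat /tmp/nodes_to_decommission
```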

 

 Deleted File Recovery

The relevant parameter is fs.trash.interval.
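fs.trash.interval sets how many minutes a deleted file stays in the user's .Trash directory before being purged; 0 (the default) disables the trash entirely. A sketch enabling a 24-hour window (the value is illustrative):

```xml
<property>
  <name>fs.trash.interval</name>
  <value>1440</value>  <!-- minutes; 24 hours, illustrative -->
</property>
```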

 

Data Loss or Corruption

This again concerns handling on the NameNode (even if RAID 0 or RAID 5 was used); the steps are as follows:

1、Archive the data if required.

2、Wipe all of the directories listed in dfs.name.dir.

3、Copy the contents of the fs.checkpoint.dir from the secondary NameNode to the fs.checkpoint.dir on the primary NameNode machine.

4、Run the following NameNode command:

      hadoop namenode -importCheckpoint

 

 

 

 

 

 
