MapReduce commit job optimization

Source: Internet. Editor: 程序博客网. Time: 2024/04/29 19:12

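We often see a user's job take several more minutes to finish after all of its map and reduce tasks have completed. During this phase the job is mainly committing its output.

MR v2 includes optimizations for this step: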

https://issues.apache.org/jira/browse/MAPREDUCE-4815

https://issues.apache.org/jira/browse/MAPREDUCE-6275

https://issues.apache.org/jira/browse/MAPREDUCE-6280

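As far as I can tell, these features are only available from Hadoop 2.7 onward, and there are still pitfalls.

In the commitJob method of FileOutputCommitter, you can see that the processing logic differs depending on the value of mapreduce.fileoutputcommitter.algorithm.version.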

org/apache/hadoop/mapreduce/lib/output/FileOutputCommitter.java

mapreduce.fileoutputcommitter.algorithm.version
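
Hadoop configuration files express a setting as a `<property>` element with `<name>` and `<value>` children inside `<configuration>`. As a minimal sketch, the snippet below renders such an element for the property named above. The helper name `committer_property` is hypothetical and only for illustration, and where the fragment belongs (for example mapred-site.xml or a per-job configuration) depends on your deployment.

```python
import xml.etree.ElementTree as ET

PROPERTY_NAME = "mapreduce.fileoutputcommitter.algorithm.version"


def committer_property(version: int) -> str:
    """Render a Hadoop-style <property> element for the committer version.

    Hypothetical helper: it only builds the XML text and does not talk
    to Hadoop in any way.
    """
    # The documentation quoted in this post lists 1 and 2 as valid values.
    if version not in (1, 2):
        raise ValueError("algorithm version must be 1 or 2")
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = PROPERTY_NAME
    ET.SubElement(prop, "value").text = str(version)
    return ET.tostring(prop, encoding="unicode")


print(committer_property(2))
```

Validating the value before rendering keeps a typo such as 3 from silently ending up in a configuration file.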

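The description in the official documentation at https://hadoop.apache.org/docs/r2.7.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml is very clear; its text is quoted below.

In summary, one rename step is eliminated. In the old version, commitJob renames a large amount of output serially in a single thread, which by itself takes a long time. In the new version, each task moves its output directly into the final output directory when it commits, so commitJob only has to delete the _temporary directory and write _SUCCESS, which speeds things up considerably.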

The file output committer algorithm version
valid algorithm version number: 1 or 2
default to 1, which is the original algorithm

In algorithm version 1,

1. commitTask will rename directory $joboutput/_temporary/$appAttemptID/_temporary/$taskAttemptID/ to $joboutput/_temporary/$appAttemptID/$taskID/

2. recoverTask will also do a rename $joboutput/_temporary/$appAttemptID/$taskID/ to $joboutput/_temporary/($appAttemptID + 1)/$taskID/

3. commitJob will merge every task output file in $joboutput/_temporary/$appAttemptID/$taskID/ to $joboutput/, then it will delete $joboutput/_temporary/ and write $joboutput/_SUCCESS

It has a performance regression, which is discussed in MAPREDUCE-4815. If a job generates many files to commit then the commitJob method call at the end of the job can take minutes. The commit is single-threaded and waits until all tasks have completed before commencing.

Algorithm version 2 will change the behavior of commitTask, recoverTask, and commitJob.

1. commitTask will rename all files in $joboutput/_temporary/$appAttemptID/_temporary/$taskAttemptID/ to $joboutput/

2. recoverTask actually doesn't require to do anything, but for upgrade from version 1 to version 2 case, it will check if there are any files in $joboutput/_temporary/($appAttemptID - 1)/$taskID/ and rename them to $joboutput/

3. commitJob can simply delete $joboutput/_temporary and write $joboutput/_SUCCESS

This algorithm will reduce the output commit time for large jobs by having the tasks commit directly to the final output directory as they were completing and commitJob had very little to do.
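
To make the difference concrete, here is a small simulation of the commitTask and commitJob steps described above, run against a temporary directory on the local filesystem. It is only a sketch under simplifying assumptions: it is not Hadoop's code, it ignores recoverTask, it assumes each task writes a flat set of files, and every function name and file name in it (`write_task_output`, `commit_task`, `commit_job`, `run_job`, `part-…`) is made up for illustration. It counts how many renames happen in the tasks versus in the final job commit.

```python
import os
import shutil
import tempfile


def write_task_output(job_output, app_attempt, task_attempt, files):
    """Create $joboutput/_temporary/$appAttemptID/_temporary/$taskAttemptID/ with files."""
    work = os.path.join(job_output, "_temporary", str(app_attempt),
                        "_temporary", task_attempt)
    os.makedirs(work)
    for name in files:
        with open(os.path.join(work, name), "w") as f:
            f.write(name)
    return work


def commit_task(job_output, app_attempt, task_id, work, version):
    """Commit one task attempt; return the number of renames performed."""
    if version == 1:
        # v1: one directory rename into $joboutput/_temporary/$appAttemptID/$taskID/
        committed = os.path.join(job_output, "_temporary", str(app_attempt), task_id)
        os.rename(work, committed)
        return 1
    # v2: move every file straight into the final output directory
    renames = 0
    for name in sorted(os.listdir(work)):
        os.rename(os.path.join(work, name), os.path.join(job_output, name))
        renames += 1
    return renames


def commit_job(job_output, app_attempt, version):
    """Commit the job; return the number of renames done in this serial step."""
    renames = 0
    if version == 1:
        attempt_dir = os.path.join(job_output, "_temporary", str(app_attempt))
        for task_id in sorted(os.listdir(attempt_dir)):
            if task_id == "_temporary":
                continue  # leftover working area, not a committed task
            task_dir = os.path.join(attempt_dir, task_id)
            for name in sorted(os.listdir(task_dir)):
                os.rename(os.path.join(task_dir, name),
                          os.path.join(job_output, name))
                renames += 1
    # Both versions finish by deleting _temporary and writing _SUCCESS.
    shutil.rmtree(os.path.join(job_output, "_temporary"))
    open(os.path.join(job_output, "_SUCCESS"), "w").close()
    return renames


def run_job(version, num_tasks=3, files_per_task=2):
    """Run a toy job; return (task renames, job renames, final listing)."""
    job_output = tempfile.mkdtemp()
    try:
        task_renames = 0
        for t in range(num_tasks):
            files = [f"part-{t}-{i}" for i in range(files_per_task)]
            work = write_task_output(job_output, 0, f"attempt_{t}_0", files)
            task_renames += commit_task(job_output, 0, f"task_{t}", work, version)
        job_renames = commit_job(job_output, 0, version)
        return task_renames, job_renames, sorted(os.listdir(job_output))
    finally:
        shutil.rmtree(job_output)


for v in (1, 2):
    print(v, run_job(v)[:2])
```

In this toy model both versions end with the same files in the output directory. With version 1 the per-file renames all happen inside commit_job, which in the real system runs single-threaded after every task has finished, while with version 2 they happen inside the tasks and commit_job performs no renames.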

0 0