Increasing the handler count can improve the performance of NameNode significantly


This test uses NNThroughputBenchmark to create directories. The dirs parameter is always 10000 and dirsPerDir is always 128. The machine has 8 CPU cores.

Set the handler count to 2.

  <property>
    <name>dfs.namenode.handler.count</name>
    <value>2</value>
  </property>
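Before running a benchmark it can help to confirm which handler count is actually configured. The following is a minimal sketch, assuming the property sits inside the usual `<configuration>` root element of a Hadoop configuration file (typically hdfs-site.xml). The helper name `get_property` and the sample text are illustrative assumptions, not part of Hadoop.

```python
import xml.etree.ElementTree as ET


def get_property(xml_text, name):
    """Return the value of a Hadoop-style <property> by name, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            return prop.findtext("value", default="").strip()
    return None


# Hypothetical sample; the <configuration> wrapper is assumed, not shown in the post.
sample = """<configuration>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>2</value>
  </property>
</configuration>"""

print(get_property(sample, "dfs.namenode.handler.count"))
```

The value comes back as a string, so convert it with `int()` before doing arithmetic on it.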

Test with 2 threads. Elapsed Time: 102022 ms.

[houzhizhen@localhost hadoop]$ hadoop jar share/hadoop/hdfs/hadoop-hdfs-2.8.2-tests.jar org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -op mkdirs -threads 2 -dirs 10000 -dirsPerDir 128
17/11/13 11:45:10 INFO namenode.NNThroughputBenchmark: Starting benchmark: mkdirs
17/11/13 11:45:10 INFO namenode.NNThroughputBenchmark: Generate 10000 inputs for mkdirs
17/11/13 11:45:10 FATAL namenode.NNThroughputBenchmark: Log level = ERROR
17/11/13 11:45:10 INFO namenode.NNThroughputBenchmark: Starting 10000 mkdirs(s).
17/11/13 11:46:52 INFO namenode.NNThroughputBenchmark: 
17/11/13 11:46:52 INFO namenode.NNThroughputBenchmark: --- mkdirs inputs ---
17/11/13 11:46:52 INFO namenode.NNThroughputBenchmark: nrDirs = 10000
17/11/13 11:46:52 INFO namenode.NNThroughputBenchmark: nrThreads = 2
17/11/13 11:46:52 INFO namenode.NNThroughputBenchmark: nrDirsPerDir = 128
17/11/13 11:46:52 INFO namenode.NNThroughputBenchmark: --- mkdirs stats  ---
17/11/13 11:46:52 INFO namenode.NNThroughputBenchmark: # operations: 10000
17/11/13 11:46:52 INFO namenode.NNThroughputBenchmark: Elapsed Time: 102022
17/11/13 11:46:52 INFO namenode.NNThroughputBenchmark:  Ops per sec: 98.01807453294387
17/11/13 11:46:52 INFO namenode.NNThroughputBenchmark: Average Time: 20
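The stats at the end of each run can be pulled out of the output with a short script, which makes the runs easier to compare. This is a minimal sketch, assuming the output has one log entry per line. The regular expression and the helper name `parse_stats` are assumptions made for illustration, and the sample lines are copied from the 2-thread run. It also recomputes Ops per sec as operations divided by elapsed seconds, treating Elapsed Time as milliseconds, which matches the roughly 102 seconds between the timestamps.

```python
import re

# Matches stat lines such as "... NNThroughputBenchmark: Elapsed Time: 102022".
STAT_RE = re.compile(
    r"NNThroughputBenchmark:\s*"
    r"(# operations|Elapsed Time|Ops per sec|Average Time):\s*([0-9.]+)\s*$"
)


def parse_stats(log_text):
    """Collect the numeric stat lines of one benchmark run into a dict."""
    stats = {}
    for line in log_text.splitlines():
        match = STAT_RE.search(line)
        if match:
            stats[match.group(1)] = float(match.group(2))
    return stats


sample = """\
17/11/13 11:46:52 INFO namenode.NNThroughputBenchmark: # operations: 10000
17/11/13 11:46:52 INFO namenode.NNThroughputBenchmark: Elapsed Time: 102022
17/11/13 11:46:52 INFO namenode.NNThroughputBenchmark:  Ops per sec: 98.01807453294387
17/11/13 11:46:52 INFO namenode.NNThroughputBenchmark: Average Time: 20
"""

stats = parse_stats(sample)
# Elapsed Time is treated as milliseconds, so divide by 1000 to get seconds.
recomputed = stats["# operations"] / (stats["Elapsed Time"] / 1000.0)
print(stats)
print(round(recomputed, 2))  # 98.02, matching the reported Ops per sec
```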

Test with 20 threads. Elapsed Time: 102154 ms.

This indicates that increasing the number of client threads does not improve performance while the handler count stays at 2.

[houzhizhen@localhost hadoop]$ hadoop jar share/hadoop/hdfs/hadoop-hdfs-2.8.2-tests.jar org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -op mkdirs -threads 20 -dirs 10000 -dirsPerDir 128
17/11/13 11:47:33 INFO namenode.NNThroughputBenchmark: Starting benchmark: mkdirs
17/11/13 11:47:33 INFO namenode.NNThroughputBenchmark: Generate 10000 inputs for mkdirs
17/11/13 11:47:33 FATAL namenode.NNThroughputBenchmark: Log level = ERROR
17/11/13 11:47:33 INFO namenode.NNThroughputBenchmark: Starting 10000 mkdirs(s).
17/11/13 11:49:15 INFO namenode.NNThroughputBenchmark: 
17/11/13 11:49:15 INFO namenode.NNThroughputBenchmark: --- mkdirs inputs ---
17/11/13 11:49:15 INFO namenode.NNThroughputBenchmark: nrDirs = 10000
17/11/13 11:49:15 INFO namenode.NNThroughputBenchmark: nrThreads = 20
17/11/13 11:49:15 INFO namenode.NNThroughputBenchmark: nrDirsPerDir = 128
17/11/13 11:49:15 INFO namenode.NNThroughputBenchmark: --- mkdirs stats  ---
17/11/13 11:49:15 INFO namenode.NNThroughputBenchmark: # operations: 10000
17/11/13 11:49:15 INFO namenode.NNThroughputBenchmark: Elapsed Time: 102154
17/11/13 11:49:15 INFO namenode.NNThroughputBenchmark:  Ops per sec: 97.89141883822464
17/11/13 11:49:15 INFO namenode.NNThroughputBenchmark: Average Time: 204

Set the handler count to 20 and restart the NameNode.

  <property>
    <name>dfs.namenode.handler.count</name>
    <value>20</value>
  </property>

Test with 20 threads. Elapsed Time: 9799 ms.

This indicates that performance improved about 10 times compared with a handler count of 2 (about 1020 ops per second instead of about 98).

hadoop jar share/hadoop/hdfs/hadoop-hdfs-2.8.2-tests.jar org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -op mkdirs -threads 20 -dirs 10000 -dirsPerDir 128
17/11/13 11:50:38 INFO namenode.NNThroughputBenchmark: Starting benchmark: mkdirs
17/11/13 11:50:38 INFO namenode.NNThroughputBenchmark: Generate 10000 inputs for mkdirs
17/11/13 11:50:38 FATAL namenode.NNThroughputBenchmark: Log level = ERROR
17/11/13 11:50:38 INFO namenode.NNThroughputBenchmark: Starting 10000 mkdirs(s).
17/11/13 11:50:47 INFO namenode.NNThroughputBenchmark: 
17/11/13 11:50:47 INFO namenode.NNThroughputBenchmark: --- mkdirs inputs ---
17/11/13 11:50:47 INFO namenode.NNThroughputBenchmark: nrDirs = 10000
17/11/13 11:50:47 INFO namenode.NNThroughputBenchmark: nrThreads = 20
17/11/13 11:50:47 INFO namenode.NNThroughputBenchmark: nrDirsPerDir = 128
17/11/13 11:50:47 INFO namenode.NNThroughputBenchmark: --- mkdirs stats  ---
17/11/13 11:50:47 INFO namenode.NNThroughputBenchmark: # operations: 10000
17/11/13 11:50:47 INFO namenode.NNThroughputBenchmark: Elapsed Time: 9799
17/11/13 11:50:47 INFO namenode.NNThroughputBenchmark:  Ops per sec: 1020.512297173181
17/11/13 11:50:47 INFO namenode.NNThroughputBenchmark: Average Time: 19

Test with 30 threads. Elapsed Time: 9695 ms.

This indicates that with 20 handlers, performance with 30 threads is essentially the same as with 20 threads.

[houzhizhen@localhost hadoop]$ hadoop jar share/hadoop/hdfs/hadoop-hdfs-2.8.2-tests.jar org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -op mkdirs -threads 30 -dirs 10000 -dirsPerDir 128
17/11/13 11:52:32 INFO namenode.NNThroughputBenchmark: Starting benchmark: mkdirs
17/11/13 11:52:32 INFO namenode.NNThroughputBenchmark: Generate 10000 inputs for mkdirs
17/11/13 11:52:32 FATAL namenode.NNThroughputBenchmark: Log level = ERROR
17/11/13 11:52:32 INFO namenode.NNThroughputBenchmark: Starting 10000 mkdirs(s).
17/11/13 11:52:42 INFO namenode.NNThroughputBenchmark: 
17/11/13 11:52:42 INFO namenode.NNThroughputBenchmark: --- mkdirs inputs ---
17/11/13 11:52:42 INFO namenode.NNThroughputBenchmark: nrDirs = 10000
17/11/13 11:52:42 INFO namenode.NNThroughputBenchmark: nrThreads = 30
17/11/13 11:52:42 INFO namenode.NNThroughputBenchmark: nrDirsPerDir = 128
17/11/13 11:52:42 INFO namenode.NNThroughputBenchmark: --- mkdirs stats  ---
17/11/13 11:52:42 INFO namenode.NNThroughputBenchmark: # operations: 10000
17/11/13 11:52:42 INFO namenode.NNThroughputBenchmark: Elapsed Time: 9695
17/11/13 11:52:42 INFO namenode.NNThroughputBenchmark:  Ops per sec: 1031.4595152140278
17/11/13 11:52:42 INFO namenode.NNThroughputBenchmark: Average Time: 28

Set the handler count to 200 and restart the NameNode.

  <property>
    <name>dfs.namenode.handler.count</name>
    <value>200</value>
  </property>

Test with 20 threads. Elapsed Time: 11288 ms, slightly slower than the 9799 ms measured with 20 handlers and 20 threads.

[houzhizhen@localhost hadoop]$ hadoop jar share/hadoop/hdfs/hadoop-hdfs-2.8.2-tests.jar org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -op mkdirs -threads 20 -dirs 10000 -dirsPerDir 128
17/11/13 14:17:39 INFO namenode.NNThroughputBenchmark: Starting benchmark: mkdirs
17/11/13 14:17:39 INFO namenode.NNThroughputBenchmark: Generate 10000 inputs for mkdirs
17/11/13 14:17:39 FATAL namenode.NNThroughputBenchmark: Log level = ERROR
17/11/13 14:17:39 INFO namenode.NNThroughputBenchmark: Starting 10000 mkdirs(s).
17/11/13 14:17:50 INFO namenode.NNThroughputBenchmark: 
17/11/13 14:17:50 INFO namenode.NNThroughputBenchmark: --- mkdirs inputs ---
17/11/13 14:17:50 INFO namenode.NNThroughputBenchmark: nrDirs = 10000
17/11/13 14:17:50 INFO namenode.NNThroughputBenchmark: nrThreads = 20
17/11/13 14:17:50 INFO namenode.NNThroughputBenchmark: nrDirsPerDir = 128
17/11/13 14:17:50 INFO namenode.NNThroughputBenchmark: --- mkdirs stats  ---
17/11/13 14:17:50 INFO namenode.NNThroughputBenchmark: # operations: 10000
17/11/13 14:17:50 INFO namenode.NNThroughputBenchmark: Elapsed Time: 11288
17/11/13 14:17:50 INFO namenode.NNThroughputBenchmark:  Ops per sec: 885.896527285613
17/11/13 14:17:50 INFO namenode.NNThroughputBenchmark: Average Time: 22

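Test with 200 threads. Elapsed Time: 1285 ms.

This indicates that with 200 handlers, performance improved nearly 10 times when going from 20 to 200 client threads (11288 ms down to 1285 ms, a factor of about 8.8).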

[houzhizhen@localhost hadoop]$ hadoop jar share/hadoop/hdfs/hadoop-hdfs-2.8.2-tests.jar org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -op mkdirs -threads 200 -dirs 10000 -dirsPerDir 128
17/11/13 14:23:27 INFO namenode.NNThroughputBenchmark: Starting benchmark: mkdirs
17/11/13 14:23:27 INFO namenode.NNThroughputBenchmark: Generate 10000 inputs for mkdirs
17/11/13 14:23:27 FATAL namenode.NNThroughputBenchmark: Log level = ERROR
17/11/13 14:23:27 INFO namenode.NNThroughputBenchmark: Starting 10000 mkdirs(s).
17/11/13 14:23:28 INFO namenode.NNThroughputBenchmark: 
17/11/13 14:23:28 INFO namenode.NNThroughputBenchmark: --- mkdirs inputs ---
17/11/13 14:23:28 INFO namenode.NNThroughputBenchmark: nrDirs = 10000
17/11/13 14:23:28 INFO namenode.NNThroughputBenchmark: nrThreads = 200
17/11/13 14:23:28 INFO namenode.NNThroughputBenchmark: nrDirsPerDir = 128
17/11/13 14:23:28 INFO namenode.NNThroughputBenchmark: --- mkdirs stats  ---
17/11/13 14:23:28 INFO namenode.NNThroughputBenchmark: # operations: 10000
17/11/13 14:23:28 INFO namenode.NNThroughputBenchmark: Elapsed Time: 1285
17/11/13 14:23:28 INFO namenode.NNThroughputBenchmark:  Ops per sec: 7782.101167315175
17/11/13 14:23:28 INFO namenode.NNThroughputBenchmark: Average Time: 23
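To compare all six runs side by side, the elapsed times reported above can be turned into throughput and speedup figures. This is a minimal sketch: the elapsed times are taken from the runs in this post, Elapsed Time is treated as milliseconds, and the first run (2 handlers, 2 threads) is used as the baseline, which is a choice made here for illustration.

```python
OPS = 10000  # value of -dirs in every run

# (handler count, client threads, elapsed time in ms), as reported in the runs above
runs = [
    (2, 2, 102022),
    (2, 20, 102154),
    (20, 20, 9799),
    (20, 30, 9695),
    (200, 20, 11288),
    (200, 200, 1285),
]

baseline_ms = runs[0][2]


def ops_per_sec(elapsed_ms):
    """Throughput in operations per second for a run of OPS operations."""
    return OPS / (elapsed_ms / 1000.0)


def speedup(elapsed_ms):
    """How many times faster a run is than the 2-handler, 2-thread baseline."""
    return baseline_ms / elapsed_ms


print(f"{'handlers':>8} {'threads':>7} {'elapsed_ms':>10} {'ops/sec':>10} {'speedup':>8}")
for handlers, threads, elapsed_ms in runs:
    print(
        f"{handlers:>8} {threads:>7} {elapsed_ms:>10} "
        f"{ops_per_sec(elapsed_ms):>10.1f} {speedup(elapsed_ms):>7.1f}x"
    )
```

Read this way, throughput only rises when both the handler count and the number of client threads are large enough. With 200 handlers but only 20 client threads, the run is no faster than with 20 handlers.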

Setting the handler count to 2000 and restarting the NameNode gave no further performance improvement. So it appears safe to set the handler count to about 25 times the number of CPU cores (200 handlers on this 8-core machine).
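The rule of thumb can be written as a one-line calculation. This is a minimal sketch, assuming the factor of 25 observed on this 8-core machine carries over to other hardware, which this test does not establish. The function name `suggested_handler_count` is a hypothetical helper, not a Hadoop API.

```python
import os

# Ratio observed in this test: 200 handlers on an 8-core machine.
HANDLERS_PER_CORE = 25


def suggested_handler_count(cores, per_core=HANDLERS_PER_CORE):
    """Starting value for dfs.namenode.handler.count based on the core count."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return cores * per_core


print(suggested_handler_count(8))  # 200, the value used in this test

# For the machine this script runs on; os.cpu_count() may return None.
local_cores = os.cpu_count() or 1
print(suggested_handler_count(local_cores))
```

Treat the result as a starting point and confirm it with a benchmark run on the target cluster.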
