How to use MapReduce's divide-and-conquer strategy to speed up the KNN algorithm
Source: Internet   Editor: 程序博客网   Time: 2024/06/05 06:52
Cluster environment:
Hadoop 2.4.1 (64-bit) on 6 servers:
hadoop11  NameNode, SecondaryNameNode
hadoop22  ResourceManager
hadoop33  DataNode, NodeManager
hadoop44  DataNode, NodeManager
hadoop55  DataNode, NodeManager
hadoop66  DataNode, NodeManager
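Experiment 1: the training set train.txt contains 245057 samples (3.24 MB) and the test set contains 51444 samples (640 KB); the whole test set is stored in a single file, test.txt.
[root@hadoop11 local]# hadoop fs -lsr /dir6/
-rw-r--r-- 3 root supergroup 3400816 2016-07-17 19:28 /dir6/test.txt
Note: at this point the entire test set is stored in one text file (test.txt), which serves as the input path.
KNN job run log: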
16/07/17 19:32:24 INFO client.RMProxy: Connecting to ResourceManager at hadoop22/10.187.84.51:803216/07/17 19:32:25 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.16/07/17 19:32:25 INFO input.FileInputFormat: Total input paths to process : 116/07/17 19:32:25 INFO mapreduce.JobSubmitter: number of splits:116/07/17 19:32:26 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468752229715_001616/07/17 19:32:26 INFO impl.YarnClientImpl: Submitted application application_1468752229715_001616/07/17 19:32:26 INFO mapreduce.Job: The url to track the job: http://hadoop22:8088/proxy/application_1468752229715_0016/16/07/17 19:32:26 INFO mapreduce.Job: Running job: job_1468752229715_001616/07/17 19:32:32 INFO mapreduce.Job: Job job_1468752229715_0016 running in uber mode : false16/07/17 19:32:32 INFO mapreduce.Job: map 0% reduce 0%16/07/17 19:32:49 INFO mapreduce.Job: map 1% reduce 0%16/07/17 19:33:05 INFO mapreduce.Job: map 2% reduce 0%16/07/17 19:33:20 INFO mapreduce.Job: map 3% reduce 0%16/07/17 19:33:35 INFO mapreduce.Job: map 4% reduce 0%16/07/17 19:33:50 INFO mapreduce.Job: map 5% reduce 0%16/07/17 19:34:02 INFO mapreduce.Job: map 6% reduce 0%16/07/17 19:34:17 INFO mapreduce.Job: map 7% reduce 0%16/07/17 19:34:32 INFO mapreduce.Job: map 8% reduce 0%16/07/17 19:34:47 INFO mapreduce.Job: map 9% reduce 0%16/07/17 19:35:02 INFO mapreduce.Job: map 10% reduce 0%16/07/17 19:35:14 INFO mapreduce.Job: map 11% reduce 0%16/07/17 19:35:29 INFO mapreduce.Job: map 12% reduce 0%16/07/17 19:35:44 INFO mapreduce.Job: map 13% reduce 0%16/07/17 19:35:59 INFO mapreduce.Job: map 14% reduce 0%16/07/17 19:36:12 INFO mapreduce.Job: map 15% reduce 0%16/07/17 19:36:27 INFO mapreduce.Job: map 16% reduce 0%16/07/17 19:36:42 INFO mapreduce.Job: map 17% reduce 0%16/07/17 19:36:57 INFO mapreduce.Job: map 18% reduce 0%16/07/17 19:37:12 INFO mapreduce.Job: map 19% reduce 
0%16/07/17 19:37:27 INFO mapreduce.Job: map 20% reduce 0%16/07/17 19:37:39 INFO mapreduce.Job: map 21% reduce 0%16/07/17 19:37:54 INFO mapreduce.Job: map 22% reduce 0%16/07/17 19:38:09 INFO mapreduce.Job: map 23% reduce 0%16/07/17 19:38:24 INFO mapreduce.Job: map 24% reduce 0%16/07/17 19:38:39 INFO mapreduce.Job: map 25% reduce 0%16/07/17 19:38:51 INFO mapreduce.Job: map 26% reduce 0%16/07/17 19:39:06 INFO mapreduce.Job: map 27% reduce 0%16/07/17 19:39:22 INFO mapreduce.Job: map 28% reduce 0%16/07/17 19:39:37 INFO mapreduce.Job: map 29% reduce 0%16/07/17 19:39:52 INFO mapreduce.Job: map 30% reduce 0%16/07/17 19:40:07 INFO mapreduce.Job: map 31% reduce 0%16/07/17 19:40:22 INFO mapreduce.Job: map 32% reduce 0%16/07/17 19:40:37 INFO mapreduce.Job: map 33% reduce 0%16/07/17 19:40:52 INFO mapreduce.Job: map 34% reduce 0%16/07/17 19:41:04 INFO mapreduce.Job: map 35% reduce 0%16/07/17 19:41:22 INFO mapreduce.Job: map 36% reduce 0%16/07/17 19:41:37 INFO mapreduce.Job: map 37% reduce 0%16/07/17 19:41:52 INFO mapreduce.Job: map 38% reduce 0%16/07/17 19:42:07 INFO mapreduce.Job: map 39% reduce 0%16/07/17 19:42:22 INFO mapreduce.Job: map 40% reduce 0%16/07/17 19:42:37 INFO mapreduce.Job: map 41% reduce 0%16/07/17 19:42:53 INFO mapreduce.Job: map 42% reduce 0%16/07/17 19:43:08 INFO mapreduce.Job: map 43% reduce 0%16/07/17 19:43:23 INFO mapreduce.Job: map 44% reduce 0%16/07/17 19:43:41 INFO mapreduce.Job: map 45% reduce 0%16/07/17 19:43:56 INFO mapreduce.Job: map 46% reduce 0%16/07/17 19:44:12 INFO mapreduce.Job: map 47% reduce 0%16/07/17 19:44:30 INFO mapreduce.Job: map 48% reduce 0%16/07/17 19:44:45 INFO mapreduce.Job: map 49% reduce 0%16/07/17 19:45:00 INFO mapreduce.Job: map 50% reduce 0%16/07/17 19:45:15 INFO mapreduce.Job: map 51% reduce 0%16/07/17 19:45:30 INFO mapreduce.Job: map 52% reduce 0%16/07/17 19:45:48 INFO mapreduce.Job: map 53% reduce 0%16/07/17 19:46:03 INFO mapreduce.Job: map 54% reduce 0%16/07/17 19:46:18 INFO mapreduce.Job: map 55% reduce 0%16/07/17 19:46:33 
INFO mapreduce.Job: map 56% reduce 0%16/07/17 19:46:49 INFO mapreduce.Job: map 57% reduce 0%16/07/17 19:47:07 INFO mapreduce.Job: map 58% reduce 0%16/07/17 19:47:22 INFO mapreduce.Job: map 59% reduce 0%16/07/17 19:47:37 INFO mapreduce.Job: map 60% reduce 0%16/07/17 19:47:55 INFO mapreduce.Job: map 61% reduce 0%16/07/17 19:48:10 INFO mapreduce.Job: map 62% reduce 0%16/07/17 19:48:25 INFO mapreduce.Job: map 63% reduce 0%16/07/17 19:48:43 INFO mapreduce.Job: map 64% reduce 0%16/07/17 19:48:58 INFO mapreduce.Job: map 65% reduce 0%16/07/17 19:49:13 INFO mapreduce.Job: map 66% reduce 0%16/07/17 19:49:28 INFO mapreduce.Job: map 67% reduce 0%16/07/17 19:49:30 INFO mapreduce.Job: map 100% reduce 0%16/07/17 19:49:37 INFO mapreduce.Job: map 100% reduce 100%16/07/17 19:49:38 INFO mapreduce.Job: Job job_1468752229715_0016 completed successfully16/07/17 19:49:39 INFO mapreduce.Job: Counters: 49 File System Counters FILE: Number of bytes read=2892255 FILE: Number of bytes written=5971253 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=4056338 HDFS: Number of bytes written=861195 HDFS: Number of read operations=7 HDFS: Number of large read operations=0 HDFS: Number of write operations=2 Job Counters Launched map tasks=1 Launched reduce tasks=1 Data-local map tasks=1 Total time spent by all maps in occupied slots (ms)=1016177 Total time spent by all reduces in occupied slots (ms)=4948 Total time spent by all map tasks (ms)=1016177 Total time spent by all reduce tasks (ms)=4948 Total vcore-seconds taken by all map tasks=1016177 Total vcore-seconds taken by all reduce tasks=4948 Total megabyte-seconds taken by all map tasks=1040565248 Total megabyte-seconds taken by all reduce tasks=5066752 Map-Reduce Framework Map input records=51444 Map output records=154332 Map output bytes=2583585 Map output materialized bytes=2892255 Input split bytes=103 Combine input records=0 Combine output records=0 
Reduce input groups=51444 Reduce shuffle bytes=2892255 Reduce input records=154332 Reduce output records=51444 Spilled Records=308664 Shuffled Maps =1 Failed Shuffles=0 Merged Map outputs=1 GC time elapsed (ms)=5836 CPU time spent (ms)=1033510 Physical memory (bytes) snapshot=517627904 Virtual memory (bytes) snapshot=1786634240 Total committed heap usage (bytes)=306774016 Shuffle Errors BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0 File Input Format Counters Bytes Read=655419 File Output Format Counters Bytes Written=861195
Statistics:
Accuracy: 51444 51367 (presumably 51367 of the 51444 test samples classified correctly, about 99.85%)
CPU time spent (ms)=1033510
map tasks=1
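The counters in the log above report 51444 map input records and 154332 map output records, which is exactly three output records per test sample, followed by 51444 reduce groups and 51444 reduce output records. That pattern is consistent with a mapper that emits the k = 3 nearest training labels for each test sample and a reducer that takes a majority vote. The original job code is not shown, so the following is only a minimal in-memory sketch of that idea; the function names, the Euclidean distance, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
import math


def knn_map(test_id, features, train, k=3):
    """Map step: emit (test_id, label) for the k nearest training samples."""
    nearest = sorted(train, key=lambda item: math.dist(features, item[0]))[:k]
    for _, label in nearest:
        yield test_id, label


def knn_reduce(test_id, labels):
    """Reduce step: majority vote over the labels emitted for one test sample."""
    return test_id, Counter(labels).most_common(1)[0][0]


def run_job(test, train, k=3):
    # Stand-in for the shuffle phase: group mapper output by test-sample id.
    groups = defaultdict(list)
    for test_id, features in test:
        for key, label in knn_map(test_id, features, train, k):
            groups[key].append(label)
    return dict(knn_reduce(key, labels) for key, labels in groups.items())


# Hypothetical toy data: (feature vector, label) pairs.
train = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
         ((5.0, 5.0), "b"), ((5.1, 4.9), "b"), ((4.8, 5.2), "b")]
test = [(0, (0.05, 0.05)), (1, (5.0, 5.1))]

predictions = run_job(test, train)
print(predictions)
```

Each call to `knn_map` scans the whole training set, so the map phase dominates the cost, which matches the log: the single mapper accounts for almost all of the job's run time.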
Experiment 2: the training set train.txt is unchanged (245057 samples) and the test set still contains 51444 samples, but it is now split across two files, test1.txt (25568) and test2.txt (25857).
[root@hadoop11 local]# hadoop fs -lsr /dir6/
-rw-r--r-- 3 root supergroup 368774 2016-07-17 20:15 /dir6/test1.txt
-rw-r--r-- 3 root supergroup 312210 2016-07-17 20:15 /dir6/test2.txt
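Why two small files lead to two mapper tasks can be sketched as follows. Hadoop's `FileInputFormat` creates splits per file and never merges different files into one split, so each file smaller than the split size yields exactly one split. The sketch below mimics that counting rule for non-empty files. The 128 MB split size is an assumption (the Hadoop 2.x default block size; this cluster's actual setting is not shown), and `count_splits` is a hypothetical helper, not a Hadoop API.

```python
SPLIT_SLOP = 1.1  # FileInputFormat lets the last split be up to 10% larger

# Assumed split size: the Hadoop 2.x default block size of 128 MB.
ASSUMED_SPLIT_SIZE = 128 * 1024 * 1024


def count_splits(file_size, split_size=ASSUMED_SPLIT_SIZE):
    """Count the input splits for one non-empty file of `file_size` bytes."""
    splits = 0
    remaining = file_size
    # Cut full-size splits while clearly more than one split's worth remains.
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    # Whatever is left becomes the final split.
    if remaining > 0:
        splits += 1
    return splits


# File sizes in bytes, taken from the directory listing of Experiment 2.
experiment_2 = {"/dir6/test1.txt": 368774, "/dir6/test2.txt": 312210}

total_splits = sum(count_splits(size) for size in experiment_2.values())
print(total_splits)
```

Under these assumptions the count agrees with the `number of splits:2` line in the job log: the number of mappers here is driven by the number of input files, not by their size.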
KNN job run log:
First, the process listing:
[root@hadoop66 ~]# jps
24659 YarnChild (mapper task)
22777 DataNode
25592 Jps
24660 YarnChild (mapper task)
24557 MRAppMaster
22622 NodeManager
Counter log:
[root@hadoop11 local]# app1.sh16/07/17 20:21:03 INFO client.RMProxy: Connecting to ResourceManager at hadoop22/10.187.84.51:803216/07/17 20:21:03 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.16/07/17 20:21:03 INFO input.FileInputFormat: Total input paths to process : 216/07/17 20:21:03 INFO mapreduce.JobSubmitter: number of splits:216/07/17 20:21:03 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468752229715_001916/07/17 20:21:04 INFO impl.YarnClientImpl: Submitted application application_1468752229715_001916/07/17 20:21:04 INFO mapreduce.Job: The url to track the job: http://hadoop22:8088/proxy/application_1468752229715_0019/16/07/17 20:21:04 INFO mapreduce.Job: Running job: job_1468752229715_001916/07/17 20:21:10 INFO mapreduce.Job: Job job_1468752229715_0019 running in uber mode : false16/07/17 20:21:10 INFO mapreduce.Job: map 0% reduce 0%16/07/17 20:21:21 INFO mapreduce.Job: map 1% reduce 0%16/07/17 20:21:30 INFO mapreduce.Job: map 2% reduce 0%16/07/17 20:21:40 INFO mapreduce.Job: map 3% reduce 0%16/07/17 20:21:46 INFO mapreduce.Job: map 4% reduce 0%16/07/17 20:21:55 INFO mapreduce.Job: map 5% reduce 0%16/07/17 20:22:01 INFO mapreduce.Job: map 6% reduce 0%16/07/17 20:22:10 INFO mapreduce.Job: map 7% reduce 0%16/07/17 20:22:17 INFO mapreduce.Job: map 8% reduce 0%16/07/17 20:22:26 INFO mapreduce.Job: map 9% reduce 0%16/07/17 20:22:35 INFO mapreduce.Job: map 10% reduce 0%16/07/17 20:22:41 INFO mapreduce.Job: map 11% reduce 0%16/07/17 20:22:47 INFO mapreduce.Job: map 12% reduce 0%16/07/17 20:22:56 INFO mapreduce.Job: map 13% reduce 0%16/07/17 20:23:05 INFO mapreduce.Job: map 14% reduce 0%16/07/17 20:23:11 INFO mapreduce.Job: map 15% reduce 0%16/07/17 20:23:17 INFO mapreduce.Job: map 16% reduce 0%16/07/17 20:23:26 INFO mapreduce.Job: map 17% reduce 0%16/07/17 20:23:35 INFO mapreduce.Job: map 18% reduce 0%16/07/17 20:23:41 INFO 
mapreduce.Job: map 19% reduce 0%16/07/17 20:23:50 INFO mapreduce.Job: map 20% reduce 0%16/07/17 20:23:56 INFO mapreduce.Job: map 21% reduce 0%16/07/17 20:24:05 INFO mapreduce.Job: map 22% reduce 0%16/07/17 20:24:11 INFO mapreduce.Job: map 23% reduce 0%16/07/17 20:24:20 INFO mapreduce.Job: map 24% reduce 0%16/07/17 20:24:26 INFO mapreduce.Job: map 25% reduce 0%16/07/17 20:24:35 INFO mapreduce.Job: map 26% reduce 0%16/07/17 20:24:42 INFO mapreduce.Job: map 27% reduce 0%16/07/17 20:24:51 INFO mapreduce.Job: map 28% reduce 0%16/07/17 20:24:57 INFO mapreduce.Job: map 29% reduce 0%16/07/17 20:25:06 INFO mapreduce.Job: map 30% reduce 0%16/07/17 20:25:12 INFO mapreduce.Job: map 31% reduce 0%16/07/17 20:25:21 INFO mapreduce.Job: map 32% reduce 0%16/07/17 20:25:27 INFO mapreduce.Job: map 33% reduce 0%16/07/17 20:25:36 INFO mapreduce.Job: map 34% reduce 0%16/07/17 20:25:42 INFO mapreduce.Job: map 35% reduce 0%16/07/17 20:25:51 INFO mapreduce.Job: map 36% reduce 0%16/07/17 20:25:57 INFO mapreduce.Job: map 37% reduce 0%16/07/17 20:26:06 INFO mapreduce.Job: map 38% reduce 0%16/07/17 20:26:12 INFO mapreduce.Job: map 39% reduce 0%16/07/17 20:26:21 INFO mapreduce.Job: map 40% reduce 0%16/07/17 20:26:30 INFO mapreduce.Job: map 41% reduce 0%16/07/17 20:26:36 INFO mapreduce.Job: map 42% reduce 0%16/07/17 20:26:45 INFO mapreduce.Job: map 43% reduce 0%16/07/17 20:26:51 INFO mapreduce.Job: map 44% reduce 0%16/07/17 20:27:00 INFO mapreduce.Job: map 45% reduce 0%16/07/17 20:27:06 INFO mapreduce.Job: map 46% reduce 0%16/07/17 20:27:15 INFO mapreduce.Job: map 47% reduce 0%16/07/17 20:27:21 INFO mapreduce.Job: map 48% reduce 0%16/07/17 20:27:30 INFO mapreduce.Job: map 49% reduce 0%16/07/17 20:27:36 INFO mapreduce.Job: map 50% reduce 0%16/07/17 20:27:45 INFO mapreduce.Job: map 51% reduce 0%16/07/17 20:27:51 INFO mapreduce.Job: map 52% reduce 0%16/07/17 20:28:01 INFO mapreduce.Job: map 53% reduce 0%16/07/17 20:28:07 INFO mapreduce.Job: map 54% reduce 0%16/07/17 20:28:16 INFO mapreduce.Job: map 
55% reduce 0%16/07/17 20:28:23 INFO mapreduce.Job: map 56% reduce 0%16/07/17 20:28:31 INFO mapreduce.Job: map 57% reduce 0%16/07/17 20:28:38 INFO mapreduce.Job: map 58% reduce 0%16/07/17 20:28:46 INFO mapreduce.Job: map 59% reduce 0%16/07/17 20:28:53 INFO mapreduce.Job: map 60% reduce 0%16/07/17 20:29:02 INFO mapreduce.Job: map 61% reduce 0%16/07/17 20:29:10 INFO mapreduce.Job: map 62% reduce 0%16/07/17 20:29:17 INFO mapreduce.Job: map 63% reduce 0%16/07/17 20:29:26 INFO mapreduce.Job: map 64% reduce 0%16/07/17 20:29:32 INFO mapreduce.Job: map 65% reduce 0%16/07/17 20:29:41 INFO mapreduce.Job: map 66% reduce 0%16/07/17 20:29:42 INFO mapreduce.Job: map 83% reduce 0%16/07/17 20:29:52 INFO mapreduce.Job: map 83% reduce 17%16/07/17 20:29:54 INFO mapreduce.Job: map 100% reduce 17%16/07/17 20:29:55 INFO mapreduce.Job: map 100% reduce 70%16/07/17 20:29:56 INFO mapreduce.Job: map 100% reduce 100%16/07/17 20:29:56 INFO mapreduce.Job: Job job_1468752229715_0019 completed successfully16/07/17 20:29:56 INFO mapreduce.Job: Counters: 49 File System Counters FILE: Number of bytes read=2892255 FILE: Number of bytes written=6064619 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=7482816 HDFS: Number of bytes written=861195 HDFS: Number of read operations=11 HDFS: Number of large read operations=0 HDFS: Number of write operations=2 Job Counters Launched map tasks=2 Launched reduce tasks=1 Data-local map tasks=2 Total time spent by all maps in occupied slots (ms)=1032086 Total time spent by all reduces in occupied slots (ms)=11757 Total time spent by all map tasks (ms)=1032086 Total time spent by all reduce tasks (ms)=11757 Total vcore-seconds taken by all map tasks=1032086 Total vcore-seconds taken by all reduce tasks=11757 Total megabyte-seconds taken by all map tasks=1056856064 Total megabyte-seconds taken by all reduce tasks=12039168 Map-Reduce Framework Map input records=51444 Map output 
records=154332 Map output bytes=2583585 Map output materialized bytes=2892261 Input split bytes=200 Combine input records=0 Combine output records=0 Reduce input groups=51444 Reduce shuffle bytes=2892261 Reduce input records=154332 Reduce output records=51444 Spilled Records=308664 Shuffled Maps =2 Failed Shuffles=0 Merged Map outputs=2 GC time elapsed (ms)=8264 CPU time spent (ms)=1045670 Physical memory (bytes) snapshot=762257408 Virtual memory (bytes) snapshot=2654359552 Total committed heap usage (bytes)=496762880 Shuffle Errors BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0 File Input Format Counters Bytes Read=680984 File Output Format Counters Bytes Written=86119516/07/17 20:29:58 INFO client.RMProxy: Connecting to ResourceManager at hadoop22/10.187.84.51:803216/07/17 20:29:59 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.16/07/17 20:29:59 INFO input.FileInputFormat: Total input paths to process : 116/07/17 20:29:59 INFO mapreduce.JobSubmitter: number of splits:116/07/17 20:29:59 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468752229715_002016/07/17 20:29:59 INFO impl.YarnClientImpl: Submitted application application_1468752229715_002016/07/17 20:30:00 INFO mapreduce.Job: The url to track the job: http://hadoop22:8088/proxy/application_1468752229715_0020/16/07/17 20:30:00 INFO mapreduce.Job: Running job: job_1468752229715_002016/07/17 20:30:05 INFO mapreduce.Job: Job job_1468752229715_0020 running in uber mode : false16/07/17 20:30:05 INFO mapreduce.Job: map 0% reduce 0%16/07/17 20:30:12 INFO mapreduce.Job: map 100% reduce 0%16/07/17 20:30:18 INFO mapreduce.Job: map 100% reduce 100%16/07/17 20:30:18 INFO mapreduce.Job: Job job_1468752229715_0020 completed successfully16/07/17 20:30:18 INFO mapreduce.Job: Counters: 49 File System Counters FILE: Number of bytes read=24 FILE: Number of bytes 
written=186173 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=861298 HDFS: Number of bytes written=12 HDFS: Number of read operations=6 HDFS: Number of large read operations=0 HDFS: Number of write operations=2 Job Counters Launched map tasks=1 Launched reduce tasks=1 Data-local map tasks=1 Total time spent by all maps in occupied slots (ms)=3973 Total time spent by all reduces in occupied slots (ms)=3243 Total time spent by all map tasks (ms)=3973 Total time spent by all reduce tasks (ms)=3243 Total vcore-seconds taken by all map tasks=3973 Total vcore-seconds taken by all reduce tasks=3243 Total megabyte-seconds taken by all map tasks=4068352 Total megabyte-seconds taken by all reduce tasks=3320832 Map-Reduce Framework Map input records=51444 Map output records=1 Map output bytes=16 Map output materialized bytes=24 Input split bytes=103 Combine input records=0 Combine output records=0 Reduce input groups=1 Reduce shuffle bytes=24 Reduce input records=1 Reduce output records=1 Spilled Records=2 Shuffled Maps =1 Failed Shuffles=0 Merged Map outputs=1 GC time elapsed (ms)=70 CPU time spent (ms)=2340 Physical memory (bytes) snapshot=451612672 Virtual memory (bytes) snapshot=1790021632 Total committed heap usage (bytes)=309002240 Shuffle Errors BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0 File Input Format Counters Bytes Read=861195 File Output Format Counters Bytes Written=12
Statistics:
Accuracy: 51444 51367
CPU time spent (ms)=1045670 (the time is longer because creating the mapper tasks takes time, and both mapper tasks ran on the same server, hadoop66)
map tasks=2
Experiment 3: the training set train.txt is unchanged (245057 samples) and the test set still contains 51444 samples, now split across three files: test1.txt (25402), test2.txt (15224) and test3.txt (10818).
[root@hadoop11 local]# hadoop fs -lsr /dir6/
lsr: DEPRECATED: Please use 'ls -R' instead.
-rw-r--r-- 3 root supergroup 128161 2016-07-17 20:54 /dir6/test1.txt
-rw-r--r-- 3 root supergroup 366313 2016-07-17 20:54 /dir6/test2.txt
-rw-r--r-- 3 root supergroup 201566 2016-07-17 20:54 /dir6/test3.txt
First, the process listing:
[root@hadoop33 ~]# jps
26501 Jps
26279 YarnChild (mapper task)
2399 QuorumPeerMain
26280 YarnChild (mapper task)
23800 DataNode
23648 NodeManager
26133 MRAppMaster
[root@hadoop66 ~]# jps
22777 DataNode
26652 Jps
26302 YarnChild (mapper task)
22622 NodeManager
As the listings show, the mapper tasks are now executed by two servers (hadoop33 and hadoop66): divide and conquer!
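Splitting the test set is safe for KNN because each test sample is classified against the full training set independently of the other test samples, which is why the accuracy figure is identical in all three experiments. The sketch below illustrates this on toy data: classifying three partitions concurrently and concatenating the results gives the same labels as one serial pass. Threads merely stand in for mapper containers here (in CPython they will not speed up CPU-bound work), and every name and data point is a hypothetical choice for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
import math


def classify(sample, train, k=3):
    """Label one sample by majority vote among its k nearest training samples."""
    nearest = sorted(train, key=lambda item: math.dist(sample, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def classify_part(part, train):
    """What one mapper would do: classify every sample in its own partition."""
    return [classify(sample, train) for sample in part]


def split_into(items, n):
    """Split items into n contiguous parts whose sizes differ by at most one."""
    size, extra = divmod(len(items), n)
    parts, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        parts.append(items[start:end])
        start = end
    return parts


# Hypothetical toy data: (feature vector, label) pairs and unlabeled samples.
train = [((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
         ((5.0, 5.0), "b"), ((5.1, 4.9), "b"), ((4.8, 5.2), "b")]
test = [(0.05, 0.05), (5.0, 5.1), (0.3, 0.1), (4.7, 5.0), (0.0, 0.4)]

serial = classify_part(test, train)

parts = split_into(test, 3)
with ThreadPoolExecutor(max_workers=3) as pool:
    # pool.map preserves the order of the parts, so results line up with `test`.
    results = list(pool.map(classify_part, parts, [train] * len(parts)))
parallel = [label for part in results for label in part]

print(serial == parallel)
```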
Detailed run log:
[root@hadoop11 local]# app1.sh16/07/17 20:55:17 INFO client.RMProxy: Connecting to ResourceManager at hadoop22/10.187.84.51:803216/07/17 20:55:18 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.16/07/17 20:55:18 INFO input.FileInputFormat: Total input paths to process : 316/07/17 20:55:18 INFO mapreduce.JobSubmitter: number of splits:316/07/17 20:55:18 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468752229715_002116/07/17 20:55:19 INFO impl.YarnClientImpl: Submitted application application_1468752229715_002116/07/17 20:55:19 INFO mapreduce.Job: The url to track the job: http://hadoop22:8088/proxy/application_1468752229715_0021/16/07/17 20:55:19 INFO mapreduce.Job: Running job: job_1468752229715_002116/07/17 20:55:25 INFO mapreduce.Job: Job job_1468752229715_0021 running in uber mode : false16/07/17 20:55:25 INFO mapreduce.Job: map 0% reduce 0%16/07/17 20:55:37 INFO mapreduce.Job: map 1% reduce 0%16/07/17 20:55:40 INFO mapreduce.Job: map 2% reduce 0%16/07/17 20:55:45 INFO mapreduce.Job: map 3% reduce 0%16/07/17 20:55:49 INFO mapreduce.Job: map 4% reduce 0%16/07/17 20:55:54 INFO mapreduce.Job: map 5% reduce 0%16/07/17 20:55:58 INFO mapreduce.Job: map 6% reduce 0%16/07/17 20:56:03 INFO mapreduce.Job: map 7% reduce 0%16/07/17 20:56:07 INFO mapreduce.Job: map 8% reduce 0%16/07/17 20:56:12 INFO mapreduce.Job: map 9% reduce 0%16/07/17 20:56:16 INFO mapreduce.Job: map 10% reduce 0%16/07/17 20:56:20 INFO mapreduce.Job: map 11% reduce 0%16/07/17 20:56:24 INFO mapreduce.Job: map 12% reduce 0%16/07/17 20:56:29 INFO mapreduce.Job: map 13% reduce 0%16/07/17 20:56:33 INFO mapreduce.Job: map 14% reduce 0%16/07/17 20:56:37 INFO mapreduce.Job: map 15% reduce 0%16/07/17 20:56:42 INFO mapreduce.Job: map 16% reduce 0%16/07/17 20:56:47 INFO mapreduce.Job: map 17% reduce 0%16/07/17 20:56:51 INFO mapreduce.Job: map 18% reduce 0%16/07/17 20:56:56 INFO 
mapreduce.Job: map 19% reduce 0%16/07/17 20:57:00 INFO mapreduce.Job: map 20% reduce 0%16/07/17 20:57:05 INFO mapreduce.Job: map 21% reduce 0%16/07/17 20:57:08 INFO mapreduce.Job: map 22% reduce 0%16/07/17 20:57:13 INFO mapreduce.Job: map 23% reduce 0%16/07/17 20:57:18 INFO mapreduce.Job: map 24% reduce 0%16/07/17 20:57:23 INFO mapreduce.Job: map 25% reduce 0%16/07/17 20:57:27 INFO mapreduce.Job: map 26% reduce 0%16/07/17 20:57:32 INFO mapreduce.Job: map 27% reduce 0%16/07/17 20:57:36 INFO mapreduce.Job: map 28% reduce 0%16/07/17 20:57:41 INFO mapreduce.Job: map 29% reduce 0%16/07/17 20:57:45 INFO mapreduce.Job: map 30% reduce 0%16/07/17 20:57:50 INFO mapreduce.Job: map 31% reduce 0%16/07/17 20:57:54 INFO mapreduce.Job: map 32% reduce 0%16/07/17 20:57:59 INFO mapreduce.Job: map 33% reduce 0%16/07/17 20:58:03 INFO mapreduce.Job: map 34% reduce 0%16/07/17 20:58:08 INFO mapreduce.Job: map 35% reduce 0%16/07/17 20:58:12 INFO mapreduce.Job: map 36% reduce 0%16/07/17 20:58:15 INFO mapreduce.Job: map 37% reduce 0%16/07/17 20:58:20 INFO mapreduce.Job: map 38% reduce 0%16/07/17 20:58:24 INFO mapreduce.Job: map 39% reduce 0%16/07/17 20:58:29 INFO mapreduce.Job: map 40% reduce 0%16/07/17 20:58:33 INFO mapreduce.Job: map 41% reduce 0%16/07/17 20:58:38 INFO mapreduce.Job: map 42% reduce 0%16/07/17 20:58:42 INFO mapreduce.Job: map 43% reduce 0%16/07/17 20:58:47 INFO mapreduce.Job: map 44% reduce 0%16/07/17 20:58:51 INFO mapreduce.Job: map 45% reduce 0%16/07/17 20:58:56 INFO mapreduce.Job: map 46% reduce 0%16/07/17 20:59:00 INFO mapreduce.Job: map 58% reduce 0%16/07/17 20:59:06 INFO mapreduce.Job: map 59% reduce 0%16/07/17 20:59:11 INFO mapreduce.Job: map 59% reduce 11%16/07/17 20:59:15 INFO mapreduce.Job: map 60% reduce 11%16/07/17 20:59:21 INFO mapreduce.Job: map 61% reduce 11%16/07/17 20:59:30 INFO mapreduce.Job: map 62% reduce 11%16/07/17 20:59:39 INFO mapreduce.Job: map 63% reduce 11%16/07/17 20:59:48 INFO mapreduce.Job: map 64% reduce 11%16/07/17 20:59:58 INFO 
mapreduce.Job: map 65% reduce 11%16/07/17 21:00:04 INFO mapreduce.Job: map 66% reduce 11%16/07/17 21:00:13 INFO mapreduce.Job: map 67% reduce 11%16/07/17 21:00:23 INFO mapreduce.Job: map 68% reduce 11%16/07/17 21:00:26 INFO mapreduce.Job: map 79% reduce 11%16/07/17 21:00:27 INFO mapreduce.Job: map 79% reduce 22%16/07/17 21:00:35 INFO mapreduce.Job: map 80% reduce 22%16/07/17 21:00:59 INFO mapreduce.Job: map 81% reduce 22%16/07/17 21:01:20 INFO mapreduce.Job: map 82% reduce 22%16/07/17 21:01:44 INFO mapreduce.Job: map 83% reduce 22%16/07/17 21:02:08 INFO mapreduce.Job: map 84% reduce 22%16/07/17 21:02:32 INFO mapreduce.Job: map 85% reduce 22%16/07/17 21:02:56 INFO mapreduce.Job: map 86% reduce 22%16/07/17 21:03:17 INFO mapreduce.Job: map 87% reduce 22%16/07/17 21:03:41 INFO mapreduce.Job: map 88% reduce 22%16/07/17 21:04:06 INFO mapreduce.Job: map 89% reduce 22%16/07/17 21:04:15 INFO mapreduce.Job: map 100% reduce 22%16/07/17 21:04:16 INFO mapreduce.Job: map 100% reduce 90%16/07/17 21:04:17 INFO mapreduce.Job: map 100% reduce 100%16/07/17 21:04:17 INFO mapreduce.Job: Job job_1468752229715_0021 completed successfully16/07/17 21:04:17 INFO mapreduce.Job: Counters: 50 File System Counters FILE: Number of bytes read=2892255 FILE: Number of bytes written=6158011 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=10898788 HDFS: Number of bytes written=861195 HDFS: Number of read operations=15 HDFS: Number of large read operations=0 HDFS: Number of write operations=2 Job Counters Killed map tasks=2 Launched map tasks=5 Launched reduce tasks=1 Data-local map tasks=5 Total time spent by all maps in occupied slots (ms)=1417294 Total time spent by all reduces in occupied slots (ms)=313657 Total time spent by all map tasks (ms)=1417294 Total time spent by all reduce tasks (ms)=313657 Total vcore-seconds taken by all map tasks=1417294 Total vcore-seconds taken by all reduce tasks=313657 Total 
megabyte-seconds taken by all map tasks=1451309056 Total megabyte-seconds taken by all reduce tasks=321184768 Map-Reduce Framework Map input records=51444 Map output records=154332 Map output bytes=2583585 Map output materialized bytes=2892267 Input split bytes=300 Combine input records=0 Combine output records=0 Reduce input groups=51444 Reduce shuffle bytes=2892267 Reduce input records=154332 Reduce output records=51444 Spilled Records=308664 Shuffled Maps =3 Failed Shuffles=0 Merged Map outputs=3 GC time elapsed (ms)=9078 CPU time spent (ms)=1054730 Physical memory (bytes) snapshot=1011130368 Virtual memory (bytes) snapshot=3553914880 Total committed heap usage (bytes)=575209472 Shuffle Errors BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0 File Input Format Counters Bytes Read=696040 File Output Format Counters Bytes Written=86119516/07/17 21:04:19 INFO client.RMProxy: Connecting to ResourceManager at hadoop22/10.187.84.51:803216/07/17 21:04:19 WARN mapreduce.JobSubmitter: Hadoop command-line option parsing not performed. 
Implement the Tool interface and execute your application with ToolRunner to remedy this.16/07/17 21:04:20 INFO input.FileInputFormat: Total input paths to process : 116/07/17 21:04:20 INFO mapreduce.JobSubmitter: number of splits:116/07/17 21:04:20 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1468752229715_002216/07/17 21:04:20 INFO impl.YarnClientImpl: Submitted application application_1468752229715_002216/07/17 21:04:20 INFO mapreduce.Job: The url to track the job: http://hadoop22:8088/proxy/application_1468752229715_0022/16/07/17 21:04:20 INFO mapreduce.Job: Running job: job_1468752229715_002216/07/17 21:04:27 INFO mapreduce.Job: Job job_1468752229715_0022 running in uber mode : false16/07/17 21:04:27 INFO mapreduce.Job: map 0% reduce 0%16/07/17 21:04:33 INFO mapreduce.Job: map 100% reduce 0%16/07/17 21:04:38 INFO mapreduce.Job: map 100% reduce 100%16/07/17 21:04:38 INFO mapreduce.Job: Job job_1468752229715_0022 completed successfully16/07/17 21:04:38 INFO mapreduce.Job: Counters: 49 File System Counters FILE: Number of bytes read=24 FILE: Number of bytes written=186173 FILE: Number of read operations=0 FILE: Number of large read operations=0 FILE: Number of write operations=0 HDFS: Number of bytes read=861298 HDFS: Number of bytes written=12 HDFS: Number of read operations=6 HDFS: Number of large read operations=0 HDFS: Number of write operations=2 Job Counters Launched map tasks=1 Launched reduce tasks=1 Data-local map tasks=1 Total time spent by all maps in occupied slots (ms)=3580 Total time spent by all reduces in occupied slots (ms)=3393 Total time spent by all map tasks (ms)=3580 Total time spent by all reduce tasks (ms)=3393 Total vcore-seconds taken by all map tasks=3580 Total vcore-seconds taken by all reduce tasks=3393 Total megabyte-seconds taken by all map tasks=3665920 Total megabyte-seconds taken by all reduce tasks=3474432 Map-Reduce Framework Map input records=51444 Map output records=1 Map output bytes=16 Map output materialized 
bytes=24 Input split bytes=103 Combine input records=0 Combine output records=0 Reduce input groups=1 Reduce shuffle bytes=24 Reduce input records=1 Reduce output records=1 Spilled Records=2 Shuffled Maps =1 Failed Shuffles=0 Merged Map outputs=1 GC time elapsed (ms)=89 CPU time spent (ms)=2360 Physical memory (bytes) snapshot=435548160 Virtual memory (bytes) snapshot=1775456256 Total committed heap usage (bytes)=310444032 Shuffle Errors BAD_ID=0 CONNECTION=0 IO_ERROR=0 WRONG_LENGTH=0 WRONG_MAP=0 WRONG_REDUCE=0 File Input Format Counters Bytes Read=861195 File Output Format Counters Bytes Written=12
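Statistics:
Accuracy: 51444 51367
CPU time spent (ms)=1054730 (this suggests that divide and conquer is not well suited to very small data volumes, which indirectly shows that Hadoop is meant for big data)
map tasks=3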
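Summary: MapReduce shows the advantage of a cluster progressively as the data volume grows; by running mapper tasks in parallel, it speeds up the processing of big data.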
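One caveat when reading the statistics: the `CPU time spent` counter is summed over all tasks of a job, so it does not shrink when mappers are added. The effect of parallelism shows up in elapsed time instead. The sketch below computes the elapsed time of the KNN job in each experiment from two timestamps copied from the logs above (the `running in uber mode` line and the `completed successfully` line); the helper name and the choice of those two lines as start and end markers are assumptions made for illustration.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%y/%m/%d %H:%M:%S"  # e.g. "16/07/17 19:32:32"


def elapsed_seconds(start, end):
    """Seconds between two timestamps written in the job-log format."""
    delta = datetime.strptime(end, LOG_TIME_FORMAT) - datetime.strptime(start, LOG_TIME_FORMAT)
    return int(delta.total_seconds())


# Per experiment: (job start, job completion, "CPU time spent" in ms),
# all copied from the KNN job logs of Experiments 1 to 3.
runs = {
    1: ("16/07/17 19:32:32", "16/07/17 19:49:38", 1033510),
    2: ("16/07/17 20:21:10", "16/07/17 20:29:56", 1045670),
    3: ("16/07/17 20:55:25", "16/07/17 21:04:17", 1054730),
}

elapsed = {n: elapsed_seconds(start, end) for n, (start, end, _) in runs.items()}

for n, (_, _, cpu_ms) in runs.items():
    print(f"experiment {n}: elapsed {elapsed[n]} s, CPU {cpu_ms / 1000:.0f} s")
```

Read this way, the KNN job took about 17 minutes with one mapper and just under 9 minutes with two or three, while total CPU time stayed roughly constant. Experiment 3 gained nothing over Experiment 2; its counters also report two killed map tasks.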