Running the Bayes algorithm in Mahout: the steps and a full walkthrough of the errors

Source: Internet. Editor: 程序博客网. Time: 2024/05/16 07:25
Using the Bayes algorithm that ships with Mahout:
To produce the Bayes training input data set, I ran the command below:

mahout org.apache.mahout.classifier.bayes.PrepareTwentyNewsgroups

-p /Examples/20news-bydate-train (the 20news-bydate-train directory inside the 20news-bydate archive you downloaded)

-o /Examples/bayes-train-input (the output directory for the generated training set)

-a org.apache.mahout.vectorizer.DefaultAnalyzer

-c UTF-8

Preprocess the text corpus to produce the training set:
mahout org.apache.mahout.classifier.bayes.PrepareTwentyNewsgroups -p examples/20news-bydate/20news-bydate-train/ -o examples/20news-bydate/bayes-train-input -a org.apache.mahout.vectorizer.DefaultAnalyzer -c UTF-8
Output:
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using HADOOP_HOME=/home/administrator/hadoop-0.20.2
HADOOP_CONF_DIR=/home/administrator/hadoop-0.20.2/conf
MAHOUT-JOB: /home/administrator/hadoop-0.20.2/mahout-distribution-0.6/mahout-examples-0.6-job.jar
13/08/06 15:02:50 WARN driver.MahoutDriver: No org.apache.mahout.classifier.bayes.PrepareTwentyNewsgroups.props found on classpath, will use command-line arguments only
13/08/06 15:03:00 INFO driver.MahoutDriver: Program took 10045 ms (Minutes: 0.16741666666666666)
administrator@Master:~/hadoop-0.20.2/mahout-distribution-0.6$

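To my delight, a bayes-train-input directory appeared under examples/20news-bydate/ in the Mahout installation directory. It holds one file per subdirectory of 20news-bydate-train, named after that subdirectory (that is, after the class label). Each line has the format: class label, followed by the features of one article (all the words in that article).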
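A quick way to check the generated files against the format just described is to count lines per label. The sketch below is only an illustration: the function name count_docs_per_label is made up for this example, and it assumes the label is the first whitespace-separated field on each line, as described above.

```shell
# Hypothetical helper: count documents per class label in the
# generated training files. Assumes each non-empty line is
# "<label> <words of one article>", with the label as the first field.
count_docs_per_label() {
  awk 'NF > 0 { n[$1]++ } END { for (l in n) print l "\t" n[l] }' "$@" | sort
}

# Example call (path taken from the command above):
# count_docs_per_label examples/20news-bydate/bayes-train-input/*
```

If the format assumption holds, the per-label counts should add up to the number of articles in 20news-bydate-train.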

Generate the test set:
administrator@Master:~/hadoop-0.20.2/mahout-distribution-0.6$ mahout org.apache.mahout.classifier.bayes.PrepareTwentyNewsgroups -p examples/20news-bydate/20news-bydate-test/ -o examples/20news-bydate/bayes-test-input -a org.apache.mahout.vectorizer.DefaultAnalyzer -c UTF-8
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using HADOOP_HOME=/home/administrator/hadoop-0.20.2
HADOOP_CONF_DIR=/home/administrator/hadoop-0.20.2/conf
MAHOUT-JOB: /home/administrator/hadoop-0.20.2/mahout-distribution-0.6/mahout-examples-0.6-job.jar
13/08/06 15:55:14 WARN driver.MahoutDriver: No org.apache.mahout.classifier.bayes.PrepareTwentyNewsgroups.props found on classpath, will use command-line arguments only
13/08/06 15:55:21 INFO driver.MahoutDriver: Program took 6404 ms (Minutes: 0.10673333333333333)
administrator@Master:~/hadoop-0.20.2/mahout-distribution-0.6$
Likewise, a bayes-test-input directory appeared under examples/20news-bydate/.
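Since the output files are named after the class labels, one sanity check before training is that the training and test directories contain the same set of file names. The following is a minimal sketch; same_label_files is a hypothetical helper, not part of Mahout.

```shell
# Hypothetical helper: compare the file names (class labels) in two
# output directories. Prints a short verdict and returns non-zero
# when the two listings differ.
same_label_files() {
  train_dir="$1"
  test_dir="$2"
  # cd runs inside the command substitution, so the caller's
  # working directory is left unchanged.
  train_list="$(cd "$train_dir" && ls | sort)"
  test_list="$(cd "$test_dir" && ls | sort)"
  if [ "$train_list" = "$test_list" ]; then
    echo "label files match"
  else
    echo "label files differ"
    return 1
  fi
}

# Example call (paths taken from the commands above):
# same_label_files examples/20news-bydate/bayes-train-input \
#                  examples/20news-bydate/bayes-test-input
```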


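Upload the text sets to HDFS. The attempt recorded below, for bayes-test-input, failed; bayes-train-input, which the training step reads from HDFS, is uploaded the same way: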
administrator@Master:~/hadoop-0.20.2$ bin/hadoop fs -put /home/administrator/hadoop-0.20.2/mahout-distribution-0.6/examples/20news-bydate/bayes-test-input bayes-test-input
put: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /user/administrator/bayes-test-input. Name node is in safe mode.
administrator@Master:~/hadoop-0.20.2$
The upload fails because the NameNode is in safe mode. Leave safe mode with:
administrator@Master:~/hadoop-0.20.2$ bin/hadoop dfsadmin -safemode leave
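The safe-mode check can also be scripted so that safe mode is only forced off when it is actually on. This is a sketch under stated assumptions: HADOOP_BIN and leave_safemode_if_on are names invented for this example, and the script assumes that `dfsadmin -safemode get` reports a line containing "ON" while safe mode is active.

```shell
# HADOOP_BIN is a hypothetical variable naming the hadoop launcher;
# it defaults to the relative path used in the session above.
HADOOP_BIN="${HADOOP_BIN:-bin/hadoop}"

# Hypothetical helper: query safe mode and leave it only if it is on.
leave_safemode_if_on() {
  status="$("$HADOOP_BIN" dfsadmin -safemode get)" || return 1
  echo "$status"
  case "$status" in
    # Assumption: the status line contains "ON" while in safe mode.
    *ON*) "$HADOOP_BIN" dfsadmin -safemode leave ;;
  esac
}

# Example call, run from the Hadoop installation directory:
# leave_safemode_if_on
```

A gentler alternative is `bin/hadoop dfsadmin -safemode wait`, which blocks until the NameNode leaves safe mode on its own instead of forcing it.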
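Start running on Hadoop.
1. Model training: with the training text set uploaded, train the Bayes classifier model on it.
Command options: -i is the input path of the training set (an HDFS path); -o is the output path for the classifier model; -type is the classifier type (bayes here; cbayes is the alternative); -ng is the n-gram size, 1 by default; -source is the location of the data source, HDFS or HBase. The test command later on uses the same options.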
administrator@Master:~/hadoop-0.20.2/mahout-distribution-0.6$ bin/mahout trainclassifier -i bayes-train-input -o newsmodel -type bayes -ng 1 -source hdfs
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using HADOOP_HOME=/home/administrator/hadoop-0.20.2
HADOOP_CONF_DIR=/home/administrator/hadoop-0.20.2/conf
MAHOUT-JOB: /home/administrator/hadoop-0.20.2/mahout-distribution-0.6/mahout-examples-0.6-job.jar
13/08/06 16:32:49 INFO bayes.TrainClassifier: Training Bayes Classifier
13/08/06 16:32:51 INFO bayes.BayesDriver: Reading features...
13/08/06 16:32:53 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/08/06 16:32:59 INFO mapred.FileInputFormat: Total input paths to process : 20
13/08/06 16:33:03 INFO mapred.JobClient: Running job: job_201308061621_0001
13/08/06 16:33:04 INFO mapred.JobClient:  map 0% reduce 0%
13/08/06 16:34:02 INFO mapred.JobClient:  map 1% reduce 0%
13/08/06 16:34:05 INFO mapred.JobClient:  map 4% reduce 0%
13/08/06 16:34:08 INFO mapred.JobClient:  map 6% reduce 0%
13/08/06 16:34:11 INFO mapred.JobClient:  map 7% reduce 0%
13/08/06 16:35:03 INFO mapred.JobClient:  map 10% reduce 0%
13/08/06 16:35:51 INFO mapred.JobClient:  map 11% reduce 3%
13/08/06 16:35:54 INFO mapred.JobClient:  map 16% reduce 3%
13/08/06 16:35:57 INFO mapred.JobClient:  map 18% reduce 3%
13/08/06 16:36:36 INFO mapred.JobClient:  map 20% reduce 3%
13/08/06 16:37:02 INFO mapred.JobClient:  map 20% reduce 6%
13/08/06 16:37:12 INFO mapred.JobClient:  map 21% reduce 6%
13/08/06 16:37:15 INFO mapred.JobClient:  map 26% reduce 6%
13/08/06 16:37:18 INFO mapred.JobClient:  map 28% reduce 6%
13/08/06 16:38:00 INFO mapred.JobClient:  map 30% reduce 6%
13/08/06 16:38:27 INFO mapred.JobClient:  map 30% reduce 8%
13/08/06 16:38:36 INFO mapred.JobClient:  map 31% reduce 8%
13/08/06 16:38:39 INFO mapred.JobClient:  map 32% reduce 8%
13/08/06 16:38:45 INFO mapred.JobClient:  map 34% reduce 10%
13/08/06 16:38:48 INFO mapred.JobClient:  map 35% reduce 10%
13/08/06 16:38:54 INFO mapred.JobClient:  map 38% reduce 10%
13/08/06 16:38:57 INFO mapred.JobClient:  map 39% reduce 10%
13/08/06 16:39:37 INFO mapred.JobClient:  map 40% reduce 10%
13/08/06 16:39:46 INFO mapred.JobClient:  map 40% reduce 11%
13/08/06 16:39:52 INFO mapred.JobClient:  map 41% reduce 11%
13/08/06 16:39:55 INFO mapred.JobClient:  map 44% reduce 11%
13/08/06 16:40:10 INFO mapred.JobClient:  map 44% reduce 13%
13/08/06 16:40:16 INFO mapred.JobClient:  map 45% reduce 13%
13/08/06 16:40:19 INFO mapred.JobClient:  map 47% reduce 13%
13/08/06 16:40:22 INFO mapred.JobClient:  map 49% reduce 13%
13/08/06 16:41:04 INFO mapred.JobClient:  map 50% reduce 13%
13/08/06 16:41:07 INFO mapred.JobClient:  map 50% reduce 15%
13/08/06 16:41:13 INFO mapred.JobClient:  map 51% reduce 15%
13/08/06 16:41:16 INFO mapred.JobClient:  map 52% reduce 15%
13/08/06 16:41:19 INFO mapred.JobClient:  map 55% reduce 15%
13/08/06 16:41:25 INFO mapred.JobClient:  map 55% reduce 16%
13/08/06 16:41:34 INFO mapred.JobClient:  map 56% reduce 16%
13/08/06 16:41:37 INFO mapred.JobClient:  map 59% reduce 16%
13/08/06 16:42:11 INFO mapred.JobClient:  map 60% reduce 16%
13/08/06 16:42:14 INFO mapred.JobClient:  map 61% reduce 18%
13/08/06 16:42:17 INFO mapred.JobClient:  map 64% reduce 18%
13/08/06 16:42:29 INFO mapred.JobClient:  map 65% reduce 20%
13/08/06 16:42:32 INFO mapred.JobClient:  map 66% reduce 20%
13/08/06 16:42:35 INFO mapred.JobClient:  map 69% reduce 20%
13/08/06 16:43:13 INFO mapred.JobClient:  map 69% reduce 21%
13/08/06 16:43:16 INFO mapred.JobClient:  map 70% reduce 21%
13/08/06 16:43:19 INFO mapred.JobClient:  map 72% reduce 21%
13/08/06 16:43:22 INFO mapred.JobClient:  map 75% reduce 21%
13/08/06 16:43:28 INFO mapred.JobClient:  map 75% reduce 23%
13/08/06 16:43:31 INFO mapred.JobClient:  map 77% reduce 23%
13/08/06 16:43:34 INFO mapred.JobClient:  map 79% reduce 23%
13/08/06 16:44:07 INFO mapred.JobClient:  map 79% reduce 25%
13/08/06 16:44:10 INFO mapred.JobClient:  map 80% reduce 25%
13/08/06 16:44:16 INFO mapred.JobClient:  map 82% reduce 25%
13/08/06 16:44:19 INFO mapred.JobClient:  map 84% reduce 25%
13/08/06 16:44:25 INFO mapred.JobClient:  map 84% reduce 26%
13/08/06 16:44:28 INFO mapred.JobClient:  map 86% reduce 26%
13/08/06 16:44:32 INFO mapred.JobClient:  map 88% reduce 26%
13/08/06 16:44:35 INFO mapred.JobClient:  map 89% reduce 26%
13/08/06 16:45:11 INFO mapred.JobClient:  map 89% reduce 28%
13/08/06 16:45:14 INFO mapred.JobClient:  map 92% reduce 28%
13/08/06 16:45:17 INFO mapred.JobClient:  map 94% reduce 28%
13/08/06 16:45:20 INFO mapred.JobClient:  map 94% reduce 30%
13/08/06 16:45:23 INFO mapred.JobClient:  map 96% reduce 30%
13/08/06 16:45:26 INFO mapred.JobClient:  map 99% reduce 30%
13/08/06 16:45:29 INFO mapred.JobClient:  map 100% reduce 30%
13/08/06 16:46:02 INFO mapred.JobClient:  map 100% reduce 31%
13/08/06 16:46:11 INFO mapred.JobClient:  map 100% reduce 33%
13/08/06 16:46:51 INFO mapred.JobClient:  map 100% reduce 66%
13/08/06 16:46:54 INFO mapred.JobClient:  map 100% reduce 67%
13/08/06 16:46:57 INFO mapred.JobClient:  map 100% reduce 68%
13/08/06 16:47:00 INFO mapred.JobClient:  map 100% reduce 71%
13/08/06 16:47:03 INFO mapred.JobClient:  map 100% reduce 76%
13/08/06 16:47:06 INFO mapred.JobClient:  map 100% reduce 80%
13/08/06 16:47:09 INFO mapred.JobClient:  map 100% reduce 85%
13/08/06 16:47:12 INFO mapred.JobClient:  map 100% reduce 88%
13/08/06 16:47:15 INFO mapred.JobClient:  map 100% reduce 93%
13/08/06 16:47:18 INFO mapred.JobClient:  map 100% reduce 98%
13/08/06 16:47:24 INFO mapred.JobClient:  map 100% reduce 100%
13/08/06 16:47:29 INFO mapred.JobClient: Job complete: job_201308061621_0001
13/08/06 16:47:29 INFO mapred.JobClient: Counters: 18
13/08/06 16:47:29 INFO mapred.JobClient:   Job Counters
13/08/06 16:47:29 INFO mapred.JobClient:     Launched reduce tasks=1
13/08/06 16:47:29 INFO mapred.JobClient:     Launched map tasks=20
13/08/06 16:47:29 INFO mapred.JobClient:     Data-local map tasks=20
13/08/06 16:47:29 INFO mapred.JobClient:   FileSystemCounters
13/08/06 16:47:29 INFO mapred.JobClient:     FILE_BYTES_READ=98340386
13/08/06 16:47:29 INFO mapred.JobClient:     HDFS_BYTES_READ=16607880
13/08/06 16:47:29 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=152001515
13/08/06 16:47:29 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=36290159
13/08/06 16:47:29 INFO mapred.JobClient:   Map-Reduce Framework
13/08/06 16:47:29 INFO mapred.JobClient:     Reduce input groups=909038
13/08/06 16:47:29 INFO mapred.JobClient:     Combine output records=1514951
13/08/06 16:47:29 INFO mapred.JobClient:     Map input records=11314
13/08/06 16:47:29 INFO mapred.JobClient:     Reduce shuffle bytes=51998274
13/08/06 16:47:29 INFO mapred.JobClient:     Reduce output records=764892
13/08/06 16:47:29 INFO mapred.JobClient:     Spilled Records=4296888
13/08/06 16:47:29 INFO mapred.JobClient:     Map output bytes=208379978
13/08/06 16:47:29 INFO mapred.JobClient:     Map input bytes=16607880
13/08/06 16:47:29 INFO mapred.JobClient:     Combine input records=6471922
13/08/06 16:47:29 INFO mapred.JobClient:     Map output records=6471922
13/08/06 16:47:29 INFO mapred.JobClient:     Reduce input records=1514951
13/08/06 16:47:29 INFO bayes.BayesDriver: Calculating Tf-Idf...
13/08/06 16:47:30 INFO common.BayesTfIdfDriver: Counts of documents in Each Label
13/08/06 16:47:30 INFO common.BayesTfIdfDriver: {rec.motorcycles=598.0, comp.windows.x=593.0, talk.politics.guns=546.0, talk.politics.mideast=564.0, talk.religion.misc=377.0, rec.sport.baseball=597.0, rec.autos=594.0, rec.sport.hockey=600.0, comp.sys.mac.hardware=578.0, comp.sys.ibm.pc.hardware=590.0, sci.space=593.0, talk.politics.misc=465.0, sci.electronics=591.0, comp.graphics=584.0, sci.crypt=595.0, sci.med=594.0, soc.religion.christian=599.0, alt.atheism=480.0, misc.forsale=585.0, comp.os.ms-windows.misc=591.0}
13/08/06 16:47:30 INFO common.BayesTfIdfDriver: {dataSource=hdfs, alpha_i=1.0, minDf=1, gramSize=1}
13/08/06 16:47:30 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/08/06 16:47:36 INFO mapred.FileInputFormat: Total input paths to process : 3
13/08/06 16:47:37 INFO mapred.JobClient: Running job: job_201308061621_0002
13/08/06 16:47:38 INFO mapred.JobClient:  map 0% reduce 0%
13/08/06 16:48:18 INFO mapred.JobClient:  map 2% reduce 0%
13/08/06 16:48:21 INFO mapred.JobClient:  map 4% reduce 0%
13/08/06 16:48:24 INFO mapred.JobClient:  map 10% reduce 0%
13/08/06 16:48:27 INFO mapred.JobClient:  map 23% reduce 0%
13/08/06 16:48:30 INFO mapred.JobClient:  map 41% reduce 0%
13/08/06 16:48:33 INFO mapred.JobClient:  map 57% reduce 0%
13/08/06 16:48:36 INFO mapred.JobClient:  map 65% reduce 0%
13/08/06 16:48:39 INFO mapred.JobClient:  map 66% reduce 0%
13/08/06 16:50:28 INFO mapred.JobClient:  map 71% reduce 0%
13/08/06 16:50:31 INFO mapred.JobClient:  map 75% reduce 0%
13/08/06 16:50:34 INFO mapred.JobClient:  map 100% reduce 22%
13/08/06 16:50:48 INFO mapred.JobClient:  map 100% reduce 33%
13/08/06 16:50:53 INFO mapred.JobClient:  map 100% reduce 66%
13/08/06 16:50:56 INFO mapred.JobClient:  map 100% reduce 68%
13/08/06 16:50:59 INFO mapred.JobClient:  map 100% reduce 71%
13/08/06 16:51:02 INFO mapred.JobClient:  map 100% reduce 78%
13/08/06 16:51:05 INFO mapred.JobClient:  map 100% reduce 87%
13/08/06 16:51:11 INFO mapred.JobClient:  map 100% reduce 100%
13/08/06 16:51:16 INFO mapred.JobClient: Job complete: job_201308061621_0002
13/08/06 16:51:16 INFO mapred.JobClient: Counters: 18
13/08/06 16:51:16 INFO mapred.JobClient:   Job Counters
13/08/06 16:51:16 INFO mapred.JobClient:     Launched reduce tasks=1
13/08/06 16:51:16 INFO mapred.JobClient:     Launched map tasks=3
13/08/06 16:51:16 INFO mapred.JobClient:     Data-local map tasks=3
13/08/06 16:51:16 INFO mapred.JobClient:   FileSystemCounters
13/08/06 16:51:16 INFO mapred.JobClient:     FILE_BYTES_READ=54811689
13/08/06 16:51:16 INFO mapred.JobClient:     HDFS_BYTES_READ=36289227
13/08/06 16:51:16 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=82217642
13/08/06 16:51:16 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=15719030
13/08/06 16:51:16 INFO mapred.JobClient:   Map-Reduce Framework
13/08/06 16:51:16 INFO mapred.JobClient:     Reduce input groups=310364
13/08/06 16:51:16 INFO mapred.JobClient:     Combine output records=620727
13/08/06 16:51:16 INFO mapred.JobClient:     Map input records=764872
13/08/06 16:51:16 INFO mapred.JobClient:     Reduce shuffle bytes=27405857
13/08/06 16:51:16 INFO mapred.JobClient:     Reduce output records=310364
13/08/06 16:51:16 INFO mapred.JobClient:     Spilled Records=1862180
13/08/06 16:51:16 INFO mapred.JobClient:     Map output bytes=28614848
13/08/06 16:51:16 INFO mapred.JobClient:     Map input bytes=36288930
13/08/06 16:51:16 INFO mapred.JobClient:     Combine input records=764872
13/08/06 16:51:16 INFO mapred.JobClient:     Map output records=764872
13/08/06 16:51:16 INFO mapred.JobClient:     Reduce input records=620727
13/08/06 16:51:16 INFO bayes.BayesDriver: Calculating weight sums for labels and features...
13/08/06 16:51:17 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/08/06 16:51:24 INFO mapred.FileInputFormat: Total input paths to process : 1
13/08/06 16:51:26 INFO mapred.JobClient: Running job: job_201308061621_0003
13/08/06 16:51:27 INFO mapred.JobClient:  map 0% reduce 0%
13/08/06 16:52:04 INFO mapred.JobClient:  map 1% reduce 0%
13/08/06 16:52:07 INFO mapred.JobClient:  map 6% reduce 0%
13/08/06 16:52:10 INFO mapred.JobClient:  map 19% reduce 0%
13/08/06 16:52:13 INFO mapred.JobClient:  map 48% reduce 0%
13/08/06 16:52:16 INFO mapred.JobClient:  map 69% reduce 0%
13/08/06 16:52:19 INFO mapred.JobClient:  map 70% reduce 0%
13/08/06 16:52:40 INFO mapred.JobClient:  map 100% reduce 0%
13/08/06 16:53:04 INFO mapred.JobClient:  map 100% reduce 16%
13/08/06 16:53:07 INFO mapred.JobClient:  map 100% reduce 33%
13/08/06 16:53:10 INFO mapred.JobClient:  map 100% reduce 66%
13/08/06 16:53:13 INFO mapred.JobClient:  map 100% reduce 68%
13/08/06 16:53:16 INFO mapred.JobClient:  map 100% reduce 79%
13/08/06 16:53:22 INFO mapred.JobClient:  map 100% reduce 100%
13/08/06 16:53:27 INFO mapred.JobClient: Job complete: job_201308061621_0003
13/08/06 16:53:27 INFO mapred.JobClient: Counters: 18
13/08/06 16:53:27 INFO mapred.JobClient:   Job Counters
13/08/06 16:53:27 INFO mapred.JobClient:     Launched reduce tasks=1
13/08/06 16:53:27 INFO mapred.JobClient:     Launched map tasks=2
13/08/06 16:53:27 INFO mapred.JobClient:     Data-local map tasks=2
13/08/06 16:53:27 INFO mapred.JobClient:   FileSystemCounters
13/08/06 16:53:27 INFO mapred.JobClient:     FILE_BYTES_READ=11048564
13/08/06 16:53:27 INFO mapred.JobClient:     HDFS_BYTES_READ=15719402
13/08/06 16:53:27 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=16572907
13/08/06 16:53:27 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=4852472
13/08/06 16:53:27 INFO mapred.JobClient:   Map-Reduce Framework
13/08/06 16:53:27 INFO mapred.JobClient:     Reduce input groups=144167
13/08/06 16:53:27 INFO mapred.JobClient:     Combine output records=203280
13/08/06 16:53:27 INFO mapred.JobClient:     Map input records=310363
13/08/06 16:53:27 INFO mapred.JobClient:     Reduce shuffle bytes=2951738
13/08/06 16:53:27 INFO mapred.JobClient:     Reduce output records=144167
13/08/06 16:53:27 INFO mapred.JobClient:     Spilled Records=609840
13/08/06 16:53:27 INFO mapred.JobClient:     Map output bytes=23944888
13/08/06 16:53:27 INFO mapred.JobClient:     Map input bytes=15718807
13/08/06 16:53:27 INFO mapred.JobClient:     Combine input records=931089
13/08/06 16:53:27 INFO mapred.JobClient:     Map output records=931089
13/08/06 16:53:27 INFO mapred.JobClient:     Reduce input records=203280
13/08/06 16:53:27 INFO bayes.BayesDriver: Calculating the weight Normalisation factor for each class...
13/08/06 16:53:27 INFO bayes.BayesThetaNormalizerDriver: Sigma_k for Each Label
13/08/06 16:53:27 INFO bayes.BayesThetaNormalizerDriver: {rec.motorcycles=11052.410414456068, comp.windows.x=9270.849849702043, talk.politics.guns=9887.085461977811, talk.politics.mideast=9921.085799630906, talk.religion.misc=6387.734606501341, rec.sport.baseball=10089.365737565573, rec.autos=10507.829939984464, rec.sport.hockey=9816.652146263117, comp.sys.mac.hardware=9419.518314732954, comp.sys.ibm.pc.hardware=9306.30868647539, sci.space=10982.884216264576, talk.politics.misc=8443.40110713291, sci.electronics=10521.472444424193, comp.graphics=9468.999013523597, sci.crypt=10532.7153495927, sci.med=10870.529167376484, soc.religion.christian=9860.923041518237, alt.atheism=7645.367648715689, misc.forsale=10148.066691014705, comp.os.ms-windows.misc=9237.683674000804}
13/08/06 16:53:27 INFO bayes.BayesThetaNormalizerDriver: Sigma_kSigma_j for each Label and for each Features
13/08/06 16:53:27 INFO bayes.BayesThetaNormalizerDriver: 193370.8833108568
13/08/06 16:53:27 INFO bayes.BayesThetaNormalizerDriver: Vocabulary Count
13/08/06 16:53:27 INFO bayes.BayesThetaNormalizerDriver: 144146.0
13/08/06 16:53:27 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/08/06 16:53:32 INFO mapred.FileInputFormat: Total input paths to process : 1
13/08/06 16:53:34 INFO mapred.JobClient: Running job: job_201308061621_0004
13/08/06 16:53:35 INFO mapred.JobClient:  map 0% reduce 0%
13/08/06 16:54:08 INFO mapred.JobClient:  map 1% reduce 0%
13/08/06 16:54:11 INFO mapred.JobClient:  map 3% reduce 0%
13/08/06 16:54:14 INFO mapred.JobClient:  map 6% reduce 0%
13/08/06 16:54:17 INFO mapred.JobClient:  map 16% reduce 0%
13/08/06 16:54:20 INFO mapred.JobClient:  map 29% reduce 0%
13/08/06 16:54:23 INFO mapred.JobClient:  map 73% reduce 0%
13/08/06 16:54:26 INFO mapred.JobClient:  map 100% reduce 0%
13/08/06 16:54:45 INFO mapred.JobClient:  map 100% reduce 100%
13/08/06 16:54:50 INFO mapred.JobClient: Job complete: job_201308061621_0004
13/08/06 16:54:50 INFO mapred.JobClient: Counters: 18
13/08/06 16:54:50 INFO mapred.JobClient:   Job Counters
13/08/06 16:54:50 INFO mapred.JobClient:     Launched reduce tasks=1
13/08/06 16:54:50 INFO mapred.JobClient:     Launched map tasks=2
13/08/06 16:54:50 INFO mapred.JobClient:     Data-local map tasks=2
13/08/06 16:54:50 INFO mapred.JobClient:   FileSystemCounters
13/08/06 16:54:50 INFO mapred.JobClient:     FILE_BYTES_READ=757
13/08/06 16:54:50 INFO mapred.JobClient:     HDFS_BYTES_READ=15719402
13/08/06 16:54:50 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=1584
13/08/06 16:54:50 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=932
13/08/06 16:54:50 INFO mapred.JobClient:   Map-Reduce Framework
13/08/06 16:54:50 INFO mapred.JobClient:     Reduce input groups=20
13/08/06 16:54:50 INFO mapred.JobClient:     Combine output records=21
13/08/06 16:54:50 INFO mapred.JobClient:     Map input records=310363
13/08/06 16:54:50 INFO mapred.JobClient:     Reduce shuffle bytes=366
13/08/06 16:54:50 INFO mapred.JobClient:     Reduce output records=20
13/08/06 16:54:50 INFO mapred.JobClient:     Spilled Records=42
13/08/06 16:54:50 INFO mapred.JobClient:     Map output bytes=10617979
13/08/06 16:54:50 INFO mapred.JobClient:     Map input bytes=15718807
13/08/06 16:54:50 INFO mapred.JobClient:     Combine input records=310363
13/08/06 16:54:50 INFO mapred.JobClient:     Map output records=310363
13/08/06 16:54:50 INFO mapred.JobClient:     Reduce input records=21
13/08/06 16:54:50 INFO common.HadoopUtil: Deleting newsmodel/trainer-docCount
13/08/06 16:54:50 INFO common.HadoopUtil: Deleting newsmodel/trainer-termDocCount
13/08/06 16:54:50 INFO common.HadoopUtil: Deleting newsmodel/trainer-featureCount
13/08/06 16:54:50 INFO common.HadoopUtil: Deleting newsmodel/trainer-wordFreq
13/08/06 16:54:50 INFO common.HadoopUtil: Deleting newsmodel/trainer-tfIdf/trainer-vocabCount
13/08/06 16:54:50 INFO driver.MahoutDriver: Program took 1321128 ms (Minutes: 22.018816666666666)
administrator@Master:~/hadoop-0.20.2/mahout-distribution-0.6$

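2. Model testing: use the Bayes classifier model trained in the previous step to run a classification test. First upload the test set, then run the classifier test.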
administrator@Master:~/hadoop-0.20.2$ bin/hadoop fs -put /home/administrator/hadoop-0.20.2/mahout-distribution-0.6/examples/20news-bydate/bayes-test-input bayes-test-input
administrator@Master:~/hadoop-0.20.2$ cd mahout-distribution-0.6/
administrator@Master:~/hadoop-0.20.2/mahout-distribution-0.6$ bin/mahout testclassifier -m newsmodel -d bayes-test-input -type bayes -ng 1 -source hdfs -method mapreduce


An error occurred:
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using HADOOP_HOME=/home/administrator/hadoop-0.20.2
HADOOP_CONF_DIR=/home/administrator/hadoop-0.20.2/conf
MAHOUT-JOB: /home/administrator/hadoop-0.20.2/mahout-distribution-0.6/mahout-examples-0.6-job.jar
13/08/06 17:13:10 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/08/06 17:13:13 INFO mapred.FileInputFormat: Total input paths to process : 20
13/08/06 17:13:15 INFO mapred.JobClient: Running job: job_201308061621_0005
13/08/06 17:13:16 INFO mapred.JobClient:  map 0% reduce 0%
13/08/06 17:14:40 INFO mapred.JobClient:  map 1% reduce 0%
13/08/06 17:14:43 INFO mapred.JobClient:  map 2% reduce 0%
13/08/06 17:14:49 INFO mapred.JobClient:  map 3% reduce 0%
13/08/06 17:14:56 INFO mapred.JobClient:  map 4% reduce 0%
13/08/06 17:15:02 INFO mapred.JobClient:  map 5% reduce 0%
13/08/06 17:15:05 INFO mapred.JobClient:  map 6% reduce 0%
13/08/06 17:15:11 INFO mapred.JobClient:  map 7% reduce 0%
13/08/06 17:15:17 INFO mapred.JobClient:  map 8% reduce 0%
13/08/06 17:15:20 INFO mapred.JobClient:  map 9% reduce 0%
13/08/06 17:15:50 INFO mapred.JobClient:  map 9% reduce 3%
13/08/06 17:16:29 INFO mapred.JobClient:  map 10% reduce 3%
13/08/06 17:16:32 INFO mapred.JobClient:  map 11% reduce 3%
13/08/06 17:16:35 INFO mapred.JobClient:  map 13% reduce 3%
13/08/06 17:16:41 INFO mapred.JobClient:  map 14% reduce 3%
13/08/06 17:16:44 INFO mapred.JobClient:  map 15% reduce 3%
13/08/06 17:16:47 INFO mapred.JobClient:  map 17% reduce 3%
13/08/06 17:16:53 INFO mapred.JobClient:  map 19% reduce 3%
13/08/06 17:17:05 INFO mapred.JobClient:  map 19% reduce 6%
13/08/06 17:17:50 INFO mapred.JobClient: Task Id : attempt_201308061621_0005_m_000004_0, Status : FAILED
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 5 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 10 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 13 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.mahout.math.map.OpenIntObjectHashMap.rehash(OpenIntObjectHashMap.java:420)
    at org.apache.mahout.math.map.OpenIntObjectHashMap.put(OpenIntObjectHashMap.java:384)
    at org.apache.mahout.math.SparseMatrix.setQuick(SparseMatrix.java:103)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.loadFeatureWeight(InMemoryBayesDatastore.java:149)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadWeightMatrix(SequenceFileModelReader.java:64)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadModel(SequenceFileModelReader.java:50)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.initialize(InMemoryBayesDatastore.java:72)
    at org.apache.mahout.classifier.bayes.ClassifierContext.initialize(ClassifierContext.java:44)
    at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesClassifierMapper.configure(BayesClassifierMapper.java:121)
    ... 18 more

13/08/06 17:17:51 INFO mapred.JobClient:  map 21% reduce 6%
13/08/06 17:17:54 INFO mapred.JobClient:  map 22% reduce 6%
13/08/06 17:17:57 INFO mapred.JobClient:  map 23% reduce 6%
13/08/06 17:18:03 INFO mapred.JobClient:  map 24% reduce 6%
13/08/06 17:18:15 INFO mapred.JobClient:  map 24% reduce 8%
13/08/06 17:18:45 INFO mapred.JobClient:  map 26% reduce 8%
13/08/06 17:18:51 INFO mapred.JobClient:  map 27% reduce 8%
13/08/06 17:18:54 INFO mapred.JobClient:  map 28% reduce 8%
13/08/06 17:18:56 INFO mapred.JobClient: Task Id : attempt_201308061621_0005_m_000006_0, Status : FAILED
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 5 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 10 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 13 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.mahout.math.map.OpenIntObjectHashMap.rehash(OpenIntObjectHashMap.java:420)
    at org.apache.mahout.math.map.OpenIntObjectHashMap.put(OpenIntObjectHashMap.java:384)
    at org.apache.mahout.math.SparseMatrix.setQuick(SparseMatrix.java:103)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.loadFeatureWeight(InMemoryBayesDatastore.java:149)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadWeightMatrix(SequenceFileModelReader.java:64)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadModel(SequenceFileModelReader.java:50)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.initialize(InMemoryBayesDatastore.java:72)
    at org.apache.mahout.classifier.bayes.ClassifierContext.initialize(ClassifierContext.java:44)
    at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesClassifierMapper.configure(BayesClassifierMapper.java:121)
    ... 18 more

13/08/06 17:18:57 INFO mapred.JobClient:  map 29% reduce 8%
13/08/06 17:19:05 INFO mapred.JobClient:  map 29% reduce 10%
13/08/06 17:19:47 INFO mapred.JobClient:  map 32% reduce 10%
13/08/06 17:19:50 INFO mapred.JobClient:  map 33% reduce 10%
13/08/06 17:19:53 INFO mapred.JobClient:  map 34% reduce 10%
13/08/06 17:19:59 INFO mapred.JobClient: Task Id : attempt_201308061621_0005_m_000007_0, Status : FAILED
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 5 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 10 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 13 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.mahout.math.map.OpenIntObjectHashMap.rehash(OpenIntObjectHashMap.java:419)
    at org.apache.mahout.math.map.OpenIntObjectHashMap.put(OpenIntObjectHashMap.java:384)
    at org.apache.mahout.math.SparseMatrix.setQuick(SparseMatrix.java:103)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.loadFeatureWeight(InMemoryBayesDatastore.java:149)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadWeightMatrix(SequenceFileModelReader.java:64)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadModel(SequenceFileModelReader.java:50)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.initialize(InMemoryBayesDatastore.java:72)
    at org.apache.mahout.classifier.bayes.ClassifierContext.initialize(ClassifierContext.java:44)
    at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesClassifierMapper.configure(BayesClassifierMapper.java:121)
    ... 18 more

13/08/06 17:20:06 INFO mapred.JobClient:  map 34% reduce 11%
13/08/06 17:21:00 INFO mapred.JobClient:  map 35% reduce 11%
13/08/06 17:21:03 INFO mapred.JobClient:  map 36% reduce 11%
13/08/06 17:21:06 INFO mapred.JobClient:  map 37% reduce 11%
13/08/06 17:21:09 INFO mapred.JobClient:  map 38% reduce 11%
13/08/06 17:21:12 INFO mapred.JobClient:  map 39% reduce 11%
13/08/06 17:21:18 INFO mapred.JobClient:  map 40% reduce 11%
13/08/06 17:21:21 INFO mapred.JobClient:  map 43% reduce 11%
13/08/06 17:21:27 INFO mapred.JobClient:  map 44% reduce 11%
13/08/06 17:21:30 INFO mapred.JobClient:  map 44% reduce 13%
13/08/06 17:21:45 INFO mapred.JobClient:  map 44% reduce 15%
13/08/06 17:22:21 INFO mapred.JobClient:  map 45% reduce 15%
13/08/06 17:22:24 INFO mapred.JobClient:  map 46% reduce 15%
13/08/06 17:22:27 INFO mapred.JobClient:  map 47% reduce 15%
13/08/06 17:22:30 INFO mapred.JobClient:  map 48% reduce 15%
13/08/06 17:22:33 INFO mapred.JobClient:  map 49% reduce 15%
13/08/06 17:22:35 INFO mapred.JobClient: Task Id : attempt_201308061621_0005_m_000010_0, Status : FAILED
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 5 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 10 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 13 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.mahout.math.map.OpenIntObjectHashMap.rehash(OpenIntObjectHashMap.java:420)
    at org.apache.mahout.math.map.OpenIntObjectHashMap.put(OpenIntObjectHashMap.java:384)
    at org.apache.mahout.math.SparseMatrix.setQuick(SparseMatrix.java:103)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.loadFeatureWeight(InMemoryBayesDatastore.java:149)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadWeightMatrix(SequenceFileModelReader.java:64)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadModel(SequenceFileModelReader.java:50)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.initialize(InMemoryBayesDatastore.java:72)
    at org.apache.mahout.classifier.bayes.ClassifierContext.initialize(ClassifierContext.java:44)
    at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesClassifierMapper.configure(BayesClassifierMapper.java:121)
    ... 18 more

13/08/06 17:22:45 INFO mapred.JobClient:  map 49% reduce 16%
13/08/06 17:23:45 INFO mapred.JobClient: Task Id : attempt_201308061621_0005_m_000011_0, Status : FAILED
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 5 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 10 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 13 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.mahout.math.map.OpenIntObjectHashMap.rehash(OpenIntObjectHashMap.java:419)
    at org.apache.mahout.math.map.OpenIntObjectHashMap.put(OpenIntObjectHashMap.java:384)
    at org.apache.mahout.math.SparseMatrix.setQuick(SparseMatrix.java:103)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.loadFeatureWeight(InMemoryBayesDatastore.java:149)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadWeightMatrix(SequenceFileModelReader.java:64)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadModel(SequenceFileModelReader.java:50)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.initialize(InMemoryBayesDatastore.java:72)
    at org.apache.mahout.classifier.bayes.ClassifierContext.initialize(ClassifierContext.java:44)
    at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesClassifierMapper.configure(BayesClassifierMapper.java:121)
    ... 18 more

13/08/06 17:23:49 INFO mapred.JobClient:  map 50% reduce 16%


The errors that follow are all of the same kind.
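Every trace in this part of the log ends in the same innermost cause: `java.lang.OutOfMemoryError: Java heap space`, thrown while `BayesClassifierMapper.configure` loads the model through `InMemoryBayesDatastore`. Failed attempts are retried (for example `m_000011_0` is followed by `m_000011_1`), and the job still reaches `Job complete` later in this log. To confirm that all failures share one cause without reading each trace, the output can be condensed with a small script. The following is only a sketch: the function name `summarize_failures` and the shortened `SAMPLE_LOG` are illustrative and not part of Mahout or Hadoop.

```python
import re
from collections import Counter

# Matches the JobClient line that announces a failed task attempt.
ATTEMPT_RE = re.compile(r"Task Id : (attempt_\w+), Status : FAILED")
# Matches each "Caused by:" line of a Java stack trace.
CAUSE_RE = re.compile(r"^Caused by: (.+)$")


def summarize_failures(log_text):
    """Return (attempt_id, root_cause) pairs, one per failed attempt.

    The root cause is the last "Caused by:" line seen before the next
    failed attempt starts, or None if the trace has no such line.
    """
    failures = []
    attempt, cause = None, None
    for raw in log_text.splitlines():
        line = raw.strip()
        started = ATTEMPT_RE.search(line)
        if started:
            if attempt is not None:
                failures.append((attempt, cause))
            attempt, cause = started.group(1), None
            continue
        caused = CAUSE_RE.match(line)
        if caused and attempt is not None:
            cause = caused.group(1)  # later lines are deeper causes
    if attempt is not None:
        failures.append((attempt, cause))
    return failures


# Shortened excerpt of the log above (stack frames omitted for brevity).
SAMPLE_LOG = """\
13/08/06 17:23:45 INFO mapred.JobClient: Task Id : attempt_201308061621_0005_m_000011_0, Status : FAILED
java.lang.RuntimeException: Error in configuring object
Caused by: java.lang.reflect.InvocationTargetException
Caused by: java.lang.OutOfMemoryError: Java heap space
13/08/06 17:23:49 INFO mapred.JobClient:  map 50% reduce 16%
13/08/06 17:24:42 INFO mapred.JobClient: Task Id : attempt_201308061621_0005_m_000011_1, Status : FAILED
java.lang.RuntimeException: Error in configuring object
Caused by: java.lang.OutOfMemoryError: Java heap space
"""

failures = summarize_failures(SAMPLE_LOG)
for attempt_id, root_cause in failures:
    print(attempt_id, "->", root_cause)
print(Counter(cause for _, cause in failures))
```

If every failed attempt reports the same heap error, one setting that is commonly adjusted in Hadoop 0.20.x is `mapred.child.java.opts` (the JVM options for child tasks). Whether a larger `-Xmx` value is enough for this model is an assumption, not something this log confirms.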
13/08/06 17:23:52 INFO mapred.JobClient:  map 51% reduce 16%
13/08/06 17:24:01 INFO mapred.JobClient:  map 52% reduce 16%
13/08/06 17:24:13 INFO mapred.JobClient:  map 53% reduce 16%
13/08/06 17:24:22 INFO mapred.JobClient:  map 54% reduce 16%
13/08/06 17:24:40 INFO mapred.JobClient:  map 54% reduce 18%
13/08/06 17:24:42 INFO mapred.JobClient: Task Id : attempt_201308061621_0005_m_000011_1, Status : FAILED
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 5 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 10 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 13 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.mahout.math.map.OpenIntObjectHashMap.rehash(OpenIntObjectHashMap.java:419)
    at org.apache.mahout.math.map.OpenIntObjectHashMap.put(OpenIntObjectHashMap.java:384)
    at org.apache.mahout.math.SparseMatrix.setQuick(SparseMatrix.java:103)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.loadFeatureWeight(InMemoryBayesDatastore.java:149)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadWeightMatrix(SequenceFileModelReader.java:64)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadModel(SequenceFileModelReader.java:50)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.initialize(InMemoryBayesDatastore.java:72)
    at org.apache.mahout.classifier.bayes.ClassifierContext.initialize(ClassifierContext.java:44)
    at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesClassifierMapper.configure(BayesClassifierMapper.java:121)
    ... 18 more

13/08/06 17:25:40 INFO mapred.JobClient: Task Id : attempt_201308061621_0005_m_000012_0, Status : FAILED
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 5 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 10 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 13 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.mahout.math.map.OpenIntObjectHashMap.rehash(OpenIntObjectHashMap.java:419)
    at org.apache.mahout.math.map.OpenIntObjectHashMap.put(OpenIntObjectHashMap.java:384)
    at org.apache.mahout.math.SparseMatrix.setQuick(SparseMatrix.java:103)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.loadFeatureWeight(InMemoryBayesDatastore.java:149)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadWeightMatrix(SequenceFileModelReader.java:64)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadModel(SequenceFileModelReader.java:50)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.initialize(InMemoryBayesDatastore.java:72)
    at org.apache.mahout.classifier.bayes.ClassifierContext.initialize(ClassifierContext.java:44)
    at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesClassifierMapper.configure(BayesClassifierMapper.java:121)
    ... 18 more

13/08/06 17:25:44 INFO mapred.JobClient:  map 55% reduce 18%
13/08/06 17:25:47 INFO mapred.JobClient:  map 56% reduce 18%
13/08/06 17:25:53 INFO mapred.JobClient:  map 57% reduce 18%
13/08/06 17:26:02 INFO mapred.JobClient:  map 58% reduce 18%
13/08/06 17:26:10 INFO mapred.JobClient:  map 59% reduce 18%
13/08/06 17:26:25 INFO mapred.JobClient:  map 59% reduce 20%
13/08/06 17:26:52 INFO mapred.JobClient:  map 60% reduce 20%
13/08/06 17:26:58 INFO mapred.JobClient:  map 61% reduce 20%
13/08/06 17:27:01 INFO mapred.JobClient:  map 62% reduce 20%
13/08/06 17:27:04 INFO mapred.JobClient:  map 63% reduce 20%
13/08/06 17:27:07 INFO mapred.JobClient:  map 65% reduce 20%
13/08/06 17:27:10 INFO mapred.JobClient:  map 66% reduce 20%
13/08/06 17:27:13 INFO mapred.JobClient:  map 68% reduce 20%
13/08/06 17:27:16 INFO mapred.JobClient:  map 69% reduce 20%
13/08/06 17:27:25 INFO mapred.JobClient:  map 69% reduce 21%
13/08/06 17:27:31 INFO mapred.JobClient:  map 69% reduce 23%
13/08/06 17:28:27 INFO mapred.JobClient: Task Id : attempt_201308061621_0005_m_000014_0, Status : FAILED
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 5 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 10 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 13 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.mahout.math.map.OpenIntObjectHashMap.rehash(OpenIntObjectHashMap.java:419)
    at org.apache.mahout.math.map.OpenIntObjectHashMap.put(OpenIntObjectHashMap.java:384)
    at org.apache.mahout.math.SparseMatrix.setQuick(SparseMatrix.java:103)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.loadFeatureWeight(InMemoryBayesDatastore.java:149)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadWeightMatrix(SequenceFileModelReader.java:64)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadModel(SequenceFileModelReader.java:50)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.initialize(InMemoryBayesDatastore.java:72)
    at org.apache.mahout.classifier.bayes.ClassifierContext.initialize(ClassifierContext.java:44)
    at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesClassifierMapper.configure(BayesClassifierMapper.java:121)
    ... 18 more

13/08/06 17:28:29 INFO mapred.JobClient:  map 70% reduce 23%
13/08/06 17:28:32 INFO mapred.JobClient:  map 71% reduce 23%
13/08/06 17:28:35 INFO mapred.JobClient:  map 72% reduce 23%
13/08/06 17:28:38 INFO mapred.JobClient:  map 73% reduce 23%
13/08/06 17:28:50 INFO mapred.JobClient:  map 74% reduce 23%
13/08/06 17:29:02 INFO mapred.JobClient:  map 74% reduce 25%
13/08/06 17:29:28 INFO mapred.JobClient: Task Id : attempt_201308061621_0005_m_000014_1, Status : FAILED
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 5 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 10 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 13 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.mahout.math.map.OpenIntObjectHashMap.rehash(OpenIntObjectHashMap.java:420)
    at org.apache.mahout.math.map.OpenIntObjectHashMap.put(OpenIntObjectHashMap.java:384)
    at org.apache.mahout.math.SparseMatrix.setQuick(SparseMatrix.java:103)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.loadFeatureWeight(InMemoryBayesDatastore.java:149)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadWeightMatrix(SequenceFileModelReader.java:64)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadModel(SequenceFileModelReader.java:50)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.initialize(InMemoryBayesDatastore.java:72)
    at org.apache.mahout.classifier.bayes.ClassifierContext.initialize(ClassifierContext.java:44)
    at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesClassifierMapper.configure(BayesClassifierMapper.java:121)
    ... 18 more

13/08/06 17:29:58 INFO mapred.JobClient: Task Id : attempt_201308061621_0005_m_000016_0, Status : FAILED
java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:354)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 5 more
Caused by: java.lang.RuntimeException: Error in configuring object
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
    at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
    ... 10 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
    ... 13 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.mahout.math.map.OpenIntObjectHashMap.rehash(OpenIntObjectHashMap.java:419)
    at org.apache.mahout.math.map.OpenIntObjectHashMap.put(OpenIntObjectHashMap.java:384)
    at org.apache.mahout.math.SparseMatrix.setQuick(SparseMatrix.java:103)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.loadFeatureWeight(InMemoryBayesDatastore.java:149)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadWeightMatrix(SequenceFileModelReader.java:64)
    at org.apache.mahout.classifier.bayes.SequenceFileModelReader.loadModel(SequenceFileModelReader.java:50)
    at org.apache.mahout.classifier.bayes.InMemoryBayesDatastore.initialize(InMemoryBayesDatastore.java:72)
    at org.apache.mahout.classifier.bayes.ClassifierContext.initialize(ClassifierContext.java:44)
    at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesClassifierMapper.configure(BayesClassifierMapper.java:121)
    ... 18 more

13/08/06 17:30:38 INFO mapred.JobClient:  map 75% reduce 25%
13/08/06 17:30:44 INFO mapred.JobClient:  map 76% reduce 25%
13/08/06 17:30:50 INFO mapred.JobClient:  map 77% reduce 25%
13/08/06 17:30:59 INFO mapred.JobClient:  map 78% reduce 25%
13/08/06 17:31:05 INFO mapred.JobClient:  map 79% reduce 25%
13/08/06 17:31:14 INFO mapred.JobClient:  map 80% reduce 25%
13/08/06 17:31:17 INFO mapred.JobClient:  map 80% reduce 26%
13/08/06 17:31:20 INFO mapred.JobClient:  map 81% reduce 26%
13/08/06 17:31:33 INFO mapred.JobClient:  map 82% reduce 26%
13/08/06 17:31:45 INFO mapred.JobClient:  map 83% reduce 26%
13/08/06 17:31:51 INFO mapred.JobClient:  map 84% reduce 26%
13/08/06 17:32:12 INFO mapred.JobClient:  map 84% reduce 28%
13/08/06 17:32:18 INFO mapred.JobClient:  map 85% reduce 28%
13/08/06 17:32:30 INFO mapred.JobClient:  map 86% reduce 28%
13/08/06 17:32:36 INFO mapred.JobClient:  map 87% reduce 28%
13/08/06 17:32:42 INFO mapred.JobClient:  map 88% reduce 28%
13/08/06 17:32:51 INFO mapred.JobClient:  map 89% reduce 28%
13/08/06 17:33:09 INFO mapred.JobClient:  map 90% reduce 28%
13/08/06 17:33:12 INFO mapred.JobClient:  map 90% reduce 30%
13/08/06 17:33:15 INFO mapred.JobClient:  map 91% reduce 30%
13/08/06 17:33:24 INFO mapred.JobClient:  map 92% reduce 30%
13/08/06 17:33:36 INFO mapred.JobClient:  map 93% reduce 30%
13/08/06 17:33:45 INFO mapred.JobClient:  map 94% reduce 30%
13/08/06 17:34:00 INFO mapred.JobClient:  map 97% reduce 30%
13/08/06 17:34:03 INFO mapred.JobClient:  map 98% reduce 31%
13/08/06 17:34:06 INFO mapred.JobClient:  map 100% reduce 31%
13/08/06 17:34:15 INFO mapred.JobClient:  map 100% reduce 100%
13/08/06 17:34:23 INFO mapred.JobClient: Job complete: job_201308061621_0005
13/08/06 17:34:23 INFO mapred.JobClient: Counters: 18
13/08/06 17:34:23 INFO mapred.JobClient:   Job Counters
13/08/06 17:34:23 INFO mapred.JobClient:     Launched reduce tasks=1
13/08/06 17:34:23 INFO mapred.JobClient:     Launched map tasks=30
13/08/06 17:34:23 INFO mapred.JobClient:     Data-local map tasks=30
13/08/06 17:34:23 INFO mapred.JobClient:   FileSystemCounters
13/08/06 17:34:23 INFO mapred.JobClient:     FILE_BYTES_READ=11847
13/08/06 17:34:23 INFO mapred.JobClient:     HDFS_BYTES_READ=421846029
13/08/06 17:34:23 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=24448
13/08/06 17:34:23 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=13440
13/08/06 17:34:23 INFO mapred.JobClient:   Map-Reduce Framework
13/08/06 17:34:23 INFO mapred.JobClient:     Reduce input groups=230
13/08/06 17:34:23 INFO mapred.JobClient:     Combine output records=230
13/08/06 17:34:23 INFO mapred.JobClient:     Map input records=7532
13/08/06 17:34:23 INFO mapred.JobClient:     Reduce shuffle bytes=11961
13/08/06 17:34:23 INFO mapred.JobClient:     Reduce output records=230
13/08/06 17:34:23 INFO mapred.JobClient:     Spilled Records=460
13/08/06 17:34:23 INFO mapred.JobClient:     Map output bytes=377368
13/08/06 17:34:23 INFO mapred.JobClient:     Map input bytes=10399829
13/08/06 17:34:23 INFO mapred.JobClient:     Combine input records=7532
13/08/06 17:34:23 INFO mapred.JobClient:     Map output records=7532
13/08/06 17:34:23 INFO mapred.JobClient:     Reduce input records=230
13/08/06 17:34:24 INFO bayes.BayesClassifierDriver: =======================================================
Confusion Matrix
-------------------------------------------------------
a        b        c        d        e        f        g        h        i        j        k        l        m        n        o        p        q        r        s        t        <--Classified as
381      0        0        0        0        9        1        0        0        0        1        0        0        2        0        1        0        0        3        0         |  398       a     = rec.motorcycles
1        284      0        0        0        0        1        0        6        3        11       0        66       3        0        1        6        0        4        9         |  395       b     = comp.windows.x
2        0        339      2        0        3        5        1        0        0        0        0        1        1        12       1        7        0        2        0         |  376       c     = talk.politics.mideast
4        0        1        327      0        2        2        0        0        2        1        1        0        5        1        4        12       0        2        0         |  364       d     = talk.politics.guns
7        0        4        32       27       7        7        2        0        12       0        0        6        0        100      9        7        31       0        0         |  251       e     = talk.religion.misc
10       0        0        0        0        359      2        2        0        1        3        0        1        6        0        1        0        0        11       0         |  396       f     = rec.autos
0        0        0        0        0        1        383      9        1        0        0        0        0        0        0        0        0        0        3        0         |  397       g     = rec.sport.baseball
1        0        0        0        0        0        9        382      0        0        0        0        1        1        1        0        2        0        2        0         |  399       h     = rec.sport.hockey
2        0        0        0        0        4        3        0        330      4        4        0        5        12       0        0        2        0        12       7         |  385       i     = comp.sys.mac.hardware
0        3        0        0        0        0        1        0        0        368      0        0        10       4        1        3        2        0        2        0         |  394       j     = sci.space
0        0        0        0        0        3        1        0        27       2        291      0        11       25       0        0        1        0        13       18        |  392       k     = comp.sys.ibm.pc.hardware
8        0        1        109      0        6        11       4        1        18       0        98       1        3        11       10       27       1        1        0         |  310       l     = talk.politics.misc
0        11       0        0        0        3        6        0        10       6        11       0        299      13       0        2        13       0        7        8         |  389       m     = comp.graphics
6        0        1        0        0        4        2        0        5        2        12       0        8        321      0        4        14       0        8        6         |  393       n     = sci.electronics
2        0        0        0        0        0        4        1        0        3        1        0        3        1        372      6        0        2        1        2         |  398       o     = soc.religion.christian
4        0        0        1        0        2        3        3        0        4        2        0        7        12       6        342      1        0        9        0         |  396       p     = sci.med
0        1        0        1        0        1        4        0        3        0        1        0        8        4        0        2        369      0        1        1         |  396       q     = sci.crypt
10       0        4        10       1        5        6        2        2        6        2        0        2        1        86       15       14       152      0        1         |  319       r     = alt.atheism
4        0        0        0        0        9        1        1        8        1        12       0        3        6        0        2        0        0        341      2         |  390       s     = misc.forsale
8        5        0        0        0        1        6        0        8        5        50       0        40       2        1        0        9        0        3        256       |  394       t     = comp.os.ms-windows.misc


13/08/06 17:34:24 INFO driver.MahoutDriver: Program took 1276870 ms (Minutes: 21.281166666666667)
administrator@Master:~/hadoop-0.20.2/mahout-distribution-0.6$
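
The confusion matrix above can be reduced to a single accuracy figure: the diagonal holds the correctly classified documents, and the sum of all cells is the number of test documents. The sketch below does this with the rows copied from the matrix printed above. The helper names `parse_matrix` and `accuracy` are illustrative and not part of Mahout.

```python
# Rows a..t of the confusion matrix above (counts only, row totals dropped).
ROWS = """
381 0 0 0 0 9 1 0 0 0 1 0 0 2 0 1 0 0 3 0
1 284 0 0 0 0 1 0 6 3 11 0 66 3 0 1 6 0 4 9
2 0 339 2 0 3 5 1 0 0 0 0 1 1 12 1 7 0 2 0
4 0 1 327 0 2 2 0 0 2 1 1 0 5 1 4 12 0 2 0
7 0 4 32 27 7 7 2 0 12 0 0 6 0 100 9 7 31 0 0
10 0 0 0 0 359 2 2 0 1 3 0 1 6 0 1 0 0 11 0
0 0 0 0 0 1 383 9 1 0 0 0 0 0 0 0 0 0 3 0
1 0 0 0 0 0 9 382 0 0 0 0 1 1 1 0 2 0 2 0
2 0 0 0 0 4 3 0 330 4 4 0 5 12 0 0 2 0 12 7
0 3 0 0 0 0 1 0 0 368 0 0 10 4 1 3 2 0 2 0
0 0 0 0 0 3 1 0 27 2 291 0 11 25 0 0 1 0 13 18
8 0 1 109 0 6 11 4 1 18 0 98 1 3 11 10 27 1 1 0
0 11 0 0 0 3 6 0 10 6 11 0 299 13 0 2 13 0 7 8
6 0 1 0 0 4 2 0 5 2 12 0 8 321 0 4 14 0 8 6
2 0 0 0 0 0 4 1 0 3 1 0 3 1 372 6 0 2 1 2
4 0 0 1 0 2 3 3 0 4 2 0 7 12 6 342 1 0 9 0
0 1 0 1 0 1 4 0 3 0 1 0 8 4 0 2 369 0 1 1
10 0 4 10 1 5 6 2 2 6 2 0 2 1 86 15 14 152 0 1
4 0 0 0 0 9 1 1 8 1 12 0 3 6 0 2 0 0 341 2
8 5 0 0 0 1 6 0 8 5 50 0 40 2 1 0 9 0 3 256
"""


def parse_matrix(text):
    """Turn whitespace-separated rows of counts into a list of int lists."""
    return [[int(v) for v in line.split()] for line in text.strip().splitlines()]


def accuracy(matrix):
    """Return (correct, total, correct / total) for a square confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct, total, correct / total


matrix = parse_matrix(ROWS)
correct, total, acc = accuracy(matrix)
print(f"{correct} / {total} = {acc:.4f}")  # 6021 / 7532 = 0.7994
```

The total of 7532 agrees with the `Map input records=7532` counter in the job output, so every test document appears in the matrix even though some map attempts failed and were retried.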


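Although the run reported an error, the result matches the one shown on the official Apache wiki page that describes the steps for running Mahout Bayes: https://cwiki.apache.org/confluence/display/MAHOUT/Twenty+Newsgroups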
=======================================================
Confusion Matrix
-------------------------------------------------------
a   b   c   d   e   f   g   h   i   j   k   l   m   n   o   p   q   r   s   t   u   <--Classified as
381 0   0   0   0   9   1   0   0   0   1   0   0   2   0   1   0   0   3   0   0    |  398  a     = rec.motorcycles
1   284 0   0   0   0   1   0   6   3   11  0   66  3   0   1   6   0   4   9   0    |  395  b     = comp.windows.x
2   0   339 2   0   3   5   1   0   0   0   0   1   1   12  1   7   0   2   0   0    |  376  c     = talk.politics.mideast
4   0   1   327 0   2   2   0   0   2   1   1   0   5   1   4   12  0   2   0   0    |  364  d     = talk.politics.guns
7   0   4   32  27  7   7   2   0   12  0   0   6   0   100 9   7   31  0   0   0    |  251  e     = talk.religion.misc
10  0   0   0   0   359 2   2   0   1   3   0   1   6   0   1   0   0   11  0   0    |  396  f     = rec.autos
0   0   0   0   0   1   383 9   1   0   0   0   0   0   0   0   0   0   3   0   0    |  397  g     = rec.sport.baseball
1   0   0   0   0   0   9   382 0   0   0   0   1   1   1   0   2   0   2   0   0    |  399  h     = rec.sport.hockey
2   0   0   0   0   4   3   0   330 4   4   0   5   12  0   0   2   0   12  7   0    |  385  i     = comp.sys.mac.hardware
0   3   0   0   0   0   1   0   0   368 0   0   10  4   1   3   2   0   2   0   0    |  394  j     = sci.space
0   0   0   0   0   3   1   0   27  2   291 0   11  25  0   0   1   0   13  18  0    |  392  k     = comp.sys.ibm.pc.hardware
8   0   1   109 0   6   11  4   1   18  0   98  1   3   11  10  27  1   1   0   0    |  310  l     = talk.politics.misc
0   11  0   0   0   3   6   0   10  6   11  0   299 13  0   2   13  0   7   8   0    |  389  m     = comp.graphics
6   0   1   0   0   4   2   0   5   2   12  0   8   321 0   4   14  0   8   6   0    |  393  n     = sci.electronics
2   0   0   0   0   0   4   1   0   3   1   0   3   1   372 6   0   2   1   2   0    |  398  o     = soc.religion.christian
4   0   0   1   0   2   3   3   0   4   2   0   7   12  6   342 1   0   9   0   0    |  396  p     = sci.med
0   1   0   1   0   1   4   0   3   0   1   0   8   4   0   2   369 0   1   1   0    |  396  q     = sci.crypt
10  0   4   10  1   5   6   2   2   6   2   0   2   1   86  15  14  152 0   1   0    |  319  r     = alt.atheism
4   0   0   0   0   9   1   1   8   1   12  0   3   6   0   2   0   0   341 2   0    |  390  s     = misc.forsale
8   5   0   0   0   1   6   0   8   5   50  0   40  2   1   0   9   0   3   256 0    |  394  t     = comp.os.ms-windows.misc
0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0    |  0    u     = unknown
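To compare two confusion matrices row by row, it helps to reduce each row to a single per-class accuracy (the diagonal count divided by the row total). The following is only a rough sketch, not part of Mahout: the function name per_class_accuracy is made up for illustration, and it assumes the rows are fed in the same order as the column letters, so that row N's correct count sits in column N. The three sample rows are the first three rows of the matrix above.

```shell
#!/bin/sh
# Hypothetical helper: per-class accuracy from confusion-matrix rows on stdin.
# Assumes row N is the class of column N (rows given in a, b, c, ... order).
per_class_accuracy() {
  awk -F'|' '
    NF == 2 {
      split($1, counts, " ")   # counts[i] = documents classified as column i
      split($2, tail, " ")     # tail[1] = row total, tail[4] = class label
      row++
      if (tail[1] > 0)
        printf "%s %.2f%%\n", tail[4], 100 * counts[row] / tail[1]
    }'
}

per_class_accuracy <<'EOF'
381 0   0   0   0   9   1   0   0   0   1   0   0   2   0   1   0   0   3   0   0    |  398  a     = rec.motorcycles
1   284 0   0   0   0   1   0   6   3   11  0   66  3   0   1   6   0   4   9   0    |  395  b     = comp.windows.x
2   0   339 2   0   3   5   1   0   0   0   0   1   1   12  1   7   0   2   0   0    |  376  c     = talk.politics.mideast
EOF
```

Running the same function over both matrices and diffing the output is a quick way to confirm that the local run and the wiki result agree.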

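About the error: see http://blog.163.com/jiayouweijiewj@126/blog/static/171232177201011411534050/, which resolves it thoroughly.
Cause: java.lang.OutOfMemoryError: Java heap space (the task JVM ran out of heap memory).
Fix: in hadoop-0.20.2/conf/mapred-site.xml, set Hadoop's mapred.child.java.opts to about half of your machine's memory (on Linux, check the memory with free -m). My machine has 1 GB, so I set it to 512 MB:

<name>mapred.child.java.opts</name>
<value>-Xmx512M</value>

Before I set this, the logs showed that my default was 200 MB. The other option is to install more memory.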
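The "half of the machine's memory" rule can be sketched as a small script. This is only an illustration under my own assumptions: heap_property is a made-up helper name, the surrounding property element is the usual wrapper for a Hadoop configuration entry, and the output still has to be pasted by hand inside the configuration element of mapred-site.xml. Note that free -m usually reports slightly less than the nominal RAM, so a 1 GB machine will yield a value a little under 512.

```shell
#!/bin/sh
# Hypothetical helper: print a mapred.child.java.opts property whose heap
# size is half of the given total memory (in MB).
heap_property() {
  total_mb=${1:-}
  case $total_mb in
    ''|*[!0-9]*) echo "usage: heap_property TOTAL_MB" >&2; return 1 ;;
  esac
  half_mb=$((total_mb / 2))
  printf '<property>\n  <name>mapred.child.java.opts</name>\n  <value>-Xmx%sM</value>\n</property>\n' "$half_mb"
}

# Use the "Mem:" total reported by free -m when the command is available.
if command -v free >/dev/null 2>&1; then
  heap_property "$(free -m | awk '/^Mem:/ {print $2}')"
fi
```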

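Once this was configured, I first deleted the earlier test output from HDFS, then restarted Hadoop and ran the job again.

The result was the same as before.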
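The cleanup and restart can be sketched roughly as below. The HDFS output path is a placeholder (the actual path depends on the -o or output option used in the earlier test run), and the script defaults to a dry run that only prints the commands; run and rerun_prep are helper names invented for this sketch. hadoop fs -rmr, stop-all.sh and start-all.sh are the Hadoop 0.20.x commands.

```shell
#!/bin/sh
# Placeholder: HDFS directory holding the earlier test output; replace it.
OUTPUT_DIR=${OUTPUT_DIR:-/path/to/previous/test-output}
HADOOP_HOME=${HADOOP_HOME:-/home/administrator/hadoop-0.20.2}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

rerun_prep() {
  run hadoop fs -rmr "$OUTPUT_DIR"       # remove the stale results from HDFS
  run "$HADOOP_HOME/bin/stop-all.sh"     # restart so mapred-site.xml is re-read
  run "$HADOOP_HOME/bin/start-all.sh"
}

rerun_prep
```

Setting DRY_RUN=0 executes the commands instead of printing them; after that, the same mahout test command can be run again.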
