Mahout: Random Forest Algorithm Example

Source: Internet  Editor: 程序博客网  Time: 2024/05/16 07:25

Step 1:
Prepare the sample set. The full data set is large, so only part of it is shown here.

1,1.52101,13.64,4.49,1.10,71.78,0.06,8.75,0.00,0.00,1
2,1.51761,13.89,3.60,1.36,72.73,0.48,7.83,0.00,0.00,1
3,1.51618,13.53,3.55,1.54,72.99,0.39,7.78,0.00,0.00,1
4,1.51766,13.21,3.69,1.29,72.61,0.57,8.22,0.00,0.00,1
5,1.51742,13.27,3.62,1.24,73.08,0.55,8.07,0.00,0.00,1
6,1.51596,12.79,3.61,1.62,72.97,0.64,8.07,0.00,0.26,1
7,1.51743,13.30,3.60,1.14,73.09,0.58,8.17,0.00,0.00,1
8,1.51756,13.15,3.61,1.05,73.24,0.57,8.24,0.00,0.00,1
9,1.51918,14.04,3.58,1.37,72.08,0.56,8.30,0.00,0.00,1
10,1.51755,13.00,3.60,1.36,72.99,0.57,8.40,0.00,0.11,1
11,1.51571,12.72,3.46,1.56,73.20,0.67,8.09,0.00,0.24,1
12,1.51763,12.80,3.66,1.27,73.01,0.60,8.56,0.00,0.00,1
13,1.51589,12.88,3.43,1.40,73.28,0.69,8.05,0.00,0.24,1
14,1.51748,12.86,3.56,1.27,73.21,0.54,8.38,0.00,0.17,1
15,1.51763,12.61,3.59,1.31,73.29,0.58,8.50,0.00,0.00,1
16,1.51761,12.81,3.54,1.23,73.24,0.58,8.39,0.00,0.00,1
17,1.51784,12.68,3.67,1.16,73.11,0.61,8.70,0.00,0.00,1
18,1.52196,14.36,3.85,0.89,71.36,0.15,9.15,0.00,0.00,1
19,1.51911,13.90,3.73,1.18,72.12,0.06,8.89,0.00,0.00,1
20,1.51735,13.02,3.54,1.69,72.73,0.54,8.44,0.00,0.07,1
21,1.51750,12.82,3.55,1.49,72.75,0.54,8.52,0.00,0.19,1
22,1.51966,14.77,3.75,0.29,72.02,0.03,9.00,0.00,0.00,1
23,1.51736,12.78,3.62,1.29,72.79,0.59,8.70,0.00,0.00,1
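
Each record in the sample set should have 11 comma-separated fields: a row id, nine measurements, and a class label (this matches the "I 9 N L" descriptor used in Step 7). Before uploading the file it can be worth checking that no row was damaged while copying. The helper below is only a sketch; the function name `count_bad_rows` is made up for this example.

```shell
# Count the lines of a CSV file that do not have the expected number of fields.
# Prints the number of malformed lines; 0 means every row is well formed.
# Blank lines have zero fields, so they are counted as malformed too.
count_bad_rows() {
    file=$1
    expected=$2
    awk -F',' -v n="$expected" 'NF != n { bad++ } END { print bad + 0 }' "$file"
}
```

For the file created in Step 2, `count_bad_rows /opt/apps/mahout/apache-mahout-distribution-0.10.2/test/glass.dat 11` should print 0 if every row is intact.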

Step 2:
On node11, create the sample file with the following command and save the sample data from Step 1 into it.

vi /opt/apps/mahout/apache-mahout-distribution-0.10.2/test/glass.dat

Step 3:
Run the following commands on each of the three nodes to start ZooKeeper and check its status.

zkServer.sh start
zkServer.sh status
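
Once ZooKeeper is running on all three nodes, `zkServer.sh status` normally reports one node as `Mode: leader` and the others as `Mode: follower`. As a rough sketch, the combined status output can be checked with a small filter. The function name `quorum_ok` is made up for this example, and the check assumes the status lines begin with `Mode:`.

```shell
# Read the combined output of `zkServer.sh status` from all nodes on stdin.
# Succeeds (exit status 0) only if exactly one leader and at least one follower are reported.
quorum_ok() {
    awk '/^Mode: leader/   { leaders++ }
         /^Mode: follower/ { followers++ }
         END { exit !(leaders == 1 && followers >= 1) }'
}
```

One way to use it is to save the status output of each node into a file and run `cat status-*.txt | quorum_ok && echo "quorum looks healthy"`, where the `status-*.txt` file names are hypothetical.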

Step 4:
On node11, run the following command to start HDFS and YARN.

start-all.sh

On node12, run the following command to start the ResourceManager.

yarn-daemon.sh start resourcemanager

Step 5:
Open a browser and enter the following URLs to view HDFS:
192.168.80.11:50070
192.168.80.12:50070

Open a browser and enter the following URLs to view YARN:
192.168.80.11:8088
192.168.80.12:8088

Step 6:
On node11, run the following commands to create a directory in HDFS, upload the sample set to it, and list its contents.

hadoop fs -mkdir randomforest
hadoop fs -put /opt/apps/mahout/apache-mahout-distribution-0.10.2/test/glass.dat randomforest
hadoop fs -ls randomforest
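
Because `randomforest` is a relative path, HDFS resolves it under the current user's home directory (`/user/<username>/randomforest`). Running the upload a second time fails if the directory or the file already exists. The sketch below wraps the same three operations so they can be repeated safely, using `-mkdir -p` and `-put -f`; the function name `upload_sample` is made up for this example.

```shell
# Upload a local file into an HDFS directory so that re-running the step does not fail.
# -mkdir -p : no error if the directory already exists
# -put -f   : overwrite the file if it is already in HDFS
upload_sample() {
    local_file=$1
    hdfs_dir=$2
    hadoop fs -mkdir -p "$hdfs_dir" &&
    hadoop fs -put -f "$local_file" "$hdfs_dir" &&
    hadoop fs -ls "$hdfs_dir"
}
```

With the paths from this walkthrough, the call would be `upload_sample /opt/apps/mahout/apache-mahout-distribution-0.10.2/test/glass.dat randomforest`.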

Step 7:
On node11, run Mahout to generate the descriptor file for the data set.

mahout org.apache.mahout.classifier.df.tools.Describe -p randomforest/glass.dat -f randomforest/glass.info -d I 9 N L
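
The `-d I 9 N L` argument describes the columns of each record. As far as I understand the Describe tool, `I` marks an ignored column (here the row id), `N` a numerical attribute, `L` the label, and a number repeats the token that follows it, so `9 N` stands for nine numerical columns. The sketch below expands such a descriptor into one token per column, which makes it easy to compare against the 11 fields of a data row. The function name `expand_descriptor` is made up for this example and is not part of Mahout.

```shell
# Expand a Describe-style descriptor such as "I 9 N L" into one token per column.
# A pure number is treated as a repeat count for the token that follows it.
expand_descriptor() {
    count=1
    out=""
    for tok in "$@"; do
        case $tok in
            ''|*[!0-9]*)
                # Not a pure number: an attribute type, emitted "count" times.
                i=0
                while [ "$i" -lt "$count" ]; do
                    out="${out}${out:+ }${tok}"
                    i=$((i + 1))
                done
                count=1
                ;;
            *)
                # A pure number: remember it as the repeat count.
                count=$tok
                ;;
        esac
    done
    printf '%s\n' "$out"
}
```

Calling `expand_descriptor I 9 N L` prints `I N N N N N N N N N L`, which is 11 tokens, one per field in each row of glass.dat.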

Step 8:
On node11, use Mahout to train the random forest on the data.

mahout org.apache.mahout.classifier.df.mapreduce.BuildForest -d randomforest/glass.dat -ds randomforest/glass.info -sl 3 -t 5 -o randomforest/forest_result
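
In this command `-d` is the data file, `-ds` the descriptor from Step 7, `-sl` the number of attributes picked at random at each tree node, `-t` the number of trees, and `-o` the output directory. The original does not say why `-sl 3` was chosen, but it happens to equal the square root of the nine numerical attributes, which is a common rule of thumb for classification forests. The sketch below derives that value and prints the resulting command without running it. Both function names are made up for this example.

```shell
# Largest integer m with m*m <= n, i.e. floor(sqrt(n)), using integer arithmetic only.
isqrt() {
    n=$1
    m=0
    while [ $(( (m + 1) * (m + 1) )) -le "$n" ]; do
        m=$((m + 1))
    done
    printf '%s\n' "$m"
}

# Print (do not run) a BuildForest command for a given number of
# predictor attributes and trees, using the paths from this walkthrough.
build_forest_cmd() {
    attrs=$1
    trees=$2
    printf 'mahout org.apache.mahout.classifier.df.mapreduce.BuildForest -d randomforest/glass.dat -ds randomforest/glass.info -sl %s -t %s -o randomforest/forest_result\n' \
        "$(isqrt "$attrs")" "$trees"
}
```

`build_forest_cmd 9 5` reproduces the command used in this step; a larger second argument would request more trees.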

Step 9:
On node11, use Mahout to test the trained forest on the data.

mahout org.apache.mahout.classifier.df.mapreduce.TestForest -i randomforest/glass.dat -ds randomforest/glass.info -m randomforest/forest_result -a -o predictions
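
Note that this run tests the forest on the same file it was trained on, so the accuracy reported by `-a` is likely to be optimistic. One possible variation, not part of the original walkthrough, is to hold out part of the rows before Step 6. The sketch below sends every k-th row to a test file and the rest to a training file; the function name `split_holdout` and the output file names are made up for this example.

```shell
# Split a CSV file: every k-th row goes to the test file, the rest to the training file.
# Both output files are truncated first so that the function can be re-run.
split_holdout() {
    src=$1
    k=$2
    train_out=$3
    test_out=$4
    : > "$train_out"
    : > "$test_out"
    awk -v k="$k" -v trainf="$train_out" -v testf="$test_out" \
        'NR % k == 0 { print > testf; next } { print > trainf }' "$src"
}
```

For example, `split_holdout glass.dat 5 glass_train.dat glass_test.dat` keeps roughly 80% of the rows for training. The training file would then be passed to `-d` in Step 8 and the held-out file to `-i` in Step 9. A random shuffle before splitting may be preferable if the rows are ordered by class.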
