Learning the Spark MLlib Component, Part 7: Predicting Data with a Model Trained by ALS on Implicit Feedback
More code is available at: https://github.com/xubo245/SparkLearning
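1. Explanation
This example uses a model trained with ALS on implicit feedback (ALS.trainImplicit) to make recommendations. The results look questionable.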
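For context, the Spark collaborative filtering guide (reference 【2】) describes the implicit-feedback variant as modelling preference and confidence rather than the ratings themselves. A sketch of that objective in the usual notation is given here. The symbols r_ui (observed value), p_ui (preference), c_ui (confidence), x_u and y_i (user and product factors) are notation assumed for this sketch, and Spark's exact regularization scaling may differ.

```latex
p_{ui} =
\begin{cases}
  1 & \text{if } r_{ui} > 0 \\
  0 & \text{otherwise}
\end{cases}
\qquad
c_{ui} = 1 + \alpha \, r_{ui}

\min_{x,\, y} \; \sum_{u,i} c_{ui} \bigl( p_{ui} - x_u^{\top} y_i \bigr)^2
  + \lambda \Bigl( \sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2 \Bigr)
```

Under this reading, a prediction from the model estimates the preference p_ui, which lies roughly between 0 and 1, so it is not on the same scale as a raw rating.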
2. Code:
package apache.spark.mllib.learning.recommend

import java.text.SimpleDateFormat
import java.util.Date

import org.apache.spark.mllib.recommendation.{ALS, MatrixFactorizationModel, Rating}
import org.apache.spark.{SparkConf, SparkContext}

/**
 * Created by xubo on 2016/5/16.
 */
object ALSImplicitFromSpark {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local").setAppName(this.getClass().getSimpleName().filter(!_.equals('$')))
    // println(this.getClass().getSimpleName().filter(!_.equals('$')))
    // Set up the environment
    val sc = new SparkContext(conf)

    // Load and parse the data
    // val data = sc.textFile("data/mllib/als/test.data")
    val data = sc.textFile("file/data/mllib/input/test.data")
    val ratings = data.map(_.split(',') match {
      case Array(user, item, rate) => Rating(user.toInt, item.toInt, rate.toDouble)
    })

    // Build the recommendation model using ALS
    val rank = 10
    val numIterations = 10
    val alpha = 0.01
    val lambda = 0.01
    val model = ALS.trainImplicit(ratings, rank, numIterations, lambda, alpha)
    // val model = ALS.train(ratings, rank, numIterations, 0.01)

    // Evaluate the model on rating data
    val usersProducts = ratings.map { case Rating(user, product, rate) => (user, product) }
    val predictions = model.predict(usersProducts).map {
      case Rating(user, product, rate) => ((user, product), rate)
    }
    val ratesAndPreds = ratings.map {
      case Rating(user, product, rate) => ((user, product), rate)
    }.join(predictions)
    val MSE = ratesAndPreds.map { case ((user, product), (r1, r2)) =>
      val err = (r1 - r2)
      err * err
    }.mean()
    println("Mean Squared Error = " + MSE)

    // Save and load model
    val iString = new SimpleDateFormat("yyyyMMddHHmmssSSS").format(new Date())
    model.save(sc, "myModelPath" + iString)
    val sameModel = MatrixFactorizationModel.load(sc, "myModelPath" + iString)

    /**
     * recommend
     */
    val rs = sameModel.recommendProducts(2, 1)
    rs.foreach(println)
  }
}
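If the goal is to check how well the implicit model fits, one option is to compare its predictions with binarized preferences (1 when a rating is positive, 0 otherwise) instead of the raw ratings. The helper below is a hypothetical sketch and not part of the original post: the names ImplicitEvalSketch and preferenceMSE are made up for illustration, and it assumes the same Spark 1.5.x MLlib API used in the code above.

```scala
import org.apache.spark.mllib.recommendation.{MatrixFactorizationModel, Rating}
import org.apache.spark.rdd.RDD

// Hypothetical helper, not from the original post.
object ImplicitEvalSketch {

  // Mean squared error between predicted scores and binarized preferences.
  def preferenceMSE(model: MatrixFactorizationModel, ratings: RDD[Rating]): Double = {
    // (user, product) pairs to score
    val usersProducts = ratings.map(r => (r.user, r.product))

    // Predicted score for each pair, keyed by (user, product)
    val predictions = model.predict(usersProducts)
      .map(r => ((r.user, r.product), r.rating))

    // Assumption: any positive rating counts as preference 1.0, anything else as 0.0
    val preferences = ratings
      .map(r => ((r.user, r.product), if (r.rating > 0) 1.0 else 0.0))

    preferences.join(predictions).map { case (_, (pref, pred)) =>
      val err = pref - pred
      err * err
    }.mean()
  }
}
```

It could be called as ImplicitEvalSketch.preferenceMSE(model, ratings) right after training. Whether binarizing at zero is the right threshold depends on the data, so treat this as one possible check rather than the standard metric.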
3. Results:
D:\1win7\java\jdk\bin\java -Didea.launcher.port=7533 "-Didea.launcher.bin.path=D:\1win7\idea\IntelliJ IDEA Community Edition 15.0.4\bin" -Dfile.encoding=UTF-8 -classpath "D:\all\idea\SparkLearning\target\classes;D:\1win7\java\jdk\jre\lib\charsets.jar;D:\1win7\java\jdk\jre\lib\deploy.jar;D:\1win7\java\jdk\jre\lib\ext\access-bridge-64.jar;D:\1win7\java\jdk\jre\lib\ext\dnsns.jar;D:\1win7\java\jdk\jre\lib\ext\jaccess.jar;D:\1win7\java\jdk\jre\lib\ext\localedata.jar;D:\1win7\java\jdk\jre\lib\ext\sunec.jar;D:\1win7\java\jdk\jre\lib\ext\sunjce_provider.jar;D:\1win7\java\jdk\jre\lib\ext\sunmscapi.jar;D:\1win7\java\jdk\jre\lib\ext\zipfs.jar;D:\1win7\java\jdk\jre\lib\javaws.jar;D:\1win7\java\jdk\jre\lib\jce.jar;D:\1win7\java\jdk\jre\lib\jfr.jar;D:\1win7\java\jdk\jre\lib\jfxrt.jar;D:\1win7\java\jdk\jre\lib\jsse.jar;D:\1win7\java\jdk\jre\lib\management-agent.jar;D:\1win7\java\jdk\jre\lib\plugin.jar;D:\1win7\java\jdk\jre\lib\resources.jar;D:\1win7\java\jdk\jre\lib\rt.jar;D:\1win7\scala;D:\1win7\scala\lib;D:\1win7\java\otherJar\spark-assembly-1.5.2-hadoop2.6.0.jar;D:\1win7\java\otherJar\adam-apis_2.10-0.18.3-SNAPSHOT.jar;D:\1win7\java\otherJar\adam-cli_2.10-0.18.3-SNAPSHOT.jar;D:\1win7\java\otherJar\adam-core_2.10-0.18.3-SNAPSHOT.jar;D:\1win7\java\otherJar\SparkCSV\com.databricks_spark-csv_2.10-1.4.0.jar;D:\1win7\java\otherJar\SparkCSV\com.univocity_univocity-parsers-1.5.1.jar;D:\1win7\java\otherJar\SparkCSV\org.apache.commons_commons-csv-1.1.jar;D:\1win7\java\otherJar\SparkAvro\spark-avro_2.10-2.0.1.jar;D:\1win7\java\otherJar\SparkAvro\spark-avro_2.10-2.0.1-javadoc.jar;D:\1win7\java\otherJar\SparkAvro\spark-avro_2.10-2.0.1-sources.jar;D:\1win7\java\otherJar\avro\spark-avro_2.10-2.0.2-SNAPSHOT.jar;D:\1win7\java\otherJar\tachyon\tachyon-assemblies-0.7.1-jar-with-dependencies.jar;D:\1win7\scala\lib\scala-actors-migration.jar;D:\1win7\scala\lib\scala-actors.jar;D:\1win7\scala\lib\scala-library.jar;D:\1win7\scala\lib\scala-reflect.jar;D:\1win7\scala\lib\scala-swing.jar;C:\Users\xubo\.m2\repository\com\github\scopt\scopt_2.10\3.2.0\scopt_2.10-3.2.0.jar;C:\Users\xubo\.m2\repository\org\scala-lang\scala-library\2.10.3\scala-library-2.10.3.jar;D:\1win7\idea\IntelliJ IDEA Community Edition 15.0.4\lib\idea_rt.jar" com.intellij.rt.execution.application.AppMain apache.spark.mllib.learning.recommend.ALSImplicitFromSpark
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/D:/1win7/java/otherJar/spark-assembly-1.5.2-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/D:/1win7/java/otherJar/adam-cli_2.10-0.18.3-SNAPSHOT.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/D:/1win7/java/otherJar/tachyon/tachyon-assemblies-0.7.1-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2016-05-16 22:55:37 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2016-05-16 22:55:40 WARN MetricsSystem:71 - Using default name DAGScheduler for source because spark.app.id is not set.
2016-05-16 22:55:43 WARN :139 - Your hostname, xubo-PC resolves to a loopback/non-reachable address: fe80:0:0:0:200:5efe:ca26:541d%20, but we couldn't find any external IP address!
2016-05-16 22:55:45 WARN BLAS:61 - Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
2016-05-16 22:55:45 WARN BLAS:61 - Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
2016-05-16 22:55:45 WARN LAPACK:61 - Failed to load implementation from: com.github.fommil.netlib.NativeSystemLAPACK
2016-05-16 22:55:45 WARN LAPACK:61 - Failed to load implementation from: com.github.fommil.netlib.NativeRefLAPACK
Mean Squared Error = 8.013749322020441
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
2016-05-16 22:55:52 WARN ParquetRecordReader:193 - Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
2016-05-16 22:55:52 WARN MatrixFactorizationModel:71 - User factor does not have a partitioner. Prediction on individual records could be slow.
2016-05-16 22:55:52 WARN MatrixFactorizationModel:71 - User factor is not cached. Prediction could be slow.
2016-05-16 22:55:53 WARN ParquetRecordReader:193 - Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
2016-05-16 22:55:53 WARN MatrixFactorizationModel:71 - Product factor does not have a partitioner. Prediction on individual records could be slow.
2016-05-16 22:55:53 WARN MatrixFactorizationModel:71 - Product factor is not cached. Prediction could be slow.
2016-05-16 22:55:53 WARN ParquetRecordReader:193 - Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
2016-05-16 22:55:53 WARN ParquetRecordReader:193 - Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
Rating(2,1,0.9965656182394096)
Process finished with exit code 0
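One possible reading of the large error: the recommended score above is close to 1 (0.9965...), while the MSE is computed against raw ratings. The small arithmetic sketch below shows how a near-1 prediction produces a large squared error against ratings on a wider scale. The input values (ratings 5.0 and 1.0, prediction 1.0) are hypothetical and chosen only for illustration. They are not read from the post's data file, and the object and method names are made up.

```scala
// Hypothetical illustration, not from the original post.
object ImplicitMseArithmetic {

  // Squared difference between an observed value and a prediction.
  def squaredError(observed: Double, predicted: Double): Double = {
    val err = observed - predicted
    err * err
  }

  // Mean squared error of a single constant prediction against observed values.
  def mse(observed: Seq[Double], predicted: Double): Double =
    observed.map(squaredError(_, predicted)).sum / observed.size

  def main(args: Array[String]): Unit = {
    val rawRatings = Seq(5.0, 1.0) // assumed example ratings
    val prediction = 1.0           // assumed preference-like score

    // Against raw ratings: ((5 - 1)^2 + (1 - 1)^2) / 2 = 8.0
    println(mse(rawRatings, prediction))

    // Against binarized preferences (1.0 for any positive rating): 0.0
    val preferences = rawRatings.map(r => if (r > 0) 1.0 else 0.0)
    println(mse(preferences, prediction))
  }
}
```

With these assumed inputs the two printed values are 8.0 and 0.0. This does not prove what happened in the run above, but it suggests the reported MSE may reflect a scale mismatch rather than a poor fit.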
References
【1】http://spark.apache.org/docs/1.5.2/mllib-guide.html
【2】http://spark.apache.org/docs/1.5.2/mllib-collaborative-filtering.html#collaborative-filtering
【3】https://github.com/xubo245/SparkLearning