Source Code for a K-Means Clustering Implementation Class
Source: Internet  Editor: 程序博客网  Time: 2024/05/16 13:49
The data file comes from: http://archive.ics.uci.edu/ml/datasets/Wholesale+customers?cm_mc_uid=21918109261714715776095&cm_mc_sid_50200000=1476090999
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.{KMeans, KMeansModel}
import org.apache.spark.mllib.linalg.{Vector, Vectors}

object KMeansClustering {

  def main(args: Array[String]): Unit = {
    if (args.length < 5) {
      println("Usage:KMeansClustering trainingDataFilePath testDataFilePath numClusters numIterations runTimes")
      sys.exit(1)
    }

    val conf = new SparkConf().setAppName("Spark MLlib Exercise:K-Means Clustering")
    val sc = new SparkContext(conf)

    // Load the training data, skip the header line, and cache the parsed vectors
    // because K-means iterates over them several times.
    val rawTrainingData = sc.textFile(args(0))
    val parsedTrainingData = rawTrainingData
      .filter(!isColumnNameLine(_))
      .map(parseLine)
      .cache()

    // Cluster the data into numClusters classes using KMeans.
    val numClusters = args(2).toInt
    val numIterations = args(3).toInt
    // Note: newer Spark releases deprecate the `runs` argument.
    val runTimes = args(4).toInt

    val clusters: KMeansModel =
      KMeans.train(parsedTrainingData, numClusters, numIterations, runTimes)

    println("Cluster Number:" + clusters.clusterCenters.length)
    println("Cluster Centers Information Overview:")
    clusters.clusterCenters.zipWithIndex.foreach { case (center, clusterIndex) =>
      println("Center Point of Cluster " + clusterIndex + ":")
      println(center)
    }

    // Check which cluster each test record belongs to, based on the clustering result.
    val rawTestData = sc.textFile(args(1))
    val parsedTestData = rawTestData.map(parseLine)
    parsedTestData.collect().foreach { testDataLine =>
      val predictedClusterIndex: Int = clusters.predict(testDataLine)
      println("The data " + testDataLine.toString + " belongs to cluster " + predictedClusterIndex)
    }

    println("Spark MLlib K-means clustering test finished.")
  }

  // Parse one tab-separated line into a dense vector, ignoring empty fields.
  private def parseLine(line: String): Vector =
    Vectors.dense(line.split("\t").map(_.trim).filter(_.nonEmpty).map(_.toDouble))

  // The header line of the data file contains the column name "Channel".
  private def isColumnNameLine(line: String): Boolean =
    line != null && line.contains("Channel")
}
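The prediction step at the end of the program assigns each test record to the closest cluster center. The following is only a minimal sketch of that idea in plain Scala, without Spark, assuming squared Euclidean distance; the object name NearestCenterSketch, its two helper functions, and the sample centers are hypothetical and are not part of MLlib or of the original program.

```scala
object NearestCenterSketch {

  // Squared Euclidean distance between two points of equal dimension.
  def squaredDistance(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "points must have the same dimension")
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum
  }

  // Index of the center with the smallest squared distance to `point`.
  def nearestCenter(centers: Seq[Array[Double]], point: Array[Double]): Int = {
    require(centers.nonEmpty, "at least one center is required")
    centers.indices.minBy(i => squaredDistance(centers(i), point))
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical two-dimensional centers, chosen only for illustration.
    val centers = Seq(Array(0.0, 0.0), Array(10.0, 10.0))
    println(nearestCenter(centers, Array(1.0, 2.0)))
    println(nearestCenter(centers, Array(9.0, 8.0)))
  }
}
```

With these sample centers, the first point lies closer to the center at index 0 and the second closer to the center at index 1. In the Spark program the same role is played by clusters.predict, which works on MLlib vectors rather than raw arrays.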