Implementing K-Core with GraphX


Background

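Implementing k-core with GraphX is fairly straightforward. An earlier article from the Taobao technical department already provides a code snippet, and with minor changes it can be adapted to your own needs.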

Code

import org.apache.spark._
import org.apache.spark.graphx._
import org.apache.spark.storage.StorageLevel
import org.apache.spark.graphx.lib._

// load the graph (assumes `sc` is an existing SparkContext, e.g. in spark-shell)
val friendsGraph = GraphLoader.edgeListFile(sc, "data/friends.txt.", false, 512, StorageLevel.MEMORY_ONLY, StorageLevel.DISK_ONLY)

// store each vertex's degree as its attribute (0 for vertices without edges)
var degreeGraph = friendsGraph.outerJoinVertices(friendsGraph.degrees) {
  (vid, vd, degree) => degree.getOrElse(0)
}.cache()

val kNum = 200
var lastVerticeNum: Long = degreeGraph.numVertices
var thisVerticeNum: Long = -1
var isConverged = false
val maxIter = 10
var i = 1
while (!isConverged && i <= maxIter) {
  // drop vertices whose degree is below kNum
  val subGraph = degreeGraph.subgraph(
    vpred = (vid, degree) => degree >= kNum
  ).cache()

  // recompute degrees on the remaining subgraph
  degreeGraph = subGraph.outerJoinVertices(subGraph.degrees) {
    (vid, vd, degree) => degree.getOrElse(0)
  }.cache()

  // converged once an iteration removes no vertices
  thisVerticeNum = degreeGraph.numVertices
  if (lastVerticeNum == thisVerticeNum) {
    isConverged = true
    println("vertice num is " + thisVerticeNum + ", iteration is " + i)
  } else {
    println("lastVerticeNum is " + lastVerticeNum + ", thisVerticeNum is " + thisVerticeNum + ", iteration is " + i + ", not converge")
    lastVerticeNum = thisVerticeNum
  }
  i += 1
}
// do something to degreeGraph
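
To sanity-check the loop above on a tiny graph without a cluster, the same peeling idea can be sketched in plain Scala collections. This is only an illustrative sketch, not the Taobao snippet: the object name KCoreSketch and the function kCore are hypothetical, it assumes an undirected edge list that fits in memory and k >= 1, and it runs until convergence instead of stopping at maxIter.

```scala
// Plain-Scala sketch (no Spark) of the iterative peeling used above.
// KCoreSketch and kCore are hypothetical names introduced for illustration.
object KCoreSketch {
  /** Vertex set of the k-core of an undirected graph given as an edge list (assumes k >= 1). */
  def kCore(edges: Seq[(Long, Long)], k: Int): Set[Long] = {
    var remaining: Seq[(Long, Long)] = edges
    var converged = false
    while (!converged) {
      // Degree of each vertex: count every endpoint of every remaining edge
      val degree: Map[Long, Int] =
        remaining
          .flatMap { case (a, b) => Seq(a, b) }
          .groupBy(identity)
          .map { case (v, occurrences) => v -> occurrences.size }

      // Keep only edges whose two endpoints both still have degree >= k
      val kept = remaining.filter { case (a, b) => degree(a) >= k && degree(b) >= k }

      // Converged once an iteration removes nothing
      converged = kept.size == remaining.size
      remaining = kept
    }
    // Vertices that still have at least one edge form the k-core
    remaining.flatMap { case (a, b) => Seq(a, b) }.toSet
  }
}
```

For example, calling KCoreSketch.kCore on a triangle 1-2-3 with an extra edge 3-4 and k = 2 peels off vertex 4 and keeps the triangle, which is the behaviour the GraphX loop should reproduce on the same input with kNum = 2.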

Performance comes down mainly to how fast the subgraphs can be computed.

The end :)
