Implementing a reduceByWindow-like effect in Scala


reduceByKey
The Spark approach:

package com.linewell.streaming

import org.apache.spark.{SparkContext, SparkConf}

/**
  * Created by ctao on 2015/12/9.
  */
object WordCount extends App {
  val conf = new SparkConf().setMaster("local").setAppName("wordcount")
  val sc = new SparkContext(conf)
  val seq = Seq("hello", "world", "hello", "scala", "scala", "good")
  sc.parallelize(seq).map(x => (x, 1)).reduceByKey(_ + _).foreach(println)
}

The pure-Scala approach:

val seq = Seq("hello", "world", "hello", "scala", "scala", "good")

// for-comprehension form
for ((k, v) <- seq.map(x => (x, 1)).groupBy(_._1)) yield k -> v.map(_._2).sum

// equivalent map form
seq.map(x => (x, 1)).groupBy(_._1).map(x => x._1 -> x._2.map(_._2).sum)
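The same word count can also be written as a single foldLeft into an immutable Map, skipping the intermediate groups entirely. This is a minimal sketch of my own, not from the original post:

```scala
object FoldWordCount extends App {
  val seq = Seq("hello", "world", "hello", "scala", "scala", "good")

  // Accumulate counts in one pass: the default value 0 lets us
  // increment a key whether or not it has been seen before.
  val counts: Map[String, Int] =
    seq.foldLeft(Map.empty[String, Int].withDefaultValue(0)) {
      (acc, word) => acc.updated(word, acc(word) + 1)
    }

  counts.foreach(println)
  // counts("hello") == 2, counts("scala") == 2, counts("world") == 1
}
```

Compared with groupBy, this never materializes the per-key lists, which matters when the input is large.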

reduceByWindow
What follows is not exactly reduceByWindow, but a similar effect.
Source data: ("a", 1), ("a", 1), ("a", 1), ("b", 1), ("b", 1), ("c", 1), ("c", 1), ("c", 1), ("a", 1), ("b", 1), ("b", 1)
Target data:
(a,3)
(b,2)
(c,3)
(a,1)
(b,2)
Requirement analysis:
Merge consecutive runs: if the same key appears in consecutive elements, merge them into one pair; if the run is interrupted, the key starts a new key-value pair.
The pure-Scala approach:
Use pattern matching inside a foldLeft:
If the accumulator is empty, start a list from the first element.
Otherwise, compare the key of the accumulator's last element with the incoming element's key; if they match, merge the values into the last element, else append the element as a new pair.

/**
  * Created by ctao on 2015/12/8.
  */
object Test extends App {
  val source = Seq(("a", 1), ("a", 1), ("a", 1), ("b", 1), ("b", 1),
                   ("c", 1), ("c", 1), ("c", 1), ("a", 1), ("b", 1), ("b", 1))

  source.foldLeft(List[(String, Int)]()) {
    case (Nil, t) => List(t)
    case (xs, j)  =>
      if (xs.last._1 == j._1) xs.init :+ (xs.last._1, xs.last._2 + j._2)
      else xs :+ j
  }.foreach(println)
}
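Note that `xs.last`, `xs.init`, and `:+` are each O(n) on a List, so the foldLeft above is quadratic overall. A linear-time variant of my own (not from the original post) prepends onto the head of the accumulator and reverses once at the end:

```scala
object ConsecutiveMerge extends App {
  val source = Seq(("a", 1), ("a", 1), ("a", 1), ("b", 1), ("b", 1),
                   ("c", 1), ("c", 1), ("c", 1), ("a", 1), ("b", 1), ("b", 1))

  // Build the result in reverse: the most recent run sits at the head,
  // so merging a consecutive key is an O(1) head update.
  val merged: List[(String, Int)] =
    source.foldLeft(List.empty[(String, Int)]) {
      case ((hk, hv) :: tail, (k, v)) if hk == k => (hk, hv + v) :: tail
      case (acc, kv)                             => kv :: acc
    }.reverse

  merged.foreach(println)
  // merged == List((a,3), (b,2), (c,3), (a,1), (b,2))
}
```

The head of a reversed accumulator plays the same role as `xs.last` in the original, but without the repeated traversal.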