join and cogroup operations in Spark
Source: Internet · Time: 2024/05/20 11:20
package com.scala

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD.rddToPairRDDFunctions

/**
 * Scala demo of join and cogroup: join emits one row per matching pair of
 * values with the same key, while cogroup emits one row per key with the
 * values from each RDD gathered into their own collection.
 */
object JoinAndCogroup {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("joinAndcogroup").setMaster("local[1]")
    // Create the context
    val sc = new SparkContext(conf)
    // Build the sample key-value lists
    val stuList = List((1, "tom"), (2, "jim"), (3, "cassie"))
    val scoreList = List((1, 20), (1, 90), (1, 30), (2, 23), (2, 23), (2, 80), (3, 90), (3, 100), (3, 100))
    // Turn them into pair RDDs
    val stuRDD = sc.parallelize(stuList)
    val scoreRDD = sc.parallelize(scoreList)
    /* // join: one (id, (name, score)) row for every matching pair
    val joinRDD = stuRDD.join(scoreRDD)
    // Collect to the driver before printing
    for (join2 <- joinRDD.collect()) {
      println("===========")
      println("id is " + join2._1)
      println("name is " + join2._2._1)
      println("score is " + join2._2._2)
    } */
    // cogroup: one (id, (names, scores)) row per key
    val groupRDD = stuRDD.cogroup(scoreRDD)
    // Collect to the driver before printing
    for (group2 <- groupRDD.collect()) {
      println("===========")
      println("id is " + group2._1)
      println("names are " + group2._2._1)
      println("scores are " + group2._2._2)
    }
    sc.stop()
  }
}
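To see the difference between the two operations without a Spark cluster, here is a minimal sketch in plain Scala collections (no Spark needed) that models what join and cogroup compute per key. The object name `JoinVsCogroupSketch` is introduced here for illustration; `stuList` and `scoreList` mirror the data used in the listing above.

```scala
// Plain-Scala model of Spark's join vs. cogroup on the same sample data.
object JoinVsCogroupSketch {
  def main(args: Array[String]): Unit = {
    val stuList   = List((1, "tom"), (2, "jim"), (3, "cassie"))
    val scoreList = List((1, 20), (1, 90), (1, 30), (2, 23), (2, 23), (2, 80),
                         (3, 90), (3, 100), (3, 100))

    // join: one output row per matching pair of values with the same key
    val joined = for {
      (id, name)   <- stuList
      (id2, score) <- scoreList
      if id == id2
    } yield (id, (name, score))
    println(joined.size) // 9 rows: every (name, score) combination per key

    // cogroup: one output row per key, values gathered into two collections
    val keys = (stuList.map(_._1) ++ scoreList.map(_._1)).distinct
    val cogrouped = keys.map { id =>
      (id, (stuList.filter(_._1 == id).map(_._2),
            scoreList.filter(_._1 == id).map(_._2)))
    }
    println(cogrouped.size) // 3 rows: one per key
  }
}
```

With three students and three scores per student, join yields nine rows (one per name-score pair), while cogroup yields three rows (one per id, each holding a list of names and a list of scores).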