【云星数据---Apache Flink实战系列(精品版)】:Apache Flink批处理API详解与编程实战012--DateSet实用API详解012

来源:互联网 发布:库里15 16赛季数据 编辑:程序博客网 时间:2024/06/08 13:55

DateSet的API详解十二

cross

def cross[O](other: DataSet[O]): CrossDataSet[T, O]Creates a new DataSet by forming the cartesian product of this DataSet and the other DataSet.交叉。拿第一个输入的每一个元素和第二个输入的每一个元素进行交叉操作。

cross实例一:基本tuple

执行程序:

//1.定义两个DataSetval coords1 = benv.fromElements((1,4,7),(2,5,8),(3,6,9))val coords2 = benv.fromElements((10,40,70),(20,50,80),(30,60,90))//2.交叉两个DataSet[Coord]val result1 = coords1.cross(coords2)//3.显示结果result1.collect

执行结果:

res71: Seq[((Int, Int, Int), (Int, Int, Int))] = Buffer(((1,4,7),(10,40,70)), ((2,5,8),(10,40,70)), ((3,6,9),(10,40,70)), ((1,4,7),(20,50,80)), ((2,5,8),(20,50,80)), ((3,6,9),(20,50,80)), ((1,4,7),(30,60,90)), ((2,5,8),(30,60,90)), ((3,6,9),(30,60,90)))

web ui中的执行效果:
这里写图片描述

cross实例二:case class

执行程序:

//1.定义 case classcase class Coord(id: Int, x: Int, y: Int)//2.定义两个DataSet[Coord]val coords1: DataSet[Coord] = benv.fromElements(Coord(1,4,7),Coord(2,5,8),Coord(3,6,9))val coords2: DataSet[Coord] = benv.fromElements(Coord(10,40,70),Coord(20,50,80),Coord(30,60,90))//3.交叉两个DataSet[Coord]val result1 = coords1.cross(coords2)//4.显示结果result1.collect

执行结果:

res69: Seq[(Coord, Coord)] = Buffer((Coord(1,4,7),Coord(10,40,70)), (Coord(2,5,8),Coord(10,40,70)), (Coord(3,6,9),Coord(10,40,70)), (Coord(1,4,7),Coord(20,50,80)), (Coord(2,5,8),Coord(20,50,80)), (Coord(3,6,9),Coord(20,50,80)), (Coord(1,4,7),Coord(30,60,90)), (Coord(2,5,8),Coord(30,60,90)), (Coord(3,6,9),Coord(30,60,90)))

cross实例三:自定义操作

执行程序:

//1.定义 case classcase class Coord(id: Int, x: Int, y: Int)//2.定义两个DataSet[Coord]val coords1: DataSet[Coord] = benv.fromElements(Coord(1,4,7),Coord(2,5,8),Coord(3,6,9))val coords2: DataSet[Coord] = benv.fromElements(Coord(1,4,7),Coord(2,5,8),Coord(3,6,9))//3.交叉两个DataSet[Coord],使用自定义方法val r = coords1.cross(coords2) {  (c1, c2) =>{        val dist =(c1.x + c2.x) +(c1.y + c2.y)        (c1.id, c2.id, dist)    }}//4.显示结果r.collect

执行结果:

res65: Seq[(Int, Int, Int)] = Buffer((1,1,22), (2,1,24), (3,1,26),(1,2,24), (2,2,26), (3,2,28), (1,3,26), (2,3,28), (3,3,30))
阅读全文
0 0
原创粉丝点击