aggregate vs treeAggregate
Source: Internet  Editor: 程序博客网  Date: 2024/05/22 13:40
aggregate
aggregate[U: ClassTag](zeroValue: U)(seqOp: (U, T) => U, combOp: (U, U) => U)
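aggregate applies seqOp within each partition, starting from zeroValue and folding over every element of that partition. It then applies combOp, again starting from zeroValue, to fold the per-partition results together.
Note: zeroValue is applied exactly once per partition by seqOp, and once more in the final combOp.
Example: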
scala> def seq(a:Int,b:Int):Int={
     | println("seq:"+a+":"+b)
     | math.min(a,b)}
seq: (a: Int, b: Int)Int

scala> def comb(a:Int,b:Int):Int={
     | println("comb:"+a+":"+b)
     | a+b}
comb: (a: Int, b: Int)Int

scala> val z = sc.parallelize(List(1,2,4,5,8,9),3)

scala> z.aggregate(3)(seq,comb)
seq:3:4
seq:3:1
seq:1:2
seq:3:8
seq:3:5
seq:3:9
comb:3:1
comb:4:3
comb:7:3
res10: Int = 10
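The trace above can be reproduced without a cluster. The following is a minimal plain-Scala sketch of the semantics described here, not Spark's implementation: `aggregateModel` is a hypothetical helper, and the partition split `[1,2] [4,5] [8,9]` is inferred from the printed trace.

```scala
// Plain-Scala model of the aggregate semantics described above (no Spark required).
// `aggregateModel` and the explicit partition split are illustrative assumptions.
object AggregateModel {
  def aggregateModel[T, U](partitions: Seq[Seq[T]], zeroValue: U)(
      seqOp: (U, T) => U, combOp: (U, U) => U): U = {
    // Each partition starts from zeroValue and folds its elements with seqOp.
    val perPartition = partitions.map(_.foldLeft(zeroValue)(seqOp))
    // The partition results are folded with combOp, again starting from zeroValue.
    perPartition.foldLeft(zeroValue)(combOp)
  }

  def main(args: Array[String]): Unit = {
    val partitions = Seq(Seq(1, 2), Seq(4, 5), Seq(8, 9))
    val result = aggregateModel(partitions, 3)((a, b) => math.min(a, b), _ + _)
    // Partition results are 1, 3, 3; with the extra zeroValue: 3 + 1 + 3 + 3 = 10
    println(result)
  }
}
```

Because zeroValue enters the final fold as well, a zeroValue that is not neutral for combOp (here 3 under addition) changes the result, which is why the REPL session returns 10 rather than 7.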
treeAggregate
treeAggregate[U: ClassTag](zeroValue: U)(seqOp: (U, T) => U, combOp: (U, U) => U, depth: Int = 2)
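Unlike aggregate, treeAggregate applies combOp in two or more levels (controlled by depth), merging partial results on the executors so that not every partition's result has to be sent to the driver. In addition, in my test the initial zeroValue did not take part in combOp.
Example: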
scala> z.treeAggregate(3)(seq,comb)
seq:3:4
seq:3:5
seq:3:1
seq:1:2
seq:3:8
seq:3:9
comb:3:3
comb:6:1
res12: Int = 7
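The result 7 is consistent with a model in which zeroValue is used only inside each partition and the partition results are then merged with combOp alone. The sketch below is plain Scala written to match the observed trace, not Spark's source; `treeAggregateModel` is a hypothetical name, and the order of the merges may differ from the order printed above.

```scala
// Plain-Scala model matching the observed treeAggregate trace (no Spark required).
// `treeAggregateModel` is an illustrative helper, not a Spark API.
object TreeAggregateModel {
  def treeAggregateModel[T, U](partitions: Seq[Seq[T]], zeroValue: U)(
      seqOp: (U, T) => U, combOp: (U, U) => U): U = {
    // seqOp still starts from zeroValue inside every partition.
    val perPartition = partitions.map(_.foldLeft(zeroValue)(seqOp))
    // Partition results are merged with combOp only; zeroValue is not added again.
    perPartition.reduce(combOp)
  }

  def main(args: Array[String]): Unit = {
    val partitions = Seq(Seq(1, 2), Seq(4, 5), Seq(8, 9))
    val result = treeAggregateModel(partitions, 3)((a, b) => math.min(a, b), _ + _)
    // Partition results are 1, 3, 3, so the merged value is 1 + 3 + 3 = 7
    println(result)
  }
}
```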
Comparison diagram:
Notes:
Aggregate
- each executor holds a portion of the learning set
- broadcast model to executors
- collect results to driver
TreeAggregate
- simple heuristic to add level
- perform partial aggregation by shipping results to other executors (by repartitioning)
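The two TreeAggregate notes can be sketched in plain Scala: a heuristic decides whether another level is worthwhile, and each level regroups the partial results and combines them before anything reaches the final step. The helper `mergeInLevels`, the formula for `scale`, and the `i % numGroups` regrouping are assumptions for illustration, loosely modeled on how such a heuristic is commonly described; they are not taken from the original post.

```scala
// Illustrative model of multi-level partial aggregation (no Spark required).
// The scale formula and the modulo regrouping are assumptions, not Spark's code.
object TreeLevelsModel {
  def mergeInLevels[U](partials: Seq[U], depth: Int)(combOp: (U, U) => U): U = {
    require(partials.nonEmpty && depth >= 1)
    // Assumed heuristic: shrink by roughly partials.size^(1/depth) per level, at least 2.
    val scale = math.max(math.ceil(math.pow(partials.size.toDouble, 1.0 / depth)).toInt, 2)
    var current = partials
    // Add a level only while it clearly reduces the number of values passed on.
    while (current.size > scale + math.ceil(current.size.toDouble / scale)) {
      val numGroups = current.size / scale
      // "Repartition": value i goes to group i % numGroups; each group is combined.
      current = current.zipWithIndex
        .groupBy { case (_, i) => i % numGroups }
        .toSeq.sortBy(_._1)
        .map { case (_, vs) => vs.map(_._1).reduce(combOp) }
    }
    // Whatever remains is combined in the final step (the driver, in Spark's case).
    current.reduce(combOp)
  }

  def main(args: Array[String]): Unit = {
    // 100 partial results with depth 2 are first reduced to 10 groups, then merged.
    println(mergeInLevels(1 to 100, 2)(_ + _))
  }
}
```

With an associative and commutative combOp the final value does not depend on how many levels are used; the levels only change how much data the last step has to receive.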