Scala's fold family of functions and the Spark RDD fold operation explained

Source: Internet; editor: 程序博客网; time: 2024/05/17 22:41

Scala's fold family of functions is convenient to use. This post compares and summarizes them.

fold

Definition of fold:

def fold[A1 >: A](z: A1)(op: (A1, A1) => A1): A1

Folds the elements of this traversable or iterator using the specified associative binary operator.

The order in which operations are performed on elements is unspecified and may be nondeterministic.

A1: a type parameter for the binary operator, a supertype of A.

z: a neutral element for the fold operation; may be added to the result an arbitrary number of times, and must not change the result (e.g., Nil for list concatenation, 0 for addition, or 1 for multiplication).

op: a binary operator that must be associative.

returns: the result of applying the fold operator op between all the elements and z.


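The order in which fold applies op is unspecified, and A1 is a supertype of A. Both properties are useful for parallel computation: because both operands and the result of op have the same type A1, intermediate results computed on separate parts of the data can themselves be merged with op.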
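To make the merging of intermediate results concrete, here is a minimal sketch. The object name FoldChunks and the chunk size of 2 are assumptions chosen for illustration; the chunks stand in for pieces of data that independent workers might process.

```scala
object FoldChunks {
  val xs: List[Int] = List(1, 2, 3, 4, 5, 6)

  // Addition is associative and 0 is its neutral element, as fold requires.
  val op: (Int, Int) => Int = _ + _

  // Fold the whole list in one pass.
  val whole: Int = xs.fold(0)(op)

  // Fold each chunk of two elements on its own, then merge the partial
  // results with the very same operator. This only type-checks because
  // op takes two values of the result type and returns that type.
  val partials: List[Int] = xs.grouped(2).map(_.fold(0)(op)).toList
  val merged: Int = partials.fold(0)(op)
}
```

With an associative op and a neutral z, FoldChunks.whole and FoldChunks.merged are equal regardless of how the list is split.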


foldLeft

Definition of foldLeft:

def foldLeft[B](z: B)(op: (B, A) => B): B

Applies a binary operator to a start value and all elements of this sequence, going left to right.

Note: will not terminate for infinite-sized collections.

B: the result type of the binary operator.

z: the start value.

returns: the result of inserting op between consecutive elements of this sequence, going left to right with the start value z on the left:

op(...op(op(z, x_1), x_2)..., x_n)

where x_1, ..., x_n are the elements of this sequence.

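foldLeft applies the operator strictly from left to right. Its types also make it unsuitable for parallel execution: the operator has type (B, A) => B, so two partial results of type B cannot be combined with it.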
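A short sketch of the left-to-right order follows. The object name FoldLeftOrder and the sample values are assumptions made for illustration.

```scala
object FoldLeftOrder {
  val xs: List[Int] = List(1, 2, 3)

  // Subtraction is not associative, so the grouping is visible:
  // ((0 - 1) - 2) - 3
  val difference: Int = xs.foldLeft(0)(_ - _)

  // The accumulator type B (String) differs from the element type A (Int).
  val digits: String = xs.foldLeft("")((acc, x) => acc + x)

  // Prepending each element in turn reverses the list.
  val reversed: List[Int] = xs.foldLeft(List.empty[Int])((acc, x) => x :: acc)
}
```

In the digits example the operator has type (String, Int) => String, so there is no way to combine two partial strings with it; that is the typing obstacle to parallel execution described above.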

foldLeft also has a symbolic alias, /: (deprecated in newer Scala versions in favor of foldLeft):

def /:[B](z: B)(op: (B, A) => B): B = foldLeft(z)(op) 

foldRight

In the default implementation shown below, foldRight first reverses the data and then calls foldLeft. Like foldLeft, foldRight has a symbolic alias, :\

def foldRight[B](z: B)(op: (A, B) => B): B =
  reversed.foldLeft(z)((x, y) => op(y, x))

def :\[B](z: B)(op: (A, B) => B): B = foldRight(z)(op)  
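The relationship between foldRight and a reversed foldLeft can be checked with a small sketch. The object name FoldRightOrder and the sample values are assumptions for illustration.

```scala
object FoldRightOrder {
  val xs: List[Int] = List(1, 2, 3)

  // foldRight groups from the right: 1 - (2 - (3 - 0))
  val viaFoldRight: Int = xs.foldRight(0)(_ - _)

  // The same computation written as a foldLeft over the reversed list,
  // with the operator's arguments swapped.
  val viaReverse: Int = xs.reverse.foldLeft(0)((acc, x) => x - acc)

  // Consing each element onto the accumulator from the right rebuilds the list.
  val copy: List[Int] = xs.foldRight(List.empty[Int])(_ :: _)
}
```

Here the operator has type (A, B) => B, so the element comes first and the accumulator second, the mirror image of foldLeft.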


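How do fold and foldLeft differ from reduce?

reduce can be parallelized, whereas foldLeft cannot. This matters a great deal for distributed computing frameworks, and it is why Spark and similar frameworks keep a reduce operation.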

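However, Spark RDDs offer fold and reduce but no foldLeft or foldRight. The reason is:

foldLeft needs the elements to be processed in a fixed order; reduce has no such requirement. The operator passed to foldLeft does not have to be commutative (x op y may differ from y op x), and its two operands may even have different types, whereas the operator passed to reduce must be associative and commutative. Spark's RDD fold is likewise applied to each partition separately before the partition results are combined, so for a non-commutative operator its result can differ from that of fold on a local Scala collection.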
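The effect of processing partitions independently can be illustrated with a plain-Scala simulation. This is only a sketch under stated assumptions: partitionedFold is a hypothetical helper written for this example, not part of Scala or Spark, and a real distributed framework may merge partition results in an unspecified order.

```scala
object PartitionedFold {
  // Hypothetical helper: fold every partition starting from zero,
  // then fold the per-partition results, again starting from zero.
  def partitionedFold[A](parts: Seq[Seq[A]], zero: A)(op: (A, A) => A): A =
    parts.map(_.foldLeft(zero)(op)).foldLeft(zero)(op)

  val parts: Seq[Seq[Int]] = Seq(Seq(1, 2), Seq(3, 4))
  val all: Seq[Int] = parts.flatten

  // Associative, commutative operator with a neutral zero: both agree.
  val sumSequential: Int = all.foldLeft(0)(_ + _)
  val sumPartitioned: Int = partitionedFold(parts, 0)(_ + _)

  // Subtraction is neither associative nor commutative: results diverge.
  val diffSequential: Int = all.foldLeft(0)(_ - _)
  val diffPartitioned: Int = partitionedFold(parts, 0)(_ - _)

  // A zero that is not neutral is counted once per partition and once more
  // when merging, so it also changes the partitioned result.
  val badZeroSequential: Int = all.foldLeft(1)(_ + _)
  val badZeroPartitioned: Int = partitionedFold(parts, 1)(_ + _)
}
```

The divergence in the last two pairs is the reason the operator must be associative (and, when merge order is not fixed, commutative) and the zero value must be neutral.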

Stack Overflow has a good explanation of this question:

http://stackoverflow.com/questions/25158780/difference-between-reduce-and-foldleft-fold-in-functional-programming-particula?lq=1



