The difference between the fold and reduce functions (not specific to Spark)
Source: Internet  Editor: 程序博客网  Time: 2024/05/24 03:23
In a fold over a collection, the accumulator type may differ from the element type of the collection, and a zero element is usually given. In a reduce, you don't give a zero element, and the accumulator type is the same as the element type of the collection. A reduce is a special case of a fold, but not vice versa. The type signature of a left fold, with some examples, is as follows:
- foldLeft :: (a -> b -> a) -> a -> [b] -> a
- foldLeft (λx y. x + y) 0 [1, 2, 3] = 6
- foldLeft (λx y. x * y) 1 [2, 3, 5] = 30
- foldLeft (λx _. x + 1) 0 ["cat", "dog"] = 2
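For readers who want to run the examples above, here is a minimal Haskell sketch. The function `foldLeft` is written by hand to mirror the signature above (Haskell's Prelude provides the same behaviour as `foldl`), and the example names `sumExample`, `productExample`, and `countExample` are illustrative assumptions, not part of the original text.

```haskell
-- Hand-written left fold mirroring the signature above.
-- The Prelude already provides this as foldl; it is redefined
-- here only so the recursion is visible.
foldLeft :: (a -> b -> a) -> a -> [b] -> a
foldLeft _ z []       = z
foldLeft f z (x : xs) = foldLeft f (f z x) xs

-- Accumulator and elements share a type (Int).
sumExample :: Int
sumExample = foldLeft (\x y -> x + y) 0 [1, 2, 3]

productExample :: Int
productExample = foldLeft (\x y -> x * y) 1 [2, 3, 5]

-- Accumulator is an Int while the elements are Strings,
-- which a reduce could not express.
countExample :: Int
countExample = foldLeft (\x _ -> x + 1) 0 ["cat", "dog"]
```

Loading this in GHCi and evaluating the three example values should reproduce the results 6, 30, and 2 listed above.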
The folding function is generally neither commutative nor associative (its two arguments may not even have the same type), so the order of application matters and you have to differentiate between left folds and right folds. The example above is a left fold, because:
- -- (+) is shorthand for (λx y. x + y)
- foldLeft (+) 0 [1, 2, 3] = ((0 + 1) + 2) + 3
By contrast, a right fold groups the applications from the other end:
- foldRight (+) 0 [1, 2, 3] = 1 + (2 + (3 + 0))
In the context of addition, the difference between left and right folds doesn't matter: addition is associative and 0 is its identity, so you get the same answer either way. That doesn't apply in general.
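To see that the two folds can disagree, the following sketch uses the Prelude's `foldl` and `foldr` with subtraction, and with a string-building function that makes the grouping visible. The names `leftDiff`, `rightDiff`, `leftShape`, and `rightShape` are illustrative assumptions.

```haskell
-- Subtraction is not associative, so the grouping changes the result.
leftDiff :: Int
leftDiff = foldl (-) 0 [1, 2, 3]   -- ((0 - 1) - 2) - 3

rightDiff :: Int
rightDiff = foldr (-) 0 [1, 2, 3]  -- 1 - (2 - (3 - 0))

-- Build a parenthesised string to make each fold's grouping explicit.
leftShape :: String
leftShape = foldl (\acc x -> "(" ++ acc ++ "+" ++ x ++ ")") "0" ["1", "2", "3"]

rightShape :: String
rightShape = foldr (\x acc -> "(" ++ x ++ "+" ++ acc ++ ")") "0" ["1", "2", "3"]
```

Here `leftDiff` evaluates to -6 and `rightDiff` to 2, while `leftShape` and `rightShape` come out as "(((0+1)+2)+3)" and "(1+(2+(3+0)))".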
With a reduce, the conceptual assumption is that the operation is strictly associative, and often commutative as well. This allows the reduce to be parallelized and even distributed (as in "map reduce"), while a fold (which makes no such assumptions) is intended to be serial. No zero element is given, and it's an error to reduce an empty collection. The type signature of reduce is this:
- reduce :: (a -> a -> a) -> [a] -> a
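The role of associativity can be illustrated with a small sketch that reduces the two halves of a list independently and then combines the partial results, which is the regrouping a parallel reduce relies on. This code is purely sequential and only demonstrates the regrouping; `reduceInHalves` and the example names are hypothetical helpers, and the Prelude's `foldl1` stands in for a serial reduce.

```haskell
-- Reduce each half on its own, then combine the two partial results.
-- Assumes the list has at least two elements so both halves are non-empty
-- (foldl1 raises an error on an empty list).
reduceInHalves :: (a -> a -> a) -> [a] -> a
reduceInHalves f xs = f (foldl1 f left) (foldl1 f right)
  where
    (left, right) = splitAt (length xs `div` 2) xs

-- Addition is associative: regrouping does not change the answer.
sequentialSum, halvedSum :: Int
sequentialSum = foldl1 (+) [1 .. 8]
halvedSum     = reduceInHalves (+) [1 .. 8]

-- Subtraction is not associative: regrouping changes the answer.
sequentialDiff, halvedDiff :: Int
sequentialDiff = foldl1 (-) [1 .. 8]
halvedDiff     = reduceInHalves (-) [1 .. 8]
```

Both sums are 36, whereas `sequentialDiff` is -34 and `halvedDiff` is 8, which is why a reduce is only well defined for associative operations.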
Assuming associativity of the operator, you can implement reduce in terms of foldLeft like so:
- reduce f [] = error
- reduce f (head:tail) = foldLeft f head tail
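A runnable Haskell rendering of that definition might look like the sketch below. It uses the Prelude's `foldl` in place of `foldLeft`, and the error message text and the example names are assumptions added for illustration. The Prelude's own `foldl1` behaves in essentially the same way.

```haskell
-- reduce seeds a left fold with the first element of the list,
-- so no separate zero element is needed.
reduce :: (a -> a -> a) -> [a] -> a
reduce _ []       = error "reduce: empty collection"
reduce f (x : xs) = foldl f x xs

-- The result type is necessarily the element type.
largest :: Int
largest = reduce max [3, 9, 4]

joined :: String
joined = reduce (++) ["a", "b", "c"]
```

Here `largest` is 9 and `joined` is "abc"; calling `reduce` on an empty list raises the error, matching the rule that reducing an empty collection is not allowed.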
It's much harder to parallelize a general fold as if it were a reduce. Mathematically speaking, you can do it if you have first-class functions, by treating the b's in the collection as a -> a (that is, transformations of the accumulator) through the injection g b = λa. f a b, where f is the folding function, and using function composition (which is associative, although not commutative) as your reducing function. Then, you are building up a giant deferred computation of type a -> a that is finally applied to the given zero-value of type a. Whether that will be efficient is an open question, but it is mathematically sound. In code, it looks like this:
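- compose :: (a -> a) -> (a -> a) -> (a -> a)
- compose f g = (λx. f (g x))
- id :: (a -> a)
- id x = x
- foldLeft :: (a -> b -> a) -> a -> [b] -> a
- foldLeft f z coll =
-   (g coll) z
-   where g [] = id
-         -- compose's arguments are swapped so that the transformation for the
-         -- first element is applied first, as a left fold requires
-         g _ = reduce (λp q. compose q p) (map (λb . (λa . f a b)) coll)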
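The construction can be checked with a runnable Haskell sketch. The names `foldLeftViaReduce`, `inject`, and `andThen` are hypothetical, and `reduce` is the list-based definition from earlier, repeated so the block is self-contained. Note the direction of composition: for the result to agree with a left fold, the transformation built from the first element has to run first, so the reducing function here is left-to-right composition.

```haskell
reduce :: (a -> a -> a) -> [a] -> a
reduce _ []       = error "reduce: empty collection"
reduce f (x : xs) = foldl f x xs

-- A left fold expressed as a reduce over accumulator transformations.
foldLeftViaReduce :: (a -> b -> a) -> a -> [b] -> a
foldLeftViaReduce _ z []   = z
foldLeftViaReduce f z coll = reduce andThen (map inject coll) z
  where
    -- Turn each element b into a transformation of the accumulator.
    inject b = \a -> f a b
    -- Left-to-right composition: run p first, then q.
    andThen p q = q . p

-- A non-associative folding function makes ordering mistakes visible.
diffViaReduce :: Int
diffViaReduce = foldLeftViaReduce (-) 0 [1, 2, 3]

-- Accumulator (Int) and element (String) types differ.
lengthViaReduce :: Int
lengthViaReduce = foldLeftViaReduce (\n s -> n + length s) 0 ["cat", "dog"]
```

`diffViaReduce` evaluates to -6, the same as `foldl (-) 0 [1, 2, 3]`, and `lengthViaReduce` to 6. Because `andThen` is associative, the list of transformations could in principle be reduced in any grouping, although the deferred functions still have to be applied to the zero value one after another.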