Spark RDD Operators [2]: coalesce and repartition
Source: Internet  Editor: 程序博客网  Date: 2024/06/05 03:21
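1. Introduction to coalesce and repartition
Both operators change the number of partitions of an RDD that has already been created; they differ in whether a shuffle is performed. repartition(n) always shuffles and is equivalent to coalesce(n, shuffle = true), so it can either increase or decrease the partition count. coalesce(n) defaults to shuffle = false, which merges existing partitions without a shuffle and can therefore only reduce the count; asking it for more partitions than the RDD already has leaves the count unchanged.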
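The difference can be checked directly by comparing partition counts. The following is a minimal sketch, assuming a local Spark installation; the object name PartitionCountSketch, the helper counts, the local[2] master and the application name are illustrative choices, not part of the original post.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object PartitionCountSketch {
  // Partition counts obtained from a 2-partition RDD, in this order:
  // coalesce(3), coalesce(3, shuffle = true), repartition(3), coalesce(1).
  def counts(sc: SparkContext): Seq[Int] = {
    val rdd1 = sc.parallelize(List(1, 2, 3, 4, 5, 6, 7, 8, 9), 2)
    Seq(
      rdd1.coalesce(3).partitions.length,                 // no shuffle: cannot grow
      rdd1.coalesce(3, shuffle = true).partitions.length, // shuffle: can grow
      rdd1.repartition(3).partitions.length,              // same as the line above
      rdd1.coalesce(1).partitions.length                  // no shuffle: can shrink
    )
  }

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions made for this sketch.
    val conf = new SparkConf().setMaster("local[2]").setAppName("partition-count-sketch")
    val sc = new SparkContext(conf)
    try println(counts(sc).mkString(", ")) // prints "2, 3, 3, 1"
    finally sc.stop()
  }
}
```

In spark-shell, where sc already exists, the body of counts can be pasted on its own.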
2. Examples
2.1 repartition
scala> val rdd1 = sc.parallelize(List(1,2,3,4,5,6,7,8,9), 2)
rdd1: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[32] at parallelize at <console>:21

scala> rdd1.partitions.length
res18: Int = 2

scala> val rdd2=rdd1.repartition(3)
rdd2: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[36] at repartition at <console>:23

scala> rdd2.partitions.length
res19: Int = 3
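The partition count alone does not show where the elements ended up. A rough sketch of one way to look at that uses glom(), which turns each partition into an array. The names RepartitionLayoutSketch and layouts are hypothetical, and the exact placement of elements after the shuffle is left unspecified here because it depends on Spark's internal distribution.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RepartitionLayoutSketch {
  // Per-partition contents before and after repartition(3).
  def layouts(sc: SparkContext): (Array[Array[Int]], Array[Array[Int]]) = {
    val rdd1 = sc.parallelize(List(1, 2, 3, 4, 5, 6, 7, 8, 9), 2)
    val rdd2 = rdd1.repartition(3)
    // glom() gathers every partition into one array, so collect()
    // returns one array per partition.
    (rdd1.glom().collect(), rdd2.glom().collect())
  }

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions made for this sketch.
    val conf = new SparkConf().setMaster("local[2]").setAppName("repartition-layout-sketch")
    val sc = new SparkContext(conf)
    try {
      val (before, after) = layouts(sc)
      before.zipWithIndex.foreach { case (part, i) =>
        println(s"before, partition $i: ${part.mkString(",")}")
      }
      after.zipWithIndex.foreach { case (part, i) =>
        println(s"after, partition $i: ${part.mkString(",")}")
      }
    } finally sc.stop()
  }
}
```

Whatever the placement, the nine elements are all still present; only their assignment to partitions changes.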
2.2 coalesce
scala> val rdd1 = sc.parallelize(List(1,2,3,4,5,6,7,8,9), 2)
rdd1: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[47] at parallelize at <console>:21

scala> rdd1.partitions.length
res24: Int = 2

scala> val rdd2=rdd1.coalesce(3,true) // the second argument (shuffle = true) makes coalesce perform a shuffle
rdd2: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[51] at coalesce at <console>:23

scala> rdd2.partitions.length
res25: Int = 3
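Whether a given call actually introduced a shuffle can be read off the RDD's lineage. The sketch below walks the dependency graph and looks for a ShuffleDependency; the object ShuffleLineageSketch and the helper hasShuffle are hypothetical names introduced for illustration, and the local master is an assumption.

```scala
import org.apache.spark.{ShuffleDependency, SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object ShuffleLineageSketch {
  // True if any ancestor of rdd is reached through a shuffle dependency.
  def hasShuffle(rdd: RDD[_]): Boolean =
    rdd.dependencies.exists {
      case _: ShuffleDependency[_, _, _] => true
      case dep                           => hasShuffle(dep.rdd)
    }

  def main(args: Array[String]): Unit = {
    // Local master and app name are assumptions made for this sketch.
    val conf = new SparkConf().setMaster("local[2]").setAppName("shuffle-lineage-sketch")
    val sc = new SparkContext(conf)
    try {
      val rdd1 = sc.parallelize(List(1, 2, 3, 4, 5, 6, 7, 8, 9), 2)
      val merged = rdd1.coalesce(1)                 // shuffle defaults to false
      val shuffled = rdd1.coalesce(3, shuffle = true)

      println(hasShuffle(merged))   // prints "false"
      println(hasShuffle(shuffled)) // prints "true"
      // The textual lineage lists the intermediate RDDs as well.
      println(shuffled.toDebugString)
    } finally sc.stop()
  }
}
```

This also explains why the transcript above reports a MapPartitionsRDD for coalesce(3,true): with shuffle enabled, the result is built on top of a shuffled intermediate RDD rather than directly on rdd1.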