1. Spark source code study notes: reduceByKey

Source: Internet · Published by: java attach source · Editor: 程序博客网 · Time: 2024/05/16 19:41

Chapter 0. Prerequisites (readers already familiar with this can skip Chapter 0)

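Spark RDDs support two kinds of operations: transformations and actions. Put simply, a transformation derives a new RDD from an existing one (for example, map), while an action runs a computation on an RDD and returns the result to the driver (for example, reduce). The official documentation (http://spark.apache.org/docs/latest/programming-guide.html) lists which operations are transformations and which are actions. When Spark encounters a transformation, it does not execute it right away; it only records which RDD the transformation applies to, and runs the recorded transformations once it reaches an action.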

By default, every action re-executes all the transformations it depends on (even when two actions depend on the same transformation), unless you call cache or persist to keep the intermediate RDD around.
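The lazy behaviour described above can be imitated without Spark. The following is only a sketch in plain Scala: LazyDataset and LazyDemo are hypothetical names invented for illustration and are not part of the Spark API. A transformation only stores a recipe, an action runs the recipe, and the cache flag decides whether a second action recomputes it.

```scala
// Hypothetical stand-in for an RDD: it stores a recipe, never the data itself.
final class LazyDataset[A](compute: () => Seq[A]) {
  private var cached: Option[Seq[A]] = None
  private var cacheEnabled = false

  // Transformation: returns a new recipe and runs nothing.
  def map[B](f: A => B): LazyDataset[B] =
    new LazyDataset(() => materialize().map(f))

  // Analogue of cache(): keep the result after the first computation.
  def cache(): this.type = { cacheEnabled = true; this }

  private def materialize(): Seq[A] = cached.getOrElse {
    val result = compute()
    if (cacheEnabled) cached = Some(result)
    result
  }

  // Action: triggers the whole chain of recipes.
  def reduce(f: (A, A) => A): A = materialize().reduce(f)
}

object LazyDemo {
  // Returns (map calls before any action, sum, max, map calls after two actions).
  def run(useCache: Boolean): (Int, Int, Int, Int) = {
    var mapCalls = 0
    val base = new LazyDataset(() => Seq(1, 2, 3))
    val doubled = base.map { x => mapCalls += 1; x * 2 }
    if (useCache) doubled.cache()
    val callsBeforeAction = mapCalls // the transformation has not run yet
    val sum = doubled.reduce(_ + _)  // first action
    val max = doubled.reduce(_ max _) // second action
    (callsBeforeAction, sum, max, mapCalls)
  }
}
```

Without the cache flag the mapping function runs once per element for each action; with it, only the first action pays for the computation.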

This article starts from a transformation and reads the source code along the path of one job's execution. I picked a fairly representative transformation: reduceByKey.

See the official documentation for what reduceByKey does; I won't repeat it here.

Reading this part of the source lets us verify two questions (the answers are given in the text below):

1. Is a transformation executed immediately or not?

2. How does the number of partitions of the RDD produced by a transformation relate to that of its parent RDD?

Spark's reduceByKey method has three overloads (shown here as declared in JavaPairRDD):

def reduceByKey(partitioner: Partitioner, func: JFunction2[V, V, V]): JavaPairRDD[K, V]

def reduceByKey(func: JFunction2[V, V, V], numPartitions: Int): JavaPairRDD[K, V]

def reduceByKey(func: JFunction2[V, V, V]): JavaPairRDD[K, V]

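Besides the aggregation function, the first two overloads also let the caller specify a partitioner, or the number of partitions of the RDD that reduceByKey produces. The third one takes only the function: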

  def reduceByKey(func: JFunction2[V, V, V]): JavaPairRDD[K, V] = {
    fromRDD(reduceByKey(defaultPartitioner(rdd), func))
  }

Chapter 1. Obtaining the Partitioner

When the caller specifies neither a partitioner nor a partition count, Spark calls defaultPartitioner(rdd) to obtain a default partitioner. The source of defaultPartitioner is:

  /**
   * Choose a partitioner to use for a cogroup-like operation between a number of RDDs.
   *
   * If any of the RDDs already has a partitioner, choose that one.
   *
   * Otherwise, we use a default HashPartitioner. For the number of partitions, if
   * spark.default.parallelism is set, then we'll use the value from SparkContext
   * defaultParallelism, otherwise we'll use the max number of upstream partitions.
   *
   * Unless spark.default.parallelism is set, the number of partitions will be the
   * same as the number of partitions in the largest upstream RDD, as this should
   * be least likely to cause out-of-memory errors.
   *
   * We use two method parameters (rdd, others) to enforce callers passing at least 1 RDD.
   */
  def defaultPartitioner(rdd: RDD[_], others: RDD[_]*): Partitioner = {
    val bySize = (Seq(rdd) ++ others).sortBy(_.partitions.length).reverse
    for (r <- bySize if r.partitioner.isDefined && r.partitioner.get.numPartitions > 0) {
      return r.partitioner.get
    }
    if (rdd.context.conf.contains("spark.default.parallelism")) {
      new HashPartitioner(rdd.context.defaultParallelism)
    } else {
      new HashPartitioner(bySize.head.partitions.length)
    }
  }

The comment on the method describes its overall logic (you can also just read the comment above):

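The method takes one required RDD plus any number of additional RDDs, so callers must pass at least one. It first puts all the RDDs into one sequence and sorts it by partition count in descending order. It then walks the sequence and returns the partitioner of the first RDD that has one (with numPartitions > 0); that is, among the RDDs that have a partitioner, it picks the one with the most partitions. If none of the RDDs has a partitioner, it returns a HashPartitioner: when spark.default.parallelism is configured, the partition count is taken from that setting; otherwise the new RDD gets as many partitions as the largest of the parent RDDs it depends on.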
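The selection rule can be tried out without a Spark cluster. The sketch below is a plain-Scala imitation under stated assumptions: FakeRdd and Part are hypothetical stand-ins (an RDD reduced to its partition count and optional partitioner), and the spark.default.parallelism setting is modelled as an Option[Int] rather than read from a SparkConf.

```scala
object DefaultPartitionerSketch {
  // Hypothetical stand-ins, not Spark classes.
  final case class Part(numPartitions: Int)
  final case class FakeRdd(numPartitions: Int, partitioner: Option[Part])

  // defaultParallelism = Some(n) models "spark.default.parallelism is set".
  def choose(defaultParallelism: Option[Int], rdd: FakeRdd, others: FakeRdd*): Part = {
    // Largest RDD first, as in the original method.
    val bySize = (Seq(rdd) ++ others).sortBy(_.numPartitions).reverse
    bySize
      .flatMap(_.partitioner)              // keep only RDDs that have a partitioner
      .find(_.numPartitions > 0)           // first usable one wins
      .orElse(defaultParallelism.map(n => Part(n)))
      .getOrElse(Part(bySize.head.numPartitions))
  }
}
```

For example, two parents with 4 and 8 partitions and no partitioner yield 8 partitions, unless a default parallelism is configured, in which case that value is used.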

Going one step further, let's look at how HashPartitioner (a class that extends Partitioner) assigns keys to partitions. Source first:

/**
 * A [[org.apache.spark.Partitioner]] that implements hash-based partitioning using
 * Java's `Object.hashCode`.
 *
 * Java arrays have hashCodes that are based on the arrays' identities rather than their contents,
 * so attempting to partition an RDD[Array[_]] or RDD[(Array[_], _)] using a HashPartitioner will
 * produce an unexpected or incorrect result.
 */
class HashPartitioner(partitions: Int) extends Partitioner {
  require(partitions >= 0, s"Number of partitions ($partitions) cannot be negative.")

  def numPartitions: Int = partitions

  def getPartition(key: Any): Int = key match {
    case null => 0
    case _ => Utils.nonNegativeMod(key.hashCode, numPartitions)
  }

  override def equals(other: Any): Boolean = other match {
    case h: HashPartitioner =>
      h.numPartitions == numPartitions
    case _ =>
      false
  }

  override def hashCode: Int = numPartitions
}

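HashPartitioner is a hash-based partitioner built on Java's Object.hashCode (look up how Object.hashCode is implemented if you are curious). Because the hash code of a Java array is based on the array's identity rather than its contents, partitioning an RDD[Array[_]] or RDD[(Array[_], _)] with a HashPartitioner may give incorrect results. In other words, when the keys of the RDD are arrays, the default HashPartitioner cannot be used, and calling reduceByKey with it throws an exception (we will run into that code later, so bear with me).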
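The identity-versus-content point is easy to check in plain Scala. The snippet below is a small illustration (ArrayKeyDemo is a name made up for this example): two arrays with the same elements are not equal to each other, while lists built from them are equal and share a hash code, which is what a hash-based partitioner needs from its keys.

```scala
object ArrayKeyDemo {
  val a: Array[Int] = Array(1, 2, 3)
  val b: Array[Int] = Array(1, 2, 3)

  // Arrays compare by identity, so two distinct arrays are never ==.
  val arraysEqual: Boolean = a == b

  // Lists compare by content and hash by content.
  val listsEqual: Boolean = a.toList == b.toList
  val listHashesEqual: Boolean = a.toList.hashCode == b.toList.hashCode
}
```

Converting array keys to an immutable collection such as a List before grouping is one common way to get content-based keys; whether that fits a given job is a design decision outside the scope of this walk-through.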

The source shows that HashPartitioner assigns a key to a partition via Utils.nonNegativeMod(key.hashCode, numPartitions), and that method is simple and blunt:

  def nonNegativeMod(x: Int, mod: Int): Int = {
    val rawMod = x % mod
    rawMod + (if (rawMod < 0) mod else 0)
  }

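It simply takes the object's hashCode modulo numPartitions, adding mod back when the remainder is negative so that the result is always a valid partition index. (So the actual partitioning rule depends on how hashCode is implemented for the key; I won't go deeper here and may come back to it later.)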
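A worked example makes the negative case concrete. The snippet copies nonNegativeMod as shown above into a standalone object (NonNegativeModDemo and partitionOf are names chosen for this illustration) and mirrors the getPartition logic of HashPartitioner.

```scala
object NonNegativeModDemo {
  // Same body as Utils.nonNegativeMod shown above.
  def nonNegativeMod(x: Int, mod: Int): Int = {
    val rawMod = x % mod // on the JVM, -7 % 4 is -3
    rawMod + (if (rawMod < 0) mod else 0)
  }

  // Mirrors HashPartitioner.getPartition: null keys go to partition 0.
  def partitionOf(key: Any, numPartitions: Int): Int = key match {
    case null => 0
    case _ => nonNegativeMod(key.hashCode, numPartitions)
  }
}
```

With 4 partitions, a hash code of -7 gives a raw remainder of -3, which is shifted to partition 1; a hash code of 7 lands in partition 3 directly.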

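As mentioned at the start, reduceByKey has three overloads. When the caller passes the aggregation function and a number of partitions, a HashPartitioner with that many partitions is created directly. I won't go into detail here.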

Chapter 2. Building the child RDD

Back to where we started: every reduceByKey overload eventually ends up here:

  /**
   * Merge the values for each key using an associative and commutative reduce function. This will
   * also perform the merging locally on each mapper before sending results to a reducer, similarly
   * to a "combiner" in MapReduce.
   */
  def reduceByKey(partitioner: Partitioner, func: (V, V) => V): RDD[(K, V)] = self.withScope {
    combineByKeyWithClassTag[V]((v: V) => v, func, func, partitioner)
  }

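combineByKeyWithClassTag is where the child RDD is finally constructed (many methods, groupByKey for example, end up calling it to re-aggregate the data of an RDD into a new RDD; they differ in the createCombiner, mergeValue and mergeCombiners arguments they pass). The comment mentions a small detail: the values of each key are merged with a reduce function that must be associative and commutative, and before the results are sent to a reducer the merge is first performed locally on each mapper, similar to a combiner in MapReduce. This matches the statement in the official documentation that "reduceByKey and aggregateByKey create these structures on the map side, and 'ByKey operations generate these on the reduce side." The details will come in a later part (to be honest, I haven't finished reading that code yet; I'll fill it in later). As usual, source first: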

  /**
   * :: Experimental ::
   * Generic function to combine the elements for each key using a custom set of aggregation
   * functions. Turns an RDD[(K, V)] into a result of type RDD[(K, C)], for a "combined type" C
   * Note that V and C can be different -- for example, one might group an RDD of type
   * (Int, Int) into an RDD of type (Int, Seq[Int]). Users provide three functions:
   *
   *  - `createCombiner`, which turns a V into a C (e.g., creates a one-element list)
   *  - `mergeValue`, to merge a V into a C (e.g., adds it to the end of a list)
   *  - `mergeCombiners`, to combine two C's into a single one.
   *
   * In addition, users can control the partitioning of the output RDD, and whether to perform
   * map-side aggregation (if a mapper can produce multiple items with the same key).
   */
  @Experimental
  def combineByKeyWithClassTag[C](
      createCombiner: V => C,
      mergeValue: (C, V) => C,
      mergeCombiners: (C, C) => C,
      partitioner: Partitioner,
      mapSideCombine: Boolean = true,
      serializer: Serializer = null)(implicit ct: ClassTag[C]): RDD[(K, C)] = self.withScope {
    require(mergeCombiners != null, "mergeCombiners must be defined") // required as of Spark 0.9.0
    if (keyClass.isArray) {                                           // ------- 1)
      if (mapSideCombine) {
        throw new SparkException("Cannot use map-side combining with array keys.")
      }
      if (partitioner.isInstanceOf[HashPartitioner]) {
        throw new SparkException("Default partitioner cannot partition array keys.")
      }
    }
    val aggregator = new Aggregator[K, V, C](                         // ------- 2)
      self.context.clean(createCombiner),
      self.context.clean(mergeValue),
      self.context.clean(mergeCombiners))
    if (self.partitioner == Some(partitioner)) {                      // ------- 3)
      self.mapPartitions(iter => {
        val context = TaskContext.get()
        new InterruptibleIterator(context, aggregator.combineValuesByKey(iter, context))
      }, preservesPartitioning = true)
    } else {                                                          // ------- 4)
      new ShuffledRDD[K, V, C](self, partitioner)
        .setSerializer(serializer)
        .setAggregator(aggregator)
        .setMapSideCombine(mapSideCombine)
    }
  }

P.S. combineByKeyWithClassTag is annotated with @Experimental, which marks it as an experimental method: methods carrying this annotation may be changed or removed in minor Spark releases.

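Does part 1) of the code look familiar? When discussing how the Partitioner is obtained, the comment already said that HashPartitioner cannot be used when the keys of the RDD are arrays; part 1) performs that check. In addition, map-side combining is not allowed when the keys are arrays. Since reduceByKey calls this method without passing mapSideCombine, the default of true applies, so for array keys the first of the two checks is already enough to throw.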

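Part 2) creates an aggregator that stores the functions passed in for aggregating the RDD. self.context.clean cleans each closure so that it can be serialized and sent to the tasks (closures will be the main topic of the next article).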
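To see how the three functions held by the aggregator cooperate, here is a sketch in plain Scala over in-memory sequences. CombineSketch and its methods are illustrative names only; they imitate the roles of createCombiner, mergeValue and mergeCombiners with immutable Maps and do not reproduce Spark's Aggregator, its spilling, or its shuffle.

```scala
object CombineSketch {
  // Map side: fold one partition's records into per-key combiners.
  def combineValuesByKey[K, V, C](
      records: Seq[(K, V)],
      createCombiner: V => C,
      mergeValue: (C, V) => C): Map[K, C] =
    records.foldLeft(Map.empty[K, C]) { case (acc, (k, v)) =>
      val combined = acc.get(k).map(c => mergeValue(c, v)).getOrElse(createCombiner(v))
      acc.updated(k, combined)
    }

  // Reduce side: merge the combiners produced by the individual partitions.
  def combineCombinersByKey[K, C](
      partials: Seq[Map[K, C]],
      mergeCombiners: (C, C) => C): Map[K, C] =
    partials.flatten.foldLeft(Map.empty[K, C]) { case (acc, (k, c)) =>
      acc.updated(k, acc.get(k).map(old => mergeCombiners(old, c)).getOrElse(c))
    }

  // reduceByKey: createCombiner is the identity, func serves as both merge functions.
  def reduceByKey[K, V](partitions: Seq[Seq[(K, V)]], func: (V, V) => V): Map[K, V] = {
    val mapSide = partitions.map(p => combineValuesByKey[K, V, V](p, v => v, func))
    combineCombinersByKey(mapSide, func)
  }
}
```

Passing a list-building createCombiner and mergeValue instead (v => List(v) and (c, v) => c :+ v) turns the same skeleton into a grouping operation, which is the sense in which the callers differ only in the functions they supply.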

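Now the key parts, 3) and 4). The code first checks whether the partitioner of the current RDD (the one referenced by self) equals the partitioner passed in (in Scala, == calls the object's equals method). If they differ, branch 4) returns a ShuffledRDD that stores everything to be applied to the RDD, namely the partitioner, serializer, aggregator and mapSideCombine, and the method ends there without actually running the aggregation. This agrees with the official documentation ("All transformations in Spark are lazy, in that they do not compute their results right away. Instead, they just remember the transformations applied to some base dataset"). If the partitioners are equal, branch 3) runs instead, and here we see something different: self performs a mapPartitions and returns a MapPartitionsRDD. In other words, when the child RDD and the parent RDD it depends on use the same partitioner, no shuffle is needed. This is easy to understand: if the parent already uses the partitioner the child wants, the parent's partitions already meet the child's needs, so shuffling again would be pointless. (This still needs confirmation; I have not found a matching description in the official documentation yet, and corrections from readers who know better are welcome.)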
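The branch decision hinges on equals, and for HashPartitioner that means comparing numPartitions only. The sketch below reproduces just that decision in plain Scala; HashPart, Plan and plan are hypothetical names for illustration, and nothing here builds a real RDD.

```scala
object ShuffleDecisionSketch {
  // Hypothetical stand-in for HashPartitioner: equal when numPartitions match.
  final class HashPart(val numPartitions: Int) {
    override def equals(other: Any): Boolean = other match {
      case h: HashPart => h.numPartitions == numPartitions
      case _ => false
    }
    override def hashCode: Int = numPartitions
  }

  sealed trait Plan
  case object NarrowMapPartitions extends Plan // corresponds to branch 3)
  case object Shuffle extends Plan             // corresponds to branch 4)

  // Same comparison as `self.partitioner == Some(partitioner)`.
  def plan(parentPartitioner: Option[HashPart], requested: HashPart): Plan =
    if (parentPartitioner == Some(requested)) NarrowMapPartitions else Shuffle
}
```

A parent without a partitioner, or with a different partition count, leads to the shuffle branch; two distinct HashPart instances with the same partition count are treated as the same partitioner.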

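Inside mapPartitions, the code first obtains the TaskContext of the running task, which carries information such as whether the task has been interrupted or has completed, how much time GC took, and how much memory was used. It then wraps the Iterator produced by aggregating each partition in an InterruptibleIterator, a decorator class: an Iterator wrapped by it can stop when the task is killed. What remains is aggregator.combineValuesByKey; let's analyze it with the source: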

/**
 * :: DeveloperApi ::
 * A set of functions used to aggregate data.
 *
 * @param createCombiner function to create the initial value of the aggregation.
 * @param mergeValue function to merge a new value into the aggregation result.
 * @param mergeCombiners function to merge outputs from multiple mergeValue function.
 */
@DeveloperApi
case class Aggregator[K, V, C] (
    createCombiner: V => C,
    mergeValue: (C, V) => C,
    mergeCombiners: (C, C) => C) {

  def combineValuesByKey(
      iter: Iterator[_ <: Product2[K, V]],
      context: TaskContext): Iterator[(K, C)] = {
    val combiners = new ExternalAppendOnlyMap[K, V, C](createCombiner, mergeValue, mergeCombiners)
    combiners.insertAll(iter)
    updateMetrics(context, combiners)
    combiners.iterator
  }

  def combineCombinersByKey(
      iter: Iterator[_ <: Product2[K, C]],
      context: TaskContext): Iterator[(K, C)] = {
    val combiners = new ExternalAppendOnlyMap[K, C, C](identity, mergeCombiners, mergeCombiners)
    combiners.insertAll(iter)
    updateMetrics(context, combiners)
    combiners.iterator
  }

  /** Update task metrics after populating the external map. */
  private def updateMetrics(context: TaskContext, map: ExternalAppendOnlyMap[_, _, _]): Unit = {
    Option(context).foreach { c =>
      c.taskMetrics().incMemoryBytesSpilled(map.memoryBytesSpilled)
      c.taskMetrics().incDiskBytesSpilled(map.diskBytesSpilled)
      c.taskMetrics().incPeakExecutionMemory(map.peakMemoryUsedBytes)
    }
  }
}

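Let's look at combineValuesByKey first, and at the other methods when we need them. combineValuesByKey first creates an ExternalAppendOnlyMap, then inserts the Iterator of the current partition into it, then calls updateMetrics to update some information in the TaskContext (memory bytes spilled, disk bytes spilled, peak memory), and finally returns the resulting Iterator. The logic is simple; the two things that matter are: what is ExternalAppendOnlyMap, and how exactly is the Iterator inserted?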

First, ExternalAppendOnlyMap. Here is the official comment from the source:

/**
 * :: DeveloperApi ::
 * An append-only map that spills sorted content to disk when there is insufficient space for it
 * to grow.
 *
 * This map takes two passes over the data:
 *
 *   (1) Values are merged into combiners, which are sorted and spilled to disk as necessary
 *   (2) Combiners are read from disk and merged together
 *
 * The setting of the spill threshold faces the following trade-off: If the spill threshold is
 * too high, the in-memory map may occupy more memory than is available, resulting in OOM.
 * However, if the spill threshold is too low, we spill frequently and incur unnecessary disk
 * writes. This may lead to a performance regression compared to the normal case of using the
 * non-spilling AppendOnlyMap.
 */
@DeveloperApi
class ExternalAppendOnlyMap[K, V, C](

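ExternalAppendOnlyMap is a Spark container that behaves like a map supporting only append operations and that spills its contents, sorted, to disk when it has no room to grow. The map has a memory threshold: when it needs to grow beyond that threshold, it first tries to acquire more memory from the global shuffle memory pool, and if it cannot get enough it spills the in-memory contents to disk. If the threshold is set too high there is a risk of OOM; if it is set too low the map spills often and performance suffers. (My paraphrase is rough; the official comment above is the authoritative version.)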

Next, insertAll, a method of ExternalAppendOnlyMap. Source first:

  /**
   * Insert the given iterator of keys and values into the map.
   *
   * When the underlying map needs to grow, check if the global pool of shuffle memory has
   * enough room for this to happen. If so, allocate the memory required to grow the map;
   * otherwise, spill the in-memory map to disk.
   *
   * The shuffle memory usage of the first trackMemoryThreshold entries is not tracked.
   */
  def insertAll(entries: Iterator[Product2[K, V]]): Unit = {
    if (currentMap == null) {
      throw new IllegalStateException(
        "Cannot insert new elements into a map after calling iterator")
    }
    // An update function for the map that we reuse across entries to avoid allocating
    // a new closure each time
    var curEntry: Product2[K, V] = null
    val update: (Boolean, C) => C = (hadVal, oldVal) => {             // ------- 2)
      if (hadVal) mergeValue(oldVal, curEntry._2) else createCombiner(curEntry._2)
    }

    while (entries.hasNext) {                                         // ------- 1)
      curEntry = entries.next()
      val estimatedSize = currentMap.estimateSize()
      if (estimatedSize > _peakMemoryUsedBytes) {
        _peakMemoryUsedBytes = estimatedSize
      }
      if (maybeSpill(currentMap, estimatedSize)) {
        currentMap = new SizeTrackingAppendOnlyMap[K, C]
      }
      currentMap.changeValue(curEntry._1, update)
      addElementsRead()
    }
  }

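See the method comment for its purpose. As for the logic: part 1) iterates over all elements of the Iterator. For each element it first estimates the memory currently occupied by currentMap and, if that exceeds _peakMemoryUsedBytes, updates _peakMemoryUsedBytes (which records the peak memory used by the map). It then calls maybeSpill to decide whether the current map must be spilled from memory to disk; if the map was spilled, a new SizeTrackingAppendOnlyMap is created. After that it applies the update function from 2) to aggregate the record (the aggregation logic of reduceByKey is simple: createCombiner just returns its argument, and mergeValue is the func the caller passed to reduceByKey). Finally it calls addElementsRead() to count the record (the comment says this counts how many records have been read and is used to control how often spilling happens... "read"? Perhaps it means read into memory).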

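Tip: at this point let's take a detour and discuss how the number of partitions affects performance. Did it register, when you saw that ExternalAppendOnlyMap can be spilled to disk, that operations involving disk I/O or network transfer hurt performance badly? So how can we keep ExternalAppendOnlyMap from writing its in-memory records to disk, or at least trigger disk writes as rarely as possible? Two approaches come to mind: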

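(1) Clearly, if fewer records are inserted, ExternalAppendOnlyMap spills less often or not at all. How do we reduce the number of records? By increasing the number of partitions, so that each partition holds fewer records. We can therefore improve this part's performance by increasing the partition count (note that more partitions do not always mean fewer records per partition, because data skew may occur; that is a topic for another day. Also, more partitions are not always better: memory and CPU come into play, since every partition takes extra space and task scheduling has its own overhead).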

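(2) The second approach is to raise the thresholds that trigger spilling: the higher the bar, the less often it is reached. On the one hand, we can raise numElementsForceSpillThreshold through the configuration parameter spark.shuffle.spill.numElementsForceSpillThreshold; on the other hand, we can raise initialMemoryThreshold (the initial value of myMemoryThreshold) through spark.shuffle.spill.initialMemoryThreshold. Again, the values must not be set too high, or OOM becomes likely.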

How to tune these parameters depends on your machines and your experience.

Two pieces of code matter here: what exactly is the logic in maybeSpill that decides whether the map must be spilled, and what does the aggregation in 2) look like?

First, maybeSpill:

  /**
   * Spills the current in-memory collection to disk if needed. Attempts to acquire more
   * memory before spilling.
   *
   * @param collection collection to spill to disk
   * @param currentMemory estimated size of the collection in bytes
   * @return true if `collection` was spilled to disk; false otherwise
   */
  protected def maybeSpill(collection: C, currentMemory: Long): Boolean = {
    var shouldSpill = false
    if (elementsRead % 32 == 0 && currentMemory >= myMemoryThreshold) {              // ------ 1)
      // Claim up to double our current memory from the shuffle memory pool
      val amountToRequest = 2 * currentMemory - myMemoryThreshold
      val granted = acquireMemory(amountToRequest)
      myMemoryThreshold += granted
      // If we were granted too little memory to grow further (either tryToAcquire returned 0,
      // or we already had more memory than myMemoryThreshold), spill the current collection
      shouldSpill = currentMemory >= myMemoryThreshold
    }
    shouldSpill = shouldSpill || _elementsRead > numElementsForceSpillThreshold     // ------ 2)
    // Actually spill
    if (shouldSpill) {
      _spillCount += 1
      logSpillage(currentMemory)
      spill(collection)
      _elementsRead = 0
      _memoryBytesSpilled += currentMemory
      releaseMemory()
    }
    shouldSpill
  }

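For the if branch in 1) to spill the map from memory to disk, two conditions must hold. First, elementsRead must be a multiple of 32. As mentioned earlier, elementsRead controls how often the map can be spilled, so this check runs at most once every 32 records (unless the number of elements read has already exceeded the force-spill threshold, which is the logic in 2)). Second, currentMemory must be greater than or equal to myMemoryThreshold (think of it this way: elementsRead controls the frequency of spilling, myMemoryThreshold its granularity). When both hold, the code tries to grow the map's memory by 2 * currentMemory - myMemoryThreshold (in effect raising the reservation to twice the memory the map currently occupies; myMemoryThreshold can be read as the memory already granted). It calls acquireMemory to request that amount from the taskMemoryManager. The return value of acquireMemory() is the amount actually granted (note that you do not necessarily get what you asked for; if memory is short you may get only a little). myMemoryThreshold is increased by the grant, and the code then checks again whether the space is sufficient; if it is not, the map is spilled after all.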

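Part 2) checks whether the number of elements read since the last spill already exceeds the threshold numElementsForceSpillThreshold (configurable through spark.shuffle.spill.numElementsForceSpillThreshold); if so, a spill is required as well. What follows is straightforward: if a spill is needed, the code increments the spill count, writes a log entry, spills the collection, resets _elementsRead, records the number of bytes spilled, and releases the memory.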
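The interplay of the two conditions is easier to follow with numbers. The sketch below is a simplified plain-Scala model, not Spark's Spillable: Tracker, pool and forceSpillAfter are hypothetical names, the memory pool is a single counter, and the release of memory after a spill is left out.

```scala
object SpillSketch {
  // `pool` models the free shuffle memory; `threshold` plays the role of myMemoryThreshold.
  final class Tracker(var pool: Long, var threshold: Long, forceSpillAfter: Long) {
    var elementsRead = 0L
    var spillCount = 0

    // Grants at most what is left in the pool, like a memory manager under pressure.
    private def acquire(requested: Long): Long = {
      val granted = math.min(requested, pool)
      pool -= granted
      granted
    }

    def maybeSpill(currentMemory: Long): Boolean = {
      var shouldSpill = false
      if (elementsRead % 32 == 0 && currentMemory >= threshold) {
        // Try to raise the reservation to twice the current usage.
        threshold += acquire(2 * currentMemory - threshold)
        // Spill only if the grant was too small.
        shouldSpill = currentMemory >= threshold
      }
      // Force a spill once too many elements were read since the last one.
      shouldSpill = shouldSpill || elementsRead > forceSpillAfter
      if (shouldSpill) {
        spillCount += 1
        elementsRead = 0
      }
      shouldSpill
    }
  }
}
```

Starting with a pool of 100 and a threshold of 50, a map of size 60 at element 32 requests 70 and gets all of it, so the threshold becomes 120 and nothing is spilled. A later check at size 200 requests 280 but only 30 is left, the threshold stops at 150, and the map is spilled.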

Finally, the changeValue method. Source:

  /**
   * Set the value for key to updateFunc(hadValue, oldValue), where oldValue will be the old value
   * for key, if any, or null otherwise. Returns the newly updated value.
   */
  def changeValue(key: K, updateFunc: (Boolean, V) => V): V = {
    assert(!destroyed, destructionMessage)
    val k = key.asInstanceOf[AnyRef]
    if (k.eq(null)) {                                                 // ------ 1)
      if (!haveNullValue) {
        incrementSize()
      }
      nullValue = updateFunc(haveNullValue, nullValue)
      haveNullValue = true
      return nullValue
    }
    var pos = rehash(k.hashCode) & mask
    var i = 1
    while (true) {
      val curKey = data(2 * pos)                                      // ------ 2)
      if (k.eq(curKey) || k.equals(curKey)) {
        val newValue = updateFunc(true, data(2 * pos + 1).asInstanceOf[V])
        data(2 * pos + 1) = newValue.asInstanceOf[AnyRef]
        return newValue
      } else if (curKey.eq(null)) {                                   // ------ 3)
        val newValue = updateFunc(false, null.asInstanceOf[V])
        data(2 * pos) = k
        data(2 * pos + 1) = newValue.asInstanceOf[AnyRef]
        incrementSize()
        return newValue
      } else {                                                        // ------ 4)
        val delta = i
        pos = (pos + delta) & mask
        i += 1
      }
    }
    null.asInstanceOf[V] // Never reached but needed to keep compiler happy
  }

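Part 1) handles the null key: a record whose key is null is kept in the separate nullValue field, which is updated and returned; if no record with a null key has been seen before, incrementSize is called to grow the table size by one (so a null key counts as an element too). At 2), data is an Array holding the keys and values inserted so far (laid out as key1, value1, key2, value2, ...). The method first computes the key's position pos in data from its hash (so equal keys always start at the same position) and reads the key stored there (for position n, slot 2n holds the key and slot 2n+1 the value). If the stored key curKey equals the incoming key k, the stored value is merged with the value of the incoming record. If instead the slot is empty (curKey is null), which is branch 3), no equal key has been stored yet and the incoming record is stored directly (updateFunc here is the update function from insertAll; go back and reread it if you have forgotten its logic).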

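Finally, 4): when is this branch taken? When the position computed from the hash already holds a key, but that key is not equal to k; in other words, on a hash collision (different keys mapped to the same position). On a collision the code simply moves on to another slot, stepping ahead by 1, then 2, then 3 and so on (pos = (pos + delta) & mask, with i incremented on every attempt), and repeats the same logic until it finds an equal key or an empty slot to insert into.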
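The probing scheme can be reproduced in a few lines. The following is a simplified sketch, not Spark's AppendOnlyMap: TinyAppendOnlyMap and addTo are hypothetical names, the capacity is fixed, there is no growth, no null-key handling and no rehash step (the raw hashCode is masked directly), all of which are simplifying assumptions.

```scala
object ProbingMapSketch {
  final class TinyAppendOnlyMap[K, V](capacity: Int = 8) {
    require(capacity > 0 && (capacity & (capacity - 1)) == 0, "capacity must be a power of two")
    private val mask = capacity - 1
    // Keys live at even indices, their values at the following odd index.
    private val data = new Array[AnyRef](2 * capacity)
    private var curSize = 0
    def size: Int = curSize

    def changeValue(key: K, updateFunc: (Boolean, V) => V): V = {
      val k = key.asInstanceOf[AnyRef]
      var pos = k.hashCode & mask
      var i = 1
      while (i <= capacity + 1) {
        val curKey = data(2 * pos)
        if (curKey == null) {
          // Empty slot: first record for this key.
          val newValue = updateFunc(false, null.asInstanceOf[V])
          data(2 * pos) = k
          data(2 * pos + 1) = newValue.asInstanceOf[AnyRef]
          curSize += 1
          return newValue
        } else if (k == curKey) {
          // Same key: merge into the stored value.
          val newValue = updateFunc(true, data(2 * pos + 1).asInstanceOf[V])
          data(2 * pos + 1) = newValue.asInstanceOf[AnyRef]
          return newValue
        } else {
          // Collision: step ahead by 1, 2, 3, ... slots.
          pos = (pos + i) & mask
          i += 1
        }
      }
      throw new IllegalStateException("map is full")
    }
  }

  // Sum-style update, the analogue of reduceByKey(_ + _).
  def addTo[K](m: TinyAppendOnlyMap[K, Int], key: K, v: Int): Int =
    m.changeValue(key, (had: Boolean, old: Int) => if (had) old + v else v)
}
```

With a capacity of 8, the Int keys 1 and 9 both start at position 1; the second one collides and is placed one slot further on, and a later record for key 9 follows the same probe sequence and is merged there.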

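That completes the reduceByKey walk-through. One last thing to note: the mapPartitions branch stores the aggregator.combineValuesByKey call described above in the MapPartitionsRDD it returns to the caller, so it is still not executed immediately. For where this code actually runs, stay tuned for later articles.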

0 0

1. spark源码学习分享：reduceByKey

零、前置 （已经了解的看官可以跳过第0章）

一、Partitioner的获取

二、构建子RDD

零、前置（已经了解的看官可以跳过第0章）