Spark修炼之道(进阶篇)——Spark入门到精通:第七节 Spark运行原理

来源:互联网 发布:node sass 安装 编辑:程序博客网 时间:2024/05/17 18:28

本节主要内容

  1. Spark运行方式
  2. Spark运行原理解析

本节内容及部分图片来自: 
http://blog.csdn.net/book_mmicky/article/details/25714419 
http://blog.csdn.net/yirenboy/article/details/47441465 
这两篇文件对Spark的运行架构原理进行了比较深入的讲解,写得非常好,建议大家认真看一下,在此向作者致敬!

1. Spark运行方式

用户编写完Spark应用程序之后,需要将应用程序提交到集群中运行,提交时使用脚本spark-submit进行,spark-submit可以带多种参数,参数选项可以通过下列命令查看

<code class="hljs ruby has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">root<span class="hljs-variable" style="color: rgb(102, 0, 102); box-sizing: border-box;">@sparkmaster</span><span class="hljs-symbol" style="color: rgb(0, 102, 102); box-sizing: border-box;">:/hadoopLearning/spark-</span><span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.5</span>.<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">0</span>-bin-hadoop2.<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">4</span>/bin<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># ./spark-submit --help</span></code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li></ul>

这里写图片描述

可以看到,spark-submit提交参数如下:

<code class="hljs haml has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">./bin/spark-submit \  -<span class="ruby" style="box-sizing: border-box;">-<span class="hljs-class" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">class</span> <span class="hljs-inheritance" style="box-sizing: border-box;"><<span class="hljs-parent" style="box-sizing: border-box;">main</span></span>-<span class="hljs-title" style="box-sizing: border-box; color: rgb(102, 0, 102);">class</span>></span></span>  -<span class="ruby" style="box-sizing: border-box;">-master <master-url> \</span>  -<span class="ruby" style="box-sizing: border-box;">-deploy-mode <deploy-mode> \</span>  -<span class="ruby" style="box-sizing: border-box;">-conf <key>=<value> \</span>  ... # other options  <application-jar> \  [application-arguments]</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li></ul>

下面介绍几种常用Spark应用程序提交方式:

(1)本地运行方式 –master local

<code class="hljs vhdl has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">//<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">--master local,本地运行方式。读取文件可以采用本地文件系统也可采用HDFS,这里给出的例子是采用本地文件系统</span>//从本地文件系统读取文件<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">file</span>:/hadoopLearning/spark-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.5</span><span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">.0</span>-bin-hadoop2<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">.4</span>/README.md//生成的结果也保存到本地文件系统:<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">file</span>:/SparkWordCountResultroot@sparkmaster:/hadoopLearning/spark-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.5</span><span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">.0</span>-bin-hadoop2<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">.4</span>/bin# ./spark-submit <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">--master local </span><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">--class SparkWordCount --executor-memory 1g </span>/root/IdeaProjects/SparkWordCount/<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">out</span>/artifacts/SparkWordCount_jar/SparkWordCount.jar <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">file</span>:/hadoopLearning/spark-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.5</span><span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">.0</span>-bin-hadoop2<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">.4</span>/README.md <span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">file</span>:/SparkWordCountResult</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li></ul>

这里写图片描述

(2)Standalone运行方式 –master spark://sparkmaster:7077

采用Spark自带的资源管理器进行集群资源管理

<code class="hljs livecodeserver has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">//standalone运行,指定<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">--master spark://sparkmaster:7077</span><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">//采用本地文件系统,也可采用HDFS</span><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">//没有指定deploy-mode,默认为client deploy mode</span>root@sparkmaster:/hadoopLearning/spark-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.5</span><span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">.0</span>-bin-hadoop2<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">.4</span>/bin<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># </span>./spark-submit <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">--master spark://sparkmaster:7077 </span><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">--class SparkWordCount --executor-memory 1g </span>/root/IdeaProjects/SparkWordCount/out/artifacts/SparkWordCount_jar/SparkWordCount.jar <span class="hljs-built_in" style="color: rgb(102, 0, 102); box-sizing: border-box;">file</span>:/hadoopLearning/spark-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.5</span><span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">.0</span>-bin-hadoop2<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">.4</span>/README.md <span class="hljs-built_in" style="color: rgb(102, 0, 102); box-sizing: border-box;">file</span>:/SparkWordCountResult2</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li><li style="box-sizing: border-box; padding: 0px 5px;">8</li><li style="box-sizing: border-box; padding: 0px 5px;">9</li><li style="box-sizing: border-box; padding: 0px 5px;">10</li></ul>

这里写图片描述 
图片来源:http://blog.csdn.net/book_mmicky/article/details/25714419,

在执行过程中,可以通过http://192.168.1.103:4040查看任务状态,192.168.1.103为sparkmaster IP地址: 
这里写图片描述

也可以指定为cluster deploy mode,例如:

<code class="hljs haml has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">root@sparkmaster:/hadoopLearning/spark-1.5.0-bin-hadoop2.4/bin# ./spark-submit -<span class="ruby" style="box-sizing: border-box;">-master <span class="hljs-symbol" style="color: rgb(0, 102, 102); box-sizing: border-box;">spark:</span>/<span class="hljs-regexp" style="color: rgb(0, 136, 0); box-sizing: border-box;">/sparkmaster:7077 </span></span>-<span class="ruby" style="box-sizing: border-box;">-deploy-mode cluster</span>-<span class="ruby" style="box-sizing: border-box;">-supervise --<span class="hljs-class" style="box-sizing: border-box;"><span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">class</span> <span class="hljs-title" style="box-sizing: border-box; color: rgb(102, 0, 102);">SparkWordCount</span> --<span class="hljs-title" style="box-sizing: border-box; color: rgb(102, 0, 102);">executor</span>-<span class="hljs-title" style="box-sizing: border-box; color: rgb(102, 0, 102);">memory</span> 1<span class="hljs-title" style="box-sizing: border-box; color: rgb(102, 0, 102);">g</span> </span></span><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">/root/IdeaProjects/SparkWordCount/out/artifacts/SparkWordCount_jar/SparkWordCount.jar </span> file:/hadoopLearning/spark-1.5.0-bin-hadoop2.4/README.md  file:/SparkWordCountResult3</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li><li style="box-sizing: border-box; padding: 0px 5px;">7</li></ul>

与 clinet deploy mode不同的是 cluster deploy mode中的SparkContext在集群内部创建。

(3)Yarn运行方式

采用Yarn作为底层资源管理器

<code class="hljs avrasm has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">//Yarn Clusterroot@sparkmaster:/hadoopLearning/spark-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.5</span><span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">.0</span>-bin-hadoop2<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">.4</span>/bin<span class="hljs-preprocessor" style="color: rgb(68, 68, 68); box-sizing: border-box;"># </span>./spark-submit --master yarn-cluster --class org<span class="hljs-preprocessor" style="color: rgb(68, 68, 68); box-sizing: border-box;">.apache</span><span class="hljs-preprocessor" style="color: rgb(68, 68, 68); box-sizing: border-box;">.spark</span><span class="hljs-preprocessor" style="color: rgb(68, 68, 68); box-sizing: border-box;">.examples</span><span class="hljs-preprocessor" style="color: rgb(68, 68, 68); box-sizing: border-box;">.SparkPi</span> --executor-memory <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>g /root/IdeaProjects/SparkWordCount/<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">out</span>/artifacts/SparkWordCount_jar/SparkWordCount<span class="hljs-preprocessor" style="color: rgb(68, 68, 68); box-sizing: border-box;">.jar</span></code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li></ul>

这里写图片描述 
图片来源:http://blog.csdn.net/yirenboy/article/details/47441465

<code class="hljs avrasm has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">//Yarn Clientroot@sparkmaster:/hadoopLearning/spark-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.5</span><span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">.0</span>-bin-hadoop2<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">.4</span>/bin<span class="hljs-preprocessor" style="color: rgb(68, 68, 68); box-sizing: border-box;"># </span>./spark-submit --master yarn-client  --class org<span class="hljs-preprocessor" style="color: rgb(68, 68, 68); box-sizing: border-box;">.apache</span><span class="hljs-preprocessor" style="color: rgb(68, 68, 68); box-sizing: border-box;">.spark</span><span class="hljs-preprocessor" style="color: rgb(68, 68, 68); box-sizing: border-box;">.examples</span><span class="hljs-preprocessor" style="color: rgb(68, 68, 68); box-sizing: border-box;">.SparkPi</span> --executor-memory <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>g /root/IdeaProjects/SparkWordCount/<span class="hljs-keyword" style="color: rgb(0, 0, 136); box-sizing: border-box;">out</span>/artifacts/SparkWordCount_jar/SparkWordCount<span class="hljs-preprocessor" style="color: rgb(68, 68, 68); box-sizing: border-box;">.jar</span></code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li><li style="box-sizing: border-box; padding: 0px 5px;">5</li><li style="box-sizing: border-box; padding: 0px 5px;">6</li></ul>

这里写图片描述 
图片来源:http://blog.csdn.net/yirenboy/article/details/47441465

//Yarn Client运行效果图 
这里写图片描述

2. Spark运行原理解析

(1)窄依赖与宽依赖

在前面讲的Spark编程模型当中,我们对RDD中的常用transformation与action 函数进行了讲解,我们提到RDD经过transformation操作后会生成新的RDD,前一个RDD与tranformation操作后的RDD构成了lineage关系,也即后一个RDD与前一个RDD存在一定的依赖关系,根据tranformation操作后RDD与父RDD中的分区对应关系,可以将依赖分为两种:宽依赖(wide dependency)和窄依赖(narrow dependency),如下图所示:

这里写图片描述

图中的实线空心矩形代表一个RDD,实线空心矩形中的带阴影的小矩形表示分区(partition)。从上图中可以看到, map,filter、union等transformation操作后的RDD仅依赖于父RDD的固定分区,它们是窄依赖的;而groupByKey后的RDD的分区与父RDD所有的分区都有依赖关系,此时它们就是宽依赖的。join操作存在两种情况,如果分区仅仅依赖于父RDD的某一分区,则是窄依赖的,否则就是宽依赖。

(2)Spark job运行原理

spark-submit提交Spark应用程序后,其执行流程如下: 
1 创建SparkContext对象,然后SparkContext会向Clutser Manager(集群资源管理器),例如Yarn、Standalone、Mesos等申请资源 
2 资源管理器在worker node上创建executor并分配资源(CPU、内存等),后期excutor会定时向资源管理器发送心跳信息 
3 SparkContext启动DAGScheduler,将提交的作业(job)转换成若干Stage,各Stage构成DAG(Directed Acyclic Graph有向无环图),各个Stage包含若干相task,这些task的集合被称为TaskSet 
4 TaskSet发送给TaskSet Scheduler,TaskSet Scheduler将Task发送给对应的Executor,同时SparkContext将应用程序代码发送到Executor,从而启动任务的执行 
5 Executor执行Task,完成后释放相应的资源。

下图给出了DAGScheduler的工作原理:

这里写图片描述

当RDDG触发相应的action操作(如collect)后,DAGScheduler会根据程序中的transformation类型构造相应的DAG并生成相应的stage,所有窄依赖构成一个stage,而单个宽依赖会生成相应的stage。上图中的黑色矩形表示这些RDD被缓存过,因此上图中的只需要计算stage2、 stage3即可

前面我们提到各Stage由若干个task组成,这些task构建taskset,最终交给Task Scheduler进行调度,最终将task发送到executor上执行,如下图所示 。

这里写图片描述

(3)spark-Shell jobs调度演示

在spark-master上,启动spark-shell

<code class="hljs livecodeserver has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">root@sparkmaster:/hadoopLearning/spark-<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1.5</span><span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">.0</span>-bin-hadoop2<span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">.4</span>/bin<span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;"># </span>./spark-<span class="hljs-built_in" style="color: rgb(102, 0, 102); box-sizing: border-box;">shell</span> <span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">--master spark://sparkmaster:7077 </span><span class="hljs-comment" style="color: rgb(136, 0, 0); box-sizing: border-box;">--executor-memory 1g</span></code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li></ul>

打开浏览器,输入: http://sparkmaster:4040/,并点击executors,可以查看集群中所有的executor,如下图所示 
这里写图片描述 
从图中可以看到sparkmaster除了是一个executor之外,它还是一个driver即(standalone clinet模式)

<code class="hljs livecodeserver has-numbering" style="display: block; padding: 0px; color: inherit; box-sizing: border-box; font-family: 'Source Code Pro', monospace;font-size:undefined; white-space: pre; border-radius: 0px; word-wrap: normal; background: transparent;">val rdd1= sc.textFile(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">"/README.md"</span>).flatMap(<span class="hljs-built_in" style="color: rgb(102, 0, 102); box-sizing: border-box;">line</span> => <span class="hljs-built_in" style="color: rgb(102, 0, 102); box-sizing: border-box;">line</span>.<span class="hljs-built_in" style="color: rgb(102, 0, 102); box-sizing: border-box;">split</span>(<span class="hljs-string" style="color: rgb(0, 136, 0); box-sizing: border-box;">" "</span>)).map(<span class="hljs-built_in" style="color: rgb(102, 0, 102); box-sizing: border-box;">word</span> => (<span class="hljs-built_in" style="color: rgb(102, 0, 102); box-sizing: border-box;">word</span>, <span class="hljs-number" style="color: rgb(0, 102, 102); box-sizing: border-box;">1</span>)).groupByKey().reduceByKey((<span class="hljs-operator" style="box-sizing: border-box;">a</span>,b)=><span class="hljs-operator" style="box-sizing: border-box;">a</span>+b)</code><ul class="pre-numbering" style="box-sizing: border-box; position: absolute; width: 50px; top: 0px; left: 0px; margin: 0px; padding: 6px 0px 40px; border-right-width: 1px; border-right-style: solid; border-right-color: rgb(221, 221, 221); list-style: none; text-align: right; background-color: rgb(238, 238, 238);"><li style="box-sizing: border-box; padding: 0px 5px;">1</li><li style="box-sizing: border-box; padding: 0px 5px;">2</li><li style="box-sizing: border-box; padding: 0px 5px;">3</li><li style="box-sizing: border-box; padding: 0px 5px;">4</li></ul>

这里写图片描述

点击stage 1 对应的的map,查看该stage中对应的task信息及在对应的executor上的执行情况: 
这里写图片描述

转载: http://blog.csdn.net/lovehuangjiaju/article/details/48634607

0 0
原创粉丝点击