Lesson 33: Spark Executor Internals Fully Decrypted: the Executor Working-Principle Diagram, ExecutorBackend Registration Source Code, Executor Instantiation Internals, and How the Executor Actually Works


This section covers the Executor working-principle diagram, the ExecutorBackend registration source code, the internals of Executor instantiation, and how the Executor actually does its work.

The Master tells a Worker to launch the process in which an Executor will live. In Standalone mode, that process is CoarseGrainedExecutorBackend.

- Master side: the Master sends an instruction to the Worker to start an Executor.

- Worker side: when the Worker receives the Master's instruction, it uses ExecutorRunner to launch another process in which the Executor will run. Note that it launches a separate process to host the Executor rather than starting the Executor directly. Why does the Worker start another process, have that process register with the Driver, and only then start the Executor? Because the Worker itself manages the resources of its machine and must report resource changes to the Master; the Worker is not meant to do computation, so computation must not run inside the Worker process. Moreover, a Spark cluster runs many applications that need many Executors; if Executors did not each get their own process and all of them ran inside the Worker, one crashing application would bring down the others.

- CoarseGrainedExecutorBackend is then launched as the process hosting the Executor. When CoarseGrainedExecutorBackend starts, it must register with the Driver side; it does so by sending a RegisterExecutor message, and the content of the registration is RegisterExecutor:

The onStart method of CoarseGrainedExecutorBackend.scala:

override def onStart() {
  logInfo("Connecting to driver: " + driverUrl)
  rpcEnv.asyncSetupEndpointRefByURI(driverUrl).flatMap { ref =>
    // This is a very fast action so we can use "ThreadUtils.sameThread"
    driver = Some(ref)
    ref.ask[Boolean](RegisterExecutor(executorId, self, hostname, cores, extractLogUrls))
  }(ThreadUtils.sameThread).onComplete {
    // This is a very fast action so we can use "ThreadUtils.sameThread"
    case Success(msg) =>
      // Always receive `true`. Just ignore it
    case Failure(e) =>
      exitExecutor(1, s"Cannot register with driver: $driverUrl", e, notifyDriver = false)
  }(ThreadUtils.sameThread)
}

Here RegisterExecutor is a case class; its source is:

case class RegisterExecutor(
    executorId: String,
    executorRef: RpcEndpointRef,
    hostname: String,
    cores: Int,
    logUrls: Map[String, String])
  extends CoarseGrainedClusterMessage
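As an aside, the onStart code above follows a standard Scala asynchronous pattern: an "ask" returns a Future, and onComplete handles Success or Failure. Below is a minimal, self-contained sketch of that pattern; askDriver is a hypothetical stand-in for ref.ask, not Spark's API:

import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}

object AskDemo extends App {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Stand-in for ref.ask[Boolean](RegisterExecutor(...)): pretend the driver accepts us.
  def askDriver(msg: Any): Future[Boolean] = Future { true }

  askDriver("RegisterExecutor").onComplete {
    case Success(_) => println("Successfully registered with driver") // reply is always `true`, ignored
    case Failure(e) => println(s"Cannot register with driver: $e")    // real code would call exitExecutor(1, ...)
  }

  Thread.sleep(500) // crude wait so the callback has time to run before this demo's JVM exits
}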

 

When CoarseGrainedExecutorBackend starts, it registers by sending a RegisterExecutor message to the Driver. After the Driver receives RegisterExecutor and the registration succeeds, it sends a RegisteredExecutor message back to CoarseGrainedExecutorBackend. The "Executor" being registered here has nothing to do with the Executor object that will actually do the work; what is really registered is the ExecutorBackend, so you can read the name RegisteredExecutor as if it were "RegisteredExecutorBackend".

Pay particular attention: when CoarseGrainedExecutorBackend starts and registers the "Executor" with the Driver, it is in essence registering the ExecutorBackend instance, which has no direct relationship with the Executor instance!

- CoarseGrainedExecutorBackend is the name of the process in which the Executor runs; CoarseGrainedExecutorBackend itself does not perform the task computation.

- The Executor is the object that actually processes Tasks; internally it computes Tasks with a thread pool. The Executor object runs inside the CoarseGrainedExecutorBackend process.

- CoarseGrainedExecutorBackend and Executor correspond one to one.

- CoarseGrainedExecutorBackend is a message communication endpoint (concretely, it implements ThreadSafeRpcEndpoint): it can send messages to the Driver and receive instructions from the Driver, such as launching a Task.

CoarseGrainedExecutorBackend extends ThreadSafeRpcEndpoint, so it is a message communication endpoint that can both receive and send messages. The source is as follows:

private[spark] class CoarseGrainedExecutorBackend(
    override val rpcEnv: RpcEnv,
    driverUrl: String,
    executorId: String,
    hostname: String,
    cores: Int,
    userClassPath: Seq[URL],
    env: SparkEnv)
  extends ThreadSafeRpcEndpoint with ExecutorBackend with Logging {
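To make the "message endpoint" idea concrete, here is a toy, self-contained sketch of a receive PartialFunction that pattern-matches case-class messages, the same style CoarseGrainedExecutorBackend uses. ToyEndpoint and its messages are illustrative only and are not Spark's RpcEndpoint API:

sealed trait ToyMessage
case class Registered(executorId: String) extends ToyMessage
case class Launch(taskId: Long) extends ToyMessage

class ToyEndpoint {
  // Messages are dispatched by pattern matching, one case per message type.
  def receive: PartialFunction[Any, Unit] = {
    case Registered(id) => println(s"registered as executor $id")
    case Launch(taskId) => println(s"asked to launch task $taskId")
    case other          => println(s"unexpected message: $other")
  }
}

object ToyEndpointDemo extends App {
  val endpoint = new ToyEndpoint
  endpoint.receive(Registered("0"))
  endpoint.receive(Launch(1L))
}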

 

CoarseGrainedExecutorBackend sends messages to our Driver. On the Driver side sits StandaloneSchedulerBackend (in Spark 2.0, SparkDeploySchedulerBackend was renamed StandaloneSchedulerBackend), which extends CoarseGrainedSchedulerBackend. When its start method runs, it starts a StandaloneAppClient (in Spark 2.0, AppClient was renamed StandaloneAppClient), which represents the application itself.

The start method of StandaloneSchedulerBackend.scala:

override def start() {
  super.start()
  launcherBackend.connect()

  // The endpoint for executors to talk to us
  val driverUrl = RpcEndpointAddress(
    sc.conf.get("spark.driver.host"),
    sc.conf.get("spark.driver.port").toInt,
    CoarseGrainedSchedulerBackend.ENDPOINT_NAME).toString
  val args = Seq(
    "--driver-url", driverUrl,
    "--executor-id", "{{EXECUTOR_ID}}",
    "--hostname", "{{HOSTNAME}}",
    "--cores", "{{CORES}}",
    "--app-id", "{{APP_ID}}",
    "--worker-url", "{{WORKER_URL}}")
  val extraJavaOpts = sc.conf.getOption("spark.executor.extraJavaOptions")
    .map(Utils.splitCommandString).getOrElse(Seq.empty)
  val classPathEntries = sc.conf.getOption("spark.executor.extraClassPath")
    .map(_.split(java.io.File.pathSeparator).toSeq).getOrElse(Nil)
  val libraryPathEntries = sc.conf.getOption("spark.executor.extraLibraryPath")
    .map(_.split(java.io.File.pathSeparator).toSeq).getOrElse(Nil)

  // When testing, expose the parent class path to the child. This is processed by
  // compute-classpath.{cmd,sh} and makes all needed jars available to child processes
  // when the assembly is built with the "*-provided" profiles enabled.
  val testingClassPath =
    if (sys.props.contains("spark.testing")) {
      sys.props("java.class.path").split(java.io.File.pathSeparator).toSeq
    } else {
      Nil
    }

  // Start executors with a few necessary configs for registering with the scheduler
  val sparkJavaOpts = Utils.sparkJavaOpts(conf, SparkConf.isExecutorStartupConf)
  val javaOpts = sparkJavaOpts ++ extraJavaOpts
  val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
    args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)
  val appUIAddress = sc.ui.map(_.appUIAddress).getOrElse("")
  val coresPerExecutor = conf.getOption("spark.executor.cores").map(_.toInt)
  // If we're using dynamic allocation, set our initial executor limit to 0 for now.
  // ExecutorAllocationManager will send the real initial limit to the Master later.
  val initialExecutorLimit =
    if (Utils.isDynamicAllocationEnabled(conf)) {
      Some(0)
    } else {
      None
    }
  val appDesc = new ApplicationDescription(sc.appName, maxCores, sc.executorMemory, command,
    appUIAddress, sc.eventLogDir, sc.eventLogCodec, coresPerExecutor, initialExecutorLimit)
  client = new StandaloneAppClient(sc.env.rpcEnv, masters, appDesc, this, conf)
  client.start()
  launcherBackend.setState(SparkAppHandle.State.SUBMITTED)
  waitForRegistration()
  launcherBackend.setState(SparkAppHandle.State.RUNNING)
}
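As a side note, the driverUrl built above is the address the launched executors use to reach the Driver. A rough sketch of its shape, assuming the usual spark://name@host:port rendering of RpcEndpointAddress and purely illustrative host and port values:

object DriverUrlDemo extends App {
  val driverHost   = "192.168.1.10"             // illustrative value of spark.driver.host
  val driverPort   = 50201                      // illustrative value of spark.driver.port
  val endpointName = "CoarseGrainedScheduler"   // CoarseGrainedSchedulerBackend.ENDPOINT_NAME
  val driverUrl = s"spark://$endpointName@$driverHost:$driverPort"
  println(driverUrl)                            // spark://CoarseGrainedScheduler@192.168.1.10:50201
}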

 

Now look at the source of StandaloneAppClient:

private[spark] class StandaloneAppClient(
    rpcEnv: RpcEnv,
    masterUrls: Array[String],
    appDescription: ApplicationDescription,
    listener: StandaloneAppClientListener,
    conf: SparkConf)
  extends Logging {
  ......
  private class ClientEndpoint(override val rpcEnv: RpcEnv) extends ThreadSafeRpcEndpoint
    with Logging {
  ......

 

In the Driver process there are two crucial Endpoints:

- ClientEndpoint: mainly responsible for registering the current application with the Master; it is an inner member of AppClient (StandaloneAppClient).

- DriverEndpoint: the driver of the whole application at runtime; it is an inner member of CoarseGrainedSchedulerBackend.

The DriverEndpoint inside CoarseGrainedSchedulerBackend:

class DriverEndpoint(override val rpcEnv: RpcEnv, sparkProperties: Seq[(String, String)])
  extends ThreadSafeRpcEndpoint with Logging {

 

DriverEndpoint receives the RegisterExecutor message and completes the registration on the Driver.

The RegisterExecutor handling in CoarseGrainedSchedulerBackend:

case RegisterExecutor(executorId, executorRef, hostname, cores, logUrls) =>
  if (executorDataMap.contains(executorId)) {
    executorRef.send(RegisterExecutorFailed("Duplicate executor ID: " + executorId))
    context.reply(true)
  } else {
    // If the executor's rpc env is not listening for incoming connections, `hostPort`
    // will be null, and the client connection should be used to contact the executor.
    val executorAddress = if (executorRef.address != null) {
        executorRef.address
      } else {
        context.senderAddress
      }
    logInfo(s"Registered executor $executorRef ($executorAddress) with ID $executorId")
    addressToExecutorId(executorAddress) = executorId
    totalCoreCount.addAndGet(cores)
    totalRegisteredExecutors.addAndGet(1)
    val data = new ExecutorData(executorRef, executorRef.address, hostname,
      cores, cores, logUrls)
    // This must be synchronized because variables mutated
    // in this block are read when requesting executors
    CoarseGrainedSchedulerBackend.this.synchronized {
      executorDataMap.put(executorId, data)
      if (currentExecutorIdCounter < executorId.toInt) {
        currentExecutorIdCounter = executorId.toInt
      }
      if (numPendingExecutors > 0) {
        numPendingExecutors -= 1
        logDebug(s"Decremented number of pending executors ($numPendingExecutors left)")
      }
    }
    executorRef.send(RegisteredExecutor)
    // Note: some tests expect the reply to come after we put the executor in the map
    context.reply(true)
    listenerBus.post(
      SparkListenerExecutorAdded(System.currentTimeMillis(), executorId, data))
    makeOffers()
  }

 

Inside the RegisterExecutor handling there is a data structure, executorDataMap, which is a key-value map:

private val executorDataMap = new HashMap[String, ExecutorData]

The executorEndpoint field in ExecutorData is an RpcEndpointRef; the source of ExecutorData is:

private[cluster] class ExecutorData(
    val executorEndpoint: RpcEndpointRef,
    val executorAddress: RpcAddress,
    override val executorHost: String,
    var freeCores: Int,
    override val totalCores: Int,
    override val logUrlMap: Map[String, String]
) extends ExecutorInfo(executorHost, totalCores, logUrlMap)

 

Now look at the RegisteredExecutor handling in CoarseGrainedExecutorBackend.scala:

override def receive: PartialFunction[Any, Unit] = {
  case RegisteredExecutor =>
    logInfo("Successfully registered with driver")
    try {
      executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)
    } catch {
      case NonFatal(e) =>
        exitExecutor(1, "Unable to create executor due to " + e.getMessage, e)
    }

 

After CoarseGrainedExecutorBackend receives the RegisteredExecutor message, it news up an Executor.

Executor itself is just an ordinary class:

private[spark] class Executor(
    executorId: String,
    executorHostname: String,
    env: SparkEnv,
    userClassPath: Seq[URL] = Nil,
    isLocal: Boolean = false)
  extends Logging {

 

Going back to ExecutorData.scala: the RpcEndpointRef in it is a proxy handle that stands in for the CoarseGrainedExecutorBackend. In the Driver, the ExecutorBackend's information is wrapped in ExecutorData and registered into the Driver's in-memory data structure executorDataMap:

private[cluster] class ExecutorData(
    val executorEndpoint: RpcEndpointRef,
    val executorAddress: RpcAddress,
    override val executorHost: String,
    var freeCores: Int,
    override val totalCores: Int,
    override val logUrlMap: Map[String, String]
) extends ExecutorInfo(executorHost, totalCores, logUrlMap)

 

The Executor registration message is handled by DriverEndpoint, which writes the data into the executorDataMap data structure inside CoarseGrainedSchedulerBackend. Since executorDataMap is a member of CoarseGrainedSchedulerBackend, the registration ultimately lands in CoarseGrainedSchedulerBackend, which thereby obtains the registration information of the Executor (really the ExecutorBackend).

In other words, at runtime DriverEndpoint writes this information into CoarseGrainedSchedulerBackend's in-memory executorDataMap, so CoarseGrainedSchedulerBackend keeps track of all the ExecutorBackend processes allocated to the current application, and inside each ExecutorBackend process instance an Executor object is responsible for actually running Tasks. The synchronized keyword is used so that concurrent writes to executorDataMap are safe.
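To make the role of synchronized concrete, here is a minimal, self-contained sketch of this registration bookkeeping pattern. SimpleBackend and SimpleExecutorData are illustrative stand-ins, not Spark's classes:

import java.util.concurrent.atomic.AtomicInteger
import scala.collection.mutable

case class SimpleExecutorData(host: String, totalCores: Int, var freeCores: Int)

class SimpleBackend {
  // Mirrors executorDataMap: executorId -> per-executor bookkeeping data.
  private val executorDataMap = new mutable.HashMap[String, SimpleExecutorData]
  private val totalCoreCount = new AtomicInteger(0)

  // Registration mutates state that scheduling threads also read,
  // so the whole update happens inside a synchronized block.
  def registerExecutor(id: String, host: String, cores: Int): Boolean = this.synchronized {
    if (executorDataMap.contains(id)) {
      false // duplicate executor ID, analogous to RegisterExecutorFailed
    } else {
      executorDataMap.put(id, SimpleExecutorData(host, cores, cores))
      totalCoreCount.addAndGet(cores)
      true
    }
  }
}

object SimpleBackendDemo extends App {
  val backend = new SimpleBackend
  println(backend.registerExecutor("0", "worker-1", 4)) // true
  println(backend.registerExecutor("0", "worker-1", 4)) // false: duplicate executor ID
}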

After CoarseGrainedExecutorBackend receives the RegisteredExecutor message sent from DriverEndpoint, it creates the Executor instance, and it is this Executor instance that actually performs the Task computation:

 


 

override def receive: PartialFunction[Any, Unit] = {
  case RegisteredExecutor =>
    logInfo("Successfully registered with driver")
    try {
      executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)
    } catch {
      case NonFatal(e) =>
        exitExecutor(1, "Unable to create executor due to " + e.getMessage, e)
    }

 

Let's look at Executor.scala, where threadPool is a thread pool:

private[spark] class Executor(
    executorId: String,
    executorHostname: String,
    env: SparkEnv,
    userClassPath: Seq[URL] = Nil,
    isLocal: Boolean = false)
  extends Logging {

  .......
  private val threadPool = ThreadUtils.newDaemonCachedThreadPool("Executor task launch worker")

 

The Executor is what actually performs the Task computation. When it is instantiated, it creates a thread pool, threadPool, in preparation for computing Tasks. threadPool is built by newDaemonCachedThreadPool, which creates a cached thread pool whose thread factory produces new threads with the required naming format. Its implementation is:

def newDaemonCachedThreadPool(prefix: String): ThreadPoolExecutor = {
  val threadFactory = namedThreadFactory(prefix)
  Executors.newCachedThreadPool(threadFactory).asInstanceOf[ThreadPoolExecutor]
}

 

The source of namedThreadFactory:

def namedThreadFactory(prefix: String): ThreadFactory = {
  new ThreadFactoryBuilder().setDaemon(true).setNameFormat(prefix + "-%d").build()
}

newCachedThreadPool creates a thread pool that spawns new threads on demand and reuses idle ones; new threads are created with the supplied ThreadFactory. The JDK source of newCachedThreadPool is:

public static ExecutorService newCachedThreadPool(ThreadFactory threadFactory) {
    return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                                  60L, TimeUnit.SECONDS,
                                  new SynchronousQueue<Runnable>(),
                                  threadFactory);
}

 

The resulting threadPool executes the Tasks Spark sends over efficiently, running them concurrently on multiple threads and reusing those threads. Once the thread pool is created, the next step is to wait for the Driver to send tasks to CoarseGrainedExecutorBackend; tasks are not sent directly to the Executor, because the Executor is not a message loop.
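The following self-contained sketch reproduces the daemon cached thread pool idea in plain Scala; only the thread-name prefix mirrors Spark's, and the submitted Runnable is just a stand-in for a TaskRunner:

import java.util.concurrent.{Executors, ThreadFactory, ThreadPoolExecutor, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object DaemonPoolDemo extends App {
  // Equivalent of namedThreadFactory: daemon threads named "<prefix>-<n>".
  def namedDaemonFactory(prefix: String): ThreadFactory = new ThreadFactory {
    private val count = new AtomicInteger(0)
    override def newThread(r: Runnable): Thread = {
      val t = new Thread(r, s"$prefix-${count.getAndIncrement()}")
      t.setDaemon(true) // daemon threads do not keep the JVM alive on their own
      t
    }
  }

  // Equivalent of newDaemonCachedThreadPool: threads are created on demand and reused.
  val pool = Executors.newCachedThreadPool(namedDaemonFactory("Executor task launch worker"))
    .asInstanceOf[ThreadPoolExecutor]

  // In Spark this Runnable would be a TaskRunner wrapping a deserialized Task.
  pool.execute(new Runnable {
    override def run(): Unit = println(s"running on ${Thread.currentThread().getName}")
  })

  pool.shutdown()
  pool.awaitTermination(10, TimeUnit.SECONDS)
}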

 

 

How exactly does the Executor work?

When the Driver sends over a Task, it actually sends it to the CoarseGrainedExecutorBackend RpcEndpoint rather than directly to the Executor (since the Executor is not a message loop, it can never directly receive messages sent from a remote endpoint).

The LaunchTask handling in CoarseGrainedExecutorBackend:

case LaunchTask(data) =>
  if (executor == null) {
    exitExecutor(1, "Received LaunchTask command but executor was null")
  } else {
    val taskDesc = ser.deserialize[TaskDescription](data.value)
    logInfo("Got assigned task " + taskDesc.taskId)
    executor.launchTask(this, taskId = taskDesc.taskId, attemptNumber = taskDesc.attemptNumber,
      taskDesc.name, taskDesc.serializedTask)
  }

 

The Driver sends LaunchTask to CoarseGrainedExecutorBackend, which in turn hands the work to a thread in the thread pool. It first checks whether executor is null; if it is, an error is reported and the process exits. If executor is not null, the task description is deserialized and executor.launchTask is called; attemptNumber indicates which attempt of the task this is, since tasks may be retried.

After the ExecutorBackend receives the message from the Driver, it calls launchTask to hand the task over to the Executor for execution:

def launchTask(
    context: ExecutorBackend,
    taskId: Long,
    attemptNumber: Int,
    taskName: String,
    serializedTask: ByteBuffer): Unit = {
  val tr = new TaskRunner(context, taskId = taskId, attemptNumber = attemptNumber, taskName,
    serializedTask)
  runningTasks.put(taskId, tr)
  threadPool.execute(tr)
}

 

When launchTask in Executor.scala receives the command to execute a Task, it first wraps the Task in a TaskRunner and then puts it into runningTasks, which is a simple data structure:

private val runningTasks = new ConcurrentHashMap[Long, TaskRunner]

 

launchTask then calls threadPool.execute(tr), handing the task to a thread in the thread pool. TaskRunner extends Runnable, Java's standard Runnable interface:

class TaskRunner(
    execBackend: ExecutorBackend,
    val taskId: Long,
    val attemptNumber: Int,
    taskName: String,
    serializedTask: ByteBuffer)
  extends Runnable {

 

TaskRunner is a concrete implementation of Java's Runnable interface; when the work is actually done, a thread from the thread pool runs it, and its run method is invoked to execute the Task.

The run method in Executor.scala ultimately calls task.run:

override def run(): Unit = {
  ......
  var threwException = true
  val value = try {
    val res = task.run(
      taskAttemptId = taskId,
      attemptNumber = attemptNumber,
      metricsSystem = env.metricsSystem)
    threwException = false
    res
  } finally {
    val releasedLocks = env.blockManager.releaseAllLocksForTask(taskId)
    ......

Drilling into the run method in Task.scala, it calls runTask:

final def run(
    taskAttemptId: Long,
    attemptNumber: Int,
    metricsSystem: MetricsSystem): T = {
  ......
  try {
    runTask(context)
  } catch {
  ......

 

When TaskRunner's run method is invoked, it calls Task's run method, which in turn calls runTask; the concrete Task implementations are ShuffleMapTask and ResultTask.
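A simplified, self-contained sketch of this run/runTask template pattern follows; these toy classes only mirror the control flow, not Spark's real Task, ShuffleMapTask, or ResultTask:

abstract class ToyTask[T] {
  // The common entry point called by the task runner; it delegates the real work to runTask.
  final def run(taskAttemptId: Long): T = {
    // Real Spark creates a TaskContext, memory manager, metrics, etc. here.
    runTask(taskAttemptId)
  }
  protected def runTask(taskAttemptId: Long): T
}

// Analogue of ShuffleMapTask: produces map output for the shuffle.
class ToyShuffleMapTask extends ToyTask[String] {
  override protected def runTask(id: Long): String = s"map output for attempt $id"
}

// Analogue of ResultTask: computes the final result for one partition.
class ToyResultTask extends ToyTask[Int] {
  override protected def runTask(id: Long): Int = 42
}

object ToyTaskDemo extends App {
  println(new ToyShuffleMapTask().run(0L))
  println(new ToyResultTask().run(1L))
}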

 


