Lesson 33: Spark Executor Internals Fully Decrypted: the Executor Working-Principle Diagram, ExecutorBackend Registration Source Code, Executor Instantiation Internals, and How the Executor Actually Works


This section covers the Executor working-principle diagram, the ExecutorBackend registration source code, the internals of Executor instantiation, and how the Executor actually does its work.

The Master tells a Worker to launch the process in which an Executor will live. In Standalone mode, that process is CoarseGrainedExecutorBackend.

- Master side: the Master sends an instruction to the Worker to start an Executor.

- Worker side: when the Worker receives the Master's instruction, it uses ExecutorRunner to launch another process in which the Executor will run. Note that it launches a separate process to host the Executor rather than starting the Executor directly. Why does the Worker start another process, have that process register with the Driver, and only then start the Executor? Because the Worker itself manages the resources of its machine and must report resource changes to the Master; the Worker is not meant to do computation, so computation must not run inside the Worker process. Moreover, a Spark cluster runs many applications that need many Executors; if Executors did not each get their own process and all of them ran inside the Worker, one crashing application would bring down the others.

- CoarseGrainedExecutorBackend is then launched as the process hosting the Executor. When CoarseGrainedExecutorBackend starts, it must register with the Driver side; it does so by sending a RegisterExecutor message, and the content of the registration is RegisterExecutor:

The onStart method of CoarseGrainedExecutorBackend.scala:

override def onStart() {
  logInfo("Connecting to driver: " + driverUrl)
  rpcEnv.asyncSetupEndpointRefByURI(driverUrl).flatMap { ref =>
    // This is a very fast action so we can use "ThreadUtils.sameThread"
    driver = Some(ref)
    ref.ask[Boolean](RegisterExecutor(executorId, self, hostname, cores, extractLogUrls))
  }(ThreadUtils.sameThread).onComplete {
    // This is a very fast action so we can use "ThreadUtils.sameThread"
    case Success(msg) =>
      // Always receive `true`. Just ignore it
    case Failure(e) =>
      exitExecutor(1, s"Cannot register with driver: $driverUrl", e, notifyDriver = false)
  }(ThreadUtils.sameThread)
}

Here RegisterExecutor is a case class; its source is:

case class RegisterExecutor(
    executorId: String,
    executorRef: RpcEndpointRef,
    hostname: String,
    cores: Int,
    logUrls: Map[String, String])
  extends CoarseGrainedClusterMessage
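As an aside, the onStart code above follows a standard Scala asynchronous pattern: an "ask" returns a Future, and onComplete handles Success or Failure. Below is a minimal, self-contained sketch of that pattern; askDriver is a hypothetical stand-in for ref.ask, not Spark's API:

import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}

object AskDemo extends App {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Stand-in for ref.ask[Boolean](RegisterExecutor(...)): pretend the driver accepts us.
  def askDriver(msg: Any): Future[Boolean] = Future { true }

  askDriver("RegisterExecutor").onComplete {
    case Success(_) => println("Successfully registered with driver") // reply is always `true`, ignored
    case Failure(e) => println(s"Cannot register with driver: $e")    // real code would call exitExecutor(1, ...)
  }

  Thread.sleep(500) // crude wait so the callback has time to run before this demo's JVM exits
}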

 

When CoarseGrainedExecutorBackend starts, it registers by sending a RegisterExecutor message to the Driver. After the Driver receives RegisterExecutor and the registration succeeds, it sends a RegisteredExecutor message back to CoarseGrainedExecutorBackend. The "Executor" being registered here has nothing to do with the Executor object that will actually do the work; what is really registered is the ExecutorBackend, so you can read the name RegisteredExecutor as if it were "RegisteredExecutorBackend".

Pay particular attention: when CoarseGrainedExecutorBackend starts and registers the "Executor" with the Driver, it is in essence registering the ExecutorBackend instance, which has no direct relationship with the Executor instance!

- CoarseGrainedExecutorBackend is the name of the process in which the Executor runs; CoarseGrainedExecutorBackend itself does not perform the task computation.

- The Executor is the object that actually processes Tasks; internally it computes Tasks with a thread pool. The Executor object runs inside the CoarseGrainedExecutorBackend process.

- CoarseGrainedExecutorBackend and Executor correspond one to one.

- CoarseGrainedExecutorBackend is a message communication endpoint (concretely, it implements ThreadSafeRpcEndpoint): it can send messages to the Driver and receive instructions from the Driver, such as launching a Task.

CoarseGrainedExecutorBackend extends ThreadSafeRpcEndpoint, so it is a message communication endpoint that can both receive and send messages. The source is as follows:

private[spark] class CoarseGrainedExecutorBackend(
    override val rpcEnv: RpcEnv,
    driverUrl: String,
    executorId: String,
    hostname: String,
    cores: Int,
    userClassPath: Seq[URL],
    env: SparkEnv)
  extends ThreadSafeRpcEndpoint with ExecutorBackend with Logging {
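To make the "message endpoint" idea concrete, here is a toy, self-contained sketch of a receive PartialFunction that pattern-matches case-class messages, the same style CoarseGrainedExecutorBackend uses. ToyEndpoint and its messages are illustrative only and are not Spark's RpcEndpoint API:

sealed trait ToyMessage
case class Registered(executorId: String) extends ToyMessage
case class Launch(taskId: Long) extends ToyMessage

class ToyEndpoint {
  // Messages are dispatched by pattern matching, one case per message type.
  def receive: PartialFunction[Any, Unit] = {
    case Registered(id) => println(s"registered as executor $id")
    case Launch(taskId) => println(s"asked to launch task $taskId")
    case other          => println(s"unexpected message: $other")
  }
}

object ToyEndpointDemo extends App {
  val endpoint = new ToyEndpoint
  endpoint.receive(Registered("0"))
  endpoint.receive(Launch(1L))
}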

 

CoarseGrainedExecutorBackend sends messages to our Driver. On the Driver side sits StandaloneSchedulerBackend (in Spark 2.0, SparkDeploySchedulerBackend was renamed StandaloneSchedulerBackend), which extends CoarseGrainedSchedulerBackend. When its start method runs, it starts a StandaloneAppClient (in Spark 2.0, AppClient was renamed StandaloneAppClient), which represents the application itself.

The start method of StandaloneSchedulerBackend.scala:

override def start() {
  super.start()
  launcherBackend.connect()

  // The endpoint for executors to talk to us
  val driverUrl = RpcEndpointAddress(
    sc.conf.get("spark.driver.host"),
    sc.conf.get("spark.driver.port").toInt,
    CoarseGrainedSchedulerBackend.ENDPOINT_NAME).toString
  val args = Seq(
    "--driver-url", driverUrl,
    "--executor-id", "{{EXECUTOR_ID}}",
    "--hostname", "{{HOSTNAME}}",
    "--cores", "{{CORES}}",
    "--app-id", "{{APP_ID}}",
    "--worker-url", "{{WORKER_URL}}")
  val extraJavaOpts = sc.conf.getOption("spark.executor.extraJavaOptions")
    .map(Utils.splitCommandString).getOrElse(Seq.empty)
  val classPathEntries = sc.conf.getOption("spark.executor.extraClassPath")
    .map(_.split(java.io.File.pathSeparator).toSeq).getOrElse(Nil)
  val libraryPathEntries = sc.conf.getOption("spark.executor.extraLibraryPath")
    .map(_.split(java.io.File.pathSeparator).toSeq).getOrElse(Nil)

  // When testing, expose the parent class path to the child. This is processed by
  // compute-classpath.{cmd,sh} and makes all needed jars available to child processes
  // when the assembly is built with the "*-provided" profiles enabled.
  val testingClassPath =
    if (sys.props.contains("spark.testing")) {
      sys.props("java.class.path").split(java.io.File.pathSeparator).toSeq
    } else {
      Nil
    }

  // Start executors with a few necessary configs for registering with the scheduler
  val sparkJavaOpts = Utils.sparkJavaOpts(conf, SparkConf.isExecutorStartupConf)
  val javaOpts = sparkJavaOpts ++ extraJavaOpts
  val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend",
    args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts)
  val appUIAddress = sc.ui.map(_.appUIAddress).getOrElse("")
  val coresPerExecutor = conf.getOption("spark.executor.cores").map(_.toInt)
  // If we're using dynamic allocation, set our initial executor limit to 0 for now.
  // ExecutorAllocationManager will send the real initial limit to the Master later.
  val initialExecutorLimit =
    if (Utils.isDynamicAllocationEnabled(conf)) {
      Some(0)
    } else {
      None
    }
  val appDesc = new ApplicationDescription(sc.appName, maxCores, sc.executorMemory, command,
    appUIAddress, sc.eventLogDir, sc.eventLogCodec, coresPerExecutor, initialExecutorLimit)
  client = new StandaloneAppClient(sc.env.rpcEnv, masters, appDesc, this, conf)
  client.start()
  launcherBackend.setState(SparkAppHandle.State.SUBMITTED)
  waitForRegistration()
  launcherBackend.setState(SparkAppHandle.State.RUNNING)
}
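As a side note, the driverUrl built above is the address the launched executors use to reach the Driver. A rough sketch of its shape, assuming the usual spark://name@host:port rendering of RpcEndpointAddress and purely illustrative host and port values:

object DriverUrlDemo extends App {
  val driverHost   = "192.168.1.10"             // illustrative value of spark.driver.host
  val driverPort   = 50201                      // illustrative value of spark.driver.port
  val endpointName = "CoarseGrainedScheduler"   // CoarseGrainedSchedulerBackend.ENDPOINT_NAME
  val driverUrl = s"spark://$endpointName@$driverHost:$driverPort"
  println(driverUrl)                            // spark://CoarseGrainedScheduler@192.168.1.10:50201
}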

 

Now look at the source of StandaloneAppClient:

private[spark] class StandaloneAppClient(
    rpcEnv: RpcEnv,
    masterUrls: Array[String],
    appDescription: ApplicationDescription,
    listener: StandaloneAppClientListener,
    conf: SparkConf)
  extends Logging {
  ......
  private class ClientEndpoint(override val rpcEnv: RpcEnv) extends ThreadSafeRpcEndpoint
    with Logging {
  ......

 

In the Driver process there are two crucial Endpoints:

- ClientEndpoint: mainly responsible for registering the current application with the Master; it is an inner member of AppClient (StandaloneAppClient).

- DriverEndpoint: the driver of the whole application at runtime; it is an inner member of CoarseGrainedSchedulerBackend.

The DriverEndpoint inside CoarseGrainedSchedulerBackend:

class DriverEndpoint(override val rpcEnv: RpcEnv, sparkProperties: Seq[(String, String)])
  extends ThreadSafeRpcEndpoint with Logging {

 

DriverEndpoint receives the RegisterExecutor message and completes the registration on the Driver.

The RegisterExecutor handling in CoarseGrainedSchedulerBackend:

case RegisterExecutor(executorId, executorRef, hostname, cores, logUrls) =>
  if (executorDataMap.contains(executorId)) {
    executorRef.send(RegisterExecutorFailed("Duplicate executor ID: " + executorId))
    context.reply(true)
  } else {
    // If the executor's rpc env is not listening for incoming connections, `hostPort`
    // will be null, and the client connection should be used to contact the executor.
    val executorAddress = if (executorRef.address != null) {
        executorRef.address
      } else {
        context.senderAddress
      }
    logInfo(s"Registered executor $executorRef ($executorAddress) with ID $executorId")
    addressToExecutorId(executorAddress) = executorId
    totalCoreCount.addAndGet(cores)
    totalRegisteredExecutors.addAndGet(1)
    val data = new ExecutorData(executorRef, executorRef.address, hostname,
      cores, cores, logUrls)
    // This must be synchronized because variables mutated
    // in this block are read when requesting executors
    CoarseGrainedSchedulerBackend.this.synchronized {
      executorDataMap.put(executorId, data)
      if (currentExecutorIdCounter < executorId.toInt) {
        currentExecutorIdCounter = executorId.toInt
      }
      if (numPendingExecutors > 0) {
        numPendingExecutors -= 1
        logDebug(s"Decremented number of pending executors ($numPendingExecutors left)")
      }
    }
    executorRef.send(RegisteredExecutor)
    // Note: some tests expect the reply to come after we put the executor in the map
    context.reply(true)
    listenerBus.post(
      SparkListenerExecutorAdded(System.currentTimeMillis(), executorId, data))
    makeOffers()
  }

 

Inside the RegisterExecutor handling there is a data structure, executorDataMap, which is a key-value map:

private val executorDataMap = new HashMap[String, ExecutorData]

The executorEndpoint field in ExecutorData is an RpcEndpointRef; the source of ExecutorData is:

private[cluster] class ExecutorData(
    val executorEndpoint: RpcEndpointRef,
    val executorAddress: RpcAddress,
    override val executorHost: String,
    var freeCores: Int,
    override val totalCores: Int,
    override val logUrlMap: Map[String, String]
) extends ExecutorInfo(executorHost, totalCores, logUrlMap)

 

Now look at the RegisteredExecutor handling in CoarseGrainedExecutorBackend.scala:

override def receive: PartialFunction[Any, Unit] = {
  case RegisteredExecutor =>
    logInfo("Successfully registered with driver")
    try {
      executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)
    } catch {
      case NonFatal(e) =>
        exitExecutor(1, "Unable to create executor due to " + e.getMessage, e)
    }

 

After CoarseGrainedExecutorBackend receives the RegisteredExecutor message, it news up an Executor.

Executor itself is just an ordinary class:

private[spark] class Executor(
    executorId: String,
    executorHostname: String,
    env: SparkEnv,
    userClassPath: Seq[URL] = Nil,
    isLocal: Boolean = false)
  extends Logging {

 

Going back to ExecutorData.scala: the RpcEndpointRef in it is a proxy handle that stands in for the CoarseGrainedExecutorBackend. In the Driver, the ExecutorBackend's information is wrapped in ExecutorData and registered into the Driver's in-memory data structure executorDataMap:

private[cluster] class ExecutorData(
    val executorEndpoint: RpcEndpointRef,
    val executorAddress: RpcAddress,
    override val executorHost: String,
    var freeCores: Int,
    override val totalCores: Int,
    override val logUrlMap: Map[String, String]
) extends ExecutorInfo(executorHost, totalCores, logUrlMap)

 

The Executor registration message is handled by DriverEndpoint, which writes the data into the executorDataMap data structure inside CoarseGrainedSchedulerBackend. Since executorDataMap is a member of CoarseGrainedSchedulerBackend, the registration ultimately lands in CoarseGrainedSchedulerBackend, which thereby obtains the registration information of the Executor (really the ExecutorBackend).

In other words, at runtime DriverEndpoint writes this information into CoarseGrainedSchedulerBackend's in-memory executorDataMap, so CoarseGrainedSchedulerBackend keeps track of all the ExecutorBackend processes allocated to the current application, and inside each ExecutorBackend process instance an Executor object is responsible for actually running Tasks. The synchronized keyword is used so that concurrent writes to executorDataMap are safe.
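To make the role of synchronized concrete, here is a minimal, self-contained sketch of this registration bookkeeping pattern. SimpleBackend and SimpleExecutorData are illustrative stand-ins, not Spark's classes:

import java.util.concurrent.atomic.AtomicInteger
import scala.collection.mutable

case class SimpleExecutorData(host: String, totalCores: Int, var freeCores: Int)

class SimpleBackend {
  // Mirrors executorDataMap: executorId -> per-executor bookkeeping data.
  private val executorDataMap = new mutable.HashMap[String, SimpleExecutorData]
  private val totalCoreCount = new AtomicInteger(0)

  // Registration mutates state that scheduling threads also read,
  // so the whole update happens inside a synchronized block.
  def registerExecutor(id: String, host: String, cores: Int): Boolean = this.synchronized {
    if (executorDataMap.contains(id)) {
      false // duplicate executor ID, analogous to RegisterExecutorFailed
    } else {
      executorDataMap.put(id, SimpleExecutorData(host, cores, cores))
      totalCoreCount.addAndGet(cores)
      true
    }
  }
}

object SimpleBackendDemo extends App {
  val backend = new SimpleBackend
  println(backend.registerExecutor("0", "worker-1", 4)) // true
  println(backend.registerExecutor("0", "worker-1", 4)) // false: duplicate executor ID
}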

After CoarseGrainedExecutorBackend receives the RegisteredExecutor message sent from DriverEndpoint, it creates the Executor instance, and it is this Executor instance that actually performs the Task computation:

 


 

override def receive: PartialFunction[Any, Unit] = {
  case RegisteredExecutor =>
    logInfo("Successfully registered with driver")
    try {
      executor = new Executor(executorId, hostname, env, userClassPath, isLocal = false)
    } catch {
      case NonFatal(e) =>
        exitExecutor(1, "Unable to create executor due to " + e.getMessage, e)
    }

 

Let's look at Executor.scala, where threadPool is a thread pool:

private[spark] class Executor(
    executorId: String,
    executorHostname: String,
    env: SparkEnv,
    userClassPath: Seq[URL] = Nil,
    isLocal: Boolean = false)
  extends Logging {

  .......
  private val threadPool = ThreadUtils.newDaemonCachedThreadPool("Executor task launch worker")

 

The Executor is what actually performs the Task computation. When it is instantiated, it creates a thread pool, threadPool, in preparation for computing Tasks. threadPool is built by newDaemonCachedThreadPool, which creates a cached thread pool whose thread factory produces new threads with the required naming format. Its implementation is:

def newDaemonCachedThreadPool(prefix: String): ThreadPoolExecutor = {
  val threadFactory = namedThreadFactory(prefix)
  Executors.newCachedThreadPool(threadFactory).asInstanceOf[ThreadPoolExecutor]
}

 

The source of namedThreadFactory:

def namedThreadFactory(prefix: String): ThreadFactory = {
  new ThreadFactoryBuilder().setDaemon(true).setNameFormat(prefix + "-%d").build()
}

newCachedThreadPool creates a thread pool that spawns new threads on demand and reuses idle ones; new threads are created with the supplied ThreadFactory. The JDK source of newCachedThreadPool is:

public static ExecutorService newCachedThreadPool(ThreadFactory threadFactory) {
    return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                                  60L, TimeUnit.SECONDS,
                                  new SynchronousQueue<Runnable>(),
                                  threadFactory);
}

 

The resulting threadPool executes the Tasks Spark sends over efficiently, running them concurrently on multiple threads and reusing those threads. Once the thread pool is created, the next step is to wait for the Driver to send tasks to CoarseGrainedExecutorBackend; tasks are not sent directly to the Executor, because the Executor is not a message loop.
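The following self-contained sketch reproduces the daemon cached thread pool idea in plain Scala; only the thread-name prefix mirrors Spark's, and the submitted Runnable is just a stand-in for a TaskRunner:

import java.util.concurrent.{Executors, ThreadFactory, ThreadPoolExecutor, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object DaemonPoolDemo extends App {
  // Equivalent of namedThreadFactory: daemon threads named "<prefix>-<n>".
  def namedDaemonFactory(prefix: String): ThreadFactory = new ThreadFactory {
    private val count = new AtomicInteger(0)
    override def newThread(r: Runnable): Thread = {
      val t = new Thread(r, s"$prefix-${count.getAndIncrement()}")
      t.setDaemon(true) // daemon threads do not keep the JVM alive on their own
      t
    }
  }

  // Equivalent of newDaemonCachedThreadPool: threads are created on demand and reused.
  val pool = Executors.newCachedThreadPool(namedDaemonFactory("Executor task launch worker"))
    .asInstanceOf[ThreadPoolExecutor]

  // In Spark this Runnable would be a TaskRunner wrapping a deserialized Task.
  pool.execute(new Runnable {
    override def run(): Unit = println(s"running on ${Thread.currentThread().getName}")
  })

  pool.shutdown()
  pool.awaitTermination(10, TimeUnit.SECONDS)
}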

 

 

How exactly does the Executor work?

When the Driver sends over a Task, it actually sends it to the CoarseGrainedExecutorBackend RpcEndpoint rather than directly to the Executor (since the Executor is not a message loop, it can never directly receive messages sent from a remote endpoint).

The LaunchTask handling in CoarseGrainedExecutorBackend:

case LaunchTask(data) =>
  if (executor == null) {
    exitExecutor(1, "Received LaunchTask command but executor was null")
  } else {
    val taskDesc = ser.deserialize[TaskDescription](data.value)
    logInfo("Got assigned task " + taskDesc.taskId)
    executor.launchTask(this, taskId = taskDesc.taskId, attemptNumber = taskDesc.attemptNumber,
      taskDesc.name, taskDesc.serializedTask)
  }

 

The Driver sends LaunchTask to CoarseGrainedExecutorBackend, which in turn hands the work to a thread in the thread pool. It first checks whether executor is null; if it is, an error is reported and the process exits. If executor is not null, the task description is deserialized and executor.launchTask is called; attemptNumber indicates which attempt of the task this is, since tasks may be retried.

After the ExecutorBackend receives the message from the Driver, it calls launchTask to hand the task over to the Executor for execution:

def launchTask(
    context: ExecutorBackend,
    taskId: Long,
    attemptNumber: Int,
    taskName: String,
    serializedTask: ByteBuffer): Unit = {
  val tr = new TaskRunner(context, taskId = taskId, attemptNumber = attemptNumber, taskName,
    serializedTask)
  runningTasks.put(taskId, tr)
  threadPool.execute(tr)
}

 

When launchTask in Executor.scala receives the command to execute a Task, it first wraps the Task in a TaskRunner and then puts it into runningTasks, which is a simple data structure:

private val runningTasks = new ConcurrentHashMap[Long, TaskRunner]

 

launchTask then calls threadPool.execute(tr), handing the task to a thread in the thread pool. TaskRunner extends Runnable, Java's standard Runnable interface:

class TaskRunner(
    execBackend: ExecutorBackend,
    val taskId: Long,
    val attemptNumber: Int,
    taskName: String,
    serializedTask: ByteBuffer)
  extends Runnable {

 

TaskRunner is a concrete implementation of Java's Runnable interface; when the work is actually done, a thread from the thread pool runs it, and its run method is invoked to execute the Task.

The run method in Executor.scala ultimately calls task.run:

override def run(): Unit = {
  ......
  var threwException = true
  val value = try {
    val res = task.run(
      taskAttemptId = taskId,
      attemptNumber = attemptNumber,
      metricsSystem = env.metricsSystem)
    threwException = false
    res
  } finally {
    val releasedLocks = env.blockManager.releaseAllLocksForTask(taskId)
    ......

Drilling into the run method in Task.scala, it calls runTask:

final def run(
    taskAttemptId: Long,
    attemptNumber: Int,
    metricsSystem: MetricsSystem): T = {
  ......
  try {
    runTask(context)
  } catch {
  ......

 

When TaskRunner's run method is invoked, it calls Task's run method, which in turn calls runTask; the concrete Task implementations are ShuffleMapTask and ResultTask.
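A simplified, self-contained sketch of this run/runTask template pattern follows; these toy classes only mirror the control flow, not Spark's real Task, ShuffleMapTask, or ResultTask:

abstract class ToyTask[T] {
  // The common entry point called by the task runner; it delegates the real work to runTask.
  final def run(taskAttemptId: Long): T = {
    // Real Spark creates a TaskContext, memory manager, metrics, etc. here.
    runTask(taskAttemptId)
  }
  protected def runTask(taskAttemptId: Long): T
}

// Analogue of ShuffleMapTask: produces map output for the shuffle.
class ToyShuffleMapTask extends ToyTask[String] {
  override protected def runTask(id: Long): String = s"map output for attempt $id"
}

// Analogue of ResultTask: computes the final result for one partition.
class ToyResultTask extends ToyTask[Int] {
  override protected def runTask(id: Long): Int = 42
}

object ToyTaskDemo extends App {
  println(new ToyShuffleMapTask().run(0L))
  println(new ToyResultTask().run(1L))
}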

 


