Spark Source Code Analysis: Initialization of TaskScheduler and SparkDeploySchedulerBackend

Source: Internet. Editor: 程序博客网. Time: 2024/06/18 13:29

This analysis is based on the Spark 1.5.1 source code.

What is SparkContext

Every Spark program starts by creating a SparkContext object. The official documentation describes it as follows:

Main entry point for Spark functionality. A SparkContext represents the connection to a Spark cluster, and can be used to create RDDs, accumulators and broadcast variables on that cluster. Only one SparkContext may be active per JVM. You must `stop()` the active SparkContext before creating a new one. This limitation may eventually be removed; see SPARK-2243 for more details.

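In other words:

SparkContext is the main entry point to Spark functionality. It represents the connection to a Spark cluster and can be used to create RDDs, accumulators, and broadcast variables on that cluster. Only one SparkContext can be active per JVM, so the active one must be stopped before a new one is created. Note that this restriction may eventually be lifted.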

Initialization of SparkContext

So what happens when a SparkContext object is created?
The first thing a SparkContext does during construction is create the TaskScheduler and the DAGScheduler.

What is TaskScheduler

The official description:
Low-level task scheduler interface, currently implemented exclusively by TaskSchedulerImpl. This interface allows plugging in different task schedulers. Each TaskScheduler schedules tasks for a single SparkContext. These schedulers get sets of tasks submitted to them from the DAGScheduler for each stage, and are responsible for sending the tasks to the cluster, running them, retrying if there are failures, and mitigating stragglers. They return events to the DAGScheduler.

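In other words, TaskScheduler is a low-level task scheduling interface that is currently implemented only by TaskSchedulerImpl; the interface exists so that different task schedulers can be plugged in. Each TaskScheduler serves exactly one SparkContext. It receives the set of tasks for each stage from the DAGScheduler, submits them to the cluster to run, and resubmits tasks that fail.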

TaskSchedulerImpl

The official description:
Schedules tasks for multiple types of clusters by acting through a SchedulerBackend.
It can also work with a local setup by using a LocalBackend and setting isLocal to true.
It handles common logic, like determining a scheduling order across jobs, waking up to launch
speculative tasks, etc.
Clients should first call initialize() and start(), then submit task sets through the
runTasks method.

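TaskSchedulerImpl schedules tasks for different types of clusters through a SchedulerBackend.
It can handle local mode by using a LocalBackend.
It handles common logic, such as determining the scheduling order across jobs and waking up to launch speculative tasks.
Clients should call initialize() and start() first, and only then submit task sets (the comment says runTasks; in the 1.5.1 source the TaskScheduler method that accepts a task set is named submitTasks).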

The flow from TaskScheduler and SparkDeploySchedulerBackend initialization to Application registration

Source code that creates the TaskScheduler

class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationClient {
  // ...
  // Create and start the scheduler
  val (sched, ts) = SparkContext.createTaskScheduler(this, master)
  _schedulerBackend = sched
  _taskScheduler = ts
  _dagScheduler = new DAGScheduler(this)
  _heartbeatReceiver.ask[Boolean](TaskSchedulerIsSet)

  // start TaskScheduler after taskScheduler sets DAGScheduler reference in DAGScheduler's
  // constructor
  _taskScheduler.start()
  // ...
}

// createTaskScheduler is called as SparkContext.createTaskScheduler, i.e. it lives in the companion object
object SparkContext extends Logging {
  private def createTaskScheduler(
      sc: SparkContext,
      master: String): (SchedulerBackend, TaskScheduler) = {
    master match {
      // A different backend is created for each deploy mode; here we analyze standalone mode
      case SPARK_REGEX(sparkUrl) =>
        // Create the TaskSchedulerImpl
        val scheduler = new TaskSchedulerImpl(sc)
        val masterUrls = sparkUrl.split(",").map("spark://" + _)
        // Create a SparkDeploySchedulerBackend
        val backend = new SparkDeploySchedulerBackend(scheduler, sc, masterUrls)
        // Inject the backend into the scheduler
        scheduler.initialize(backend)
        (backend, scheduler)
      // ... (cases for the other deploy modes omitted)
    }
  }
}

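As the code above shows, the TaskScheduler is created as soon as the SparkContext is constructed. createTaskScheduler builds a different backend depending on the cluster mode. Since we are analyzing standalone mode here, the backend it builds is a SparkDeploySchedulerBackend.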
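
The standalone branch above turns a single master string into one URL per Master. A minimal sketch of just that step is shown here; the object and method names are hypothetical, and the regular expression is an assumption modelled on the SPARK_REGEX pattern match in the excerpt, not copied from Spark.

```scala
object MasterUrlSketch {
  // Assumption: a standalone master string looks like "spark://host1:port1,host2:port2".
  private val SparkRegex = """spark://(.*)""".r

  /** Returns one "spark://host:port" URL per Master, or None if the
    * string is not a standalone master URL (for example "local[2]"). */
  def standaloneMasterUrls(master: String): Option[List[String]] = master match {
    case SparkRegex(sparkUrl) =>
      // Same transformation as in the excerpt: split on commas and re-add the scheme.
      Some(sparkUrl.split(",").map("spark://" + _).toList)
    case _ =>
      None
  }
}
```

For instance, MasterUrlSketch.standaloneMasterUrls("spark://h1:7077,h2:7077") yields the two URLs spark://h1:7077 and spark://h2:7077, which is why a high-availability cluster with several Masters can be addressed with one string.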

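What is SparkDeploySchedulerBackend?
In standalone mode, SparkDeploySchedulerBackend is responsible for acquiring and scheduling cluster resources. It holds an AppClient, and the AppClient's start method registers an RPC endpoint and obtains an RpcEndpointRef for communication. (The RPC layer can be backed by Akka or by Netty, but Spark 1.5.1 only contains the Akka-based implementation, not a Netty-based one.)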

Once createTaskScheduler has returned, the scheduler's start method is called. The start method of TaskSchedulerImpl looks like this:

override def start() {
  // The main job of this method is to call the backend's start method
  backend.start()

  if (!isLocal && conf.getBoolean("spark.speculation", false)) {
    logInfo("Starting speculative execution thread")
    speculationScheduler.scheduleAtFixedRate(new Runnable {
      override def run(): Unit = Utils.tryOrStopSparkContext(sc) {
        checkSpeculatableTasks()
      }
    }, SPECULATION_INTERVAL_MS, SPECULATION_INTERVAL_MS, TimeUnit.MILLISECONDS)
  }
}
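
The wiring seen so far (initialize injects the backend into the scheduler, and the scheduler's start then delegates to the backend) can be sketched in isolation. This is only an illustration of the call order: Backend, RecordingBackend, and Scheduler are hypothetical stand-ins, not Spark classes.

```scala
import scala.collection.mutable

object SchedulerWiringSketch {
  // Hypothetical stand-in for SchedulerBackend.
  trait Backend { def start(): Unit }

  // A backend that only records that it was started.
  class RecordingBackend(log: mutable.Buffer[String]) extends Backend {
    override def start(): Unit = { log += "backend.start" }
  }

  // Hypothetical stand-in for TaskSchedulerImpl.
  class Scheduler(log: mutable.Buffer[String]) {
    private var backend: Option[Backend] = None

    // Mirrors scheduler.initialize(backend): the backend is injected after construction.
    def initialize(b: Backend): Unit = {
      backend = Some(b)
      log += "scheduler.initialize"
    }

    // Mirrors start(): the scheduler's first action is to start its backend.
    def start(): Unit = {
      log += "scheduler.start"
      backend
        .getOrElse(throw new IllegalStateException("initialize() must be called before start()"))
        .start()
    }
  }

  /** Runs the initialize-then-start sequence and returns the recorded call order. */
  def run(): List[String] = {
    val log = mutable.ArrayBuffer.empty[String]
    val scheduler = new Scheduler(log)
    scheduler.initialize(new RecordingBackend(log))
    scheduler.start()
    log.toList
  }
}
```

Injecting the backend through initialize rather than through the constructor is what lets the backend hold a reference to the scheduler while the scheduler also holds a reference to the backend.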

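As we can see, TaskSchedulerImpl.start mainly calls the start method of the backend that was registered with the TaskSchedulerImpl. Below is the start method of SparkDeploySchedulerBackend: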

override def start() {
  super.start()
  // Initialize parameters such as JVM options and classpath entries (omitted here)
  // Build the ApplicationDescription
  val appDesc = new ApplicationDescription(sc.appName, maxCores, sc.executorMemory,
    command, appUIAddress, sc.eventLogDir, sc.eventLogCodec, coresPerExecutor)
  // Create an AppClient from this configuration
  client = new AppClient(sc.env.rpcEnv, masters, appDesc, this, conf)
  client.start()
  waitForRegistration()
}

Here we can see that the backend holds an AppClient, which is the object that communicates with the other components. Let's look at its start method:

def start() {
  // Just launch an rpcEndpoint; it will call back into the listener.
  // AkkaRpcEnv extends the abstract class RpcEnv; its setupEndpoint creates an actor
  // for communication. Spark 1.5.1 has no Netty implementation yet.
  endpoint = rpcEnv.setupEndpoint("AppClient", new ClientEndpoint(rpcEnv))
}

The rpcEnv here has the abstract type RpcEnv, so we look at the setupEndpoint method of its implementation class, AkkaRpcEnv:

override def setupEndpoint(name: String, endpoint: RpcEndpoint): RpcEndpointRef = {
  lazy val actorRef = actorSystem.actorOf(Props(new Actor with ActorLogReceive with Logging {
    // Akka calls preStart once the actor's constructor has finished,
    // which means the endpoint's onStart method runs next
    override def preStart(): Unit = {
      // Listen for remote client network events
      context.system.eventStream.subscribe(self, classOf[AssociationEvent])
      safelyCall(endpoint) {
        endpoint.onStart()
      }
    }
    // ... (rest of the actor omitted in this excerpt)
  }))
  // ... (rest of the method omitted in this excerpt)
}
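
The key point of this excerpt is the lifecycle hook: registering an endpoint causes its onStart to run. The sketch below shows that contract without Akka. ToyRpcEnv and Endpoint are hypothetical, and the hook is invoked synchronously here, whereas in the excerpt it is Akka that invokes preStart when the actor starts.

```scala
import scala.collection.mutable
import scala.util.control.NonFatal

object EndpointLifecycleSketch {
  // Hypothetical stand-in for RpcEndpoint: only the start hook is modelled.
  trait Endpoint { def onStart(): Unit }

  // Hypothetical, single-threaded stand-in for an RpcEnv.
  class ToyRpcEnv {
    private val endpoints = mutable.Map.empty[String, Endpoint]

    /** Registers the endpoint under `name`, then runs its start hook.
      * Returns the name the endpoint was registered under. */
    def setupEndpoint(name: String, endpoint: Endpoint): String = {
      endpoints(name) = endpoint
      // Plays the role of preStart: a failing hook is reported but does not
      // propagate to the caller (a simplification of safelyCall in the excerpt).
      try endpoint.onStart()
      catch { case NonFatal(e) => println(s"onStart failed for $name: $e") }
      name
    }
  }

  /** Registers an endpoint whose onStart flips a flag and reports whether the hook ran. */
  def run(): Boolean = {
    var started = false
    val env = new ToyRpcEnv
    env.setupEndpoint("AppClient", new Endpoint {
      override def onStart(): Unit = { started = true }
    })
    started
  }
}
```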

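In this method an actor is created, and the actor's preStart method calls the endpoint's onStart method. The endpoint is the second argument of rpcEnv.setupEndpoint("AppClient", new ClientEndpoint(rpcEnv)), that is, ClientEndpoint, a private inner class of AppClient. Its onStart method is as follows: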

override def onStart(): Unit = {
  try {
    // Once the actor has been initialized, we can register with the Master
    registerWithMaster(1)
  } catch {
    case e: Exception =>
      logWarning("Failed to connect to master", e)
      markDisconnected()
      stop()
  }
}

onStart calls registerWithMaster, which is where the application registers itself with the Master. Its source code is:

private def registerWithMaster(nthRetry: Int) {
  // A cluster may have several Masters, so try to register with every one of them
  registerMasterFutures = tryRegisterAllMasters()
  // Check again at a fixed interval until registration succeeds or the retry limit is reached
  /**
   * Parameters of ScheduledExecutorService.scheduleAtFixedRate:
   * @param command      the task to run periodically
   * @param initialDelay the delay before the first run
   * @param period       the time between the start of one run and the start of the next
   *                     (start to start, not end to start); a run that overruns its period
   *                     delays later runs, and runs of the same task never overlap
   * @param unit         the time unit of initialDelay and period
   */
  registrationRetryTimer = registrationRetryThread.scheduleAtFixedRate(new Runnable {
    override def run(): Unit = {
      Utils.tryOrExit {
        if (registered) {
          registerMasterFutures.foreach(_.cancel(true))
          registerMasterThreadPool.shutdownNow()
        } else if (nthRetry >= REGISTRATION_RETRIES) {
          markDead("All masters are unresponsive! Giving up.")
        } else {
          registerMasterFutures.foreach(_.cancel(true))
          registerWithMaster(nthRetry + 1)
        }
      }
    }
  }, REGISTRATION_TIMEOUT_SECONDS, REGISTRATION_TIMEOUT_SECONDS, TimeUnit.SECONDS)
}
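
The retry logic above combines a periodic timer with an attempt counter. The following is a simplified sketch of that idea rather than a copy of Spark's logic: registerWithRetry and its parameters are hypothetical, every attempt (including the first) runs inside the timer task, and a single timer is reused instead of calling the method recursively.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import java.util.concurrent.atomic.{AtomicInteger, AtomicReference}
import scala.util.control.NonFatal

object RegistrationRetrySketch {
  /** Calls `tryRegister(n)` for n = 1, 2, ... once per `periodMs` until it returns
    * true or `maxAttempts` attempts have been made.
    * Returns Some(n) if the n-th attempt succeeded, None if every attempt failed. */
  def registerWithRetry(maxAttempts: Int, periodMs: Long)(tryRegister: Int => Boolean): Option[Int] = {
    val timer = Executors.newSingleThreadScheduledExecutor()
    val attempts = new AtomicInteger(0)
    val done = new CountDownLatch(1)
    val result = new AtomicReference[Option[Int]](None)

    val task = new Runnable {
      override def run(): Unit = {
        // Skip runs that fire after the outcome is known but before shutdown.
        if (done.getCount > 0) {
          val n = attempts.incrementAndGet()
          // An exception escaping the task would silently cancel all later runs,
          // so a throwing attempt is treated as a failed attempt.
          val ok = try tryRegister(n) catch { case NonFatal(_) => false }
          if (ok) {
            result.set(Some(n))
            done.countDown()
          } else if (n >= maxAttempts) {
            done.countDown() // give up, as markDead does in the excerpt
          }
        }
      }
    }

    try {
      // Start-to-start period; runs of the same task never overlap.
      timer.scheduleAtFixedRate(task, 0L, periodMs, TimeUnit.MILLISECONDS)
      done.await()
    } finally {
      timer.shutdownNow()
    }
    result.get()
  }
}
```

For example, RegistrationRetrySketch.registerWithRetry(5, 10L)(n => n == 3) stops at the third attempt. Catching exceptions inside the task matters for the same reason the excerpt wraps its body in Utils.tryOrExit: scheduleAtFixedRate suppresses all further runs once one run throws.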

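registerWithMaster first tries to register with every Master and then sets up a timed task that retries the registration. Below is the source code of tryRegisterAllMasters: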

private def tryRegisterAllMasters(): Array[JFuture[_]] = {
  for (masterAddress <- masterRpcAddresses) yield {
    registerMasterThreadPool.submit(new Runnable {
      override def run(): Unit = try {
        if (registered) {
          return
        }
        logInfo("Connecting to master " + masterAddress.toSparkURL + "...")
        val masterRef =
          rpcEnv.setupEndpointRef(Master.SYSTEM_NAME, masterAddress, Master.ENDPOINT_NAME)
        // Send a RegisterApplication message to the Master
        masterRef.send(RegisterApplication(appDescription, self))
      } catch {
        case ie: InterruptedException => // Cancelled
        case NonFatal(e) => logWarning(s"Failed to connect to master $masterAddress", e)
      }
    })
  }
}

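Each of these tasks obtains a reference to one Master and sends it a RegisterApplication message. The Master handles that message with the following code: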

override def receive: PartialFunction[Any, Unit] = {
  case RegisterApplication(description, driver) => {
    // If this Master is in STANDBY state, do nothing
    if (state == RecoveryState.STANDBY) {
      // ignore, don't send response
    } else {
      logInfo("Registering app " + description.name)
      // Build an ApplicationInfo from the description that was sent over
      val app = createApplication(description, driver)
      // Register the ApplicationInfo:
      //   1. add it to the Master's in-memory collections
      //   2. add it to the queue of waiting applications
      registerApplication(app)
      logInfo("Registered app " + description.name + " with ID " + app.id)
      // Persist the ApplicationInfo
      persistenceEngine.addApplication(app)
      // Send a RegisteredApplication message back to the driver
      driver.send(RegisteredApplication(app.id, self))
      // Schedule resources for the waiting applications
      schedule()
    }
  }
}

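From this we can see that driver is the self that was passed in masterRef.send(RegisterApplication(appDescription, self)). That self refers to the ClientEndpoint, which is the endpoint used for communication, so the RegisteredApplication reply goes back to the AppClient.
After the application has been added to the waiting queue, persisted, and acknowledged, the Master calls schedule() to perform scheduling.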
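
The Master-side handling can be reduced to a small model: a standby Master silently drops the request, while an active Master queues the application and replies to the sender. Everything in this sketch is hypothetical (ToyMaster, the message shape, and the app ID format); persistence and the actual scheduling step are left out.

```scala
import scala.collection.mutable

object MasterReceiveSketch {
  sealed trait RecoveryState
  case object Standby extends RecoveryState
  case object Alive extends RecoveryState

  // Hypothetical message: `reply` stands in for the driver endpoint reference,
  // and calling it models driver.send(RegisteredApplication(...)).
  final case class RegisterApplication(name: String, reply: String => Unit)

  class ToyMaster(val state: RecoveryState) {
    private var nextId = 0
    val waitingApps = mutable.ArrayBuffer.empty[String]

    def receive: PartialFunction[Any, Unit] = {
      case RegisterApplication(name, reply) =>
        if (state == Standby) {
          // A standby Master ignores the request and sends no response, so the
          // client only succeeds with the active Master (or retries after a timeout).
        } else {
          val appId = f"app-$nextId%04d" // made-up ID format for this sketch
          nextId += 1
          waitingApps += appId // step 2 of registerApplication in the excerpt
          reply(appId)         // acknowledge the registration to the sender
        }
    }
  }
}
```

Because the standby branch sends nothing back, the client cannot tell a standby Master from an unreachable one. That is why the client side sends the request to every Master and relies on a timeout-driven retry.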
