31 Spark Resource Scheduling
Contents:
1. Allocating the Driver (Cluster mode)
2. Allocating resources for an Application
3. Two different resource allocation strategies
4. Thoughts on Spark resource allocation
I. Task scheduling vs. resource scheduling
1. Task scheduling
Task scheduling refers to how a TaskSet gets run, via the DAGScheduler, TaskScheduler, SchedulerBackend, and so on.
2. Resource scheduling
Resource scheduling refers to how an application obtains its resources.
3. The relationship between the two
Task scheduling happens on top of resource scheduling; without resource scheduling, task scheduling would be water without a source: it could never run at all.
II. Resource scheduling
1. The Master is responsible for resource scheduling
Because the Master is responsible for resource management and scheduling, the scheduling method, schedule(), lives in Master.scala. schedule() is invoked whenever a registered application or the available resources change. For example, when an application registers:
case RegisterApplication(description, driver) => {
  // TODO Prevent repeated registrations from some driver
  if (state == RecoveryState.STANDBY) {
    // ignore, don't send response
  } else {
    logInfo("Registering app " + description.name)
    val app = createApplication(description, driver)
    registerApplication(app)
    logInfo("Registered app " + description.name + " with ID " + app.id)
    persistenceEngine.addApplication(app)
    driver.send(RegisteredApplication(app.id, self))
    schedule()
  }
}
2. When schedule() is invoked
schedule() is invoked every time a new application is submitted or the cluster's resource situation changes (an Executor is added or removed, a Worker joins or leaves, and so on).
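For reference, in the era of Master.scala these snippets appear to come from, schedule() is invoked for example from the RegisterWorker and RegisterApplication handlers, from ExecutorStateChanged and DriverStateChanged processing, and after completeRecovery(); the exact call sites vary slightly across Spark versions.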
3. Conditions for the Master to schedule
Only a Master in the ALIVE state can perform resource scheduling; if it is not ALIVE, schedule() returns immediately. In other words, a standby Master never performs resource scheduling for Applications.
private def schedule(): Unit = {
  if (state != RecoveryState.ALIVE) { return }
  // Drivers take strict precedence over executors
  val shuffledWorkers = Random.shuffle(workers) // Randomization helps balance drivers
  for (worker <- shuffledWorkers if worker.state == WorkerState.ALIVE) {
    for (driver <- waitingDrivers) {
      if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) {
        launchDriver(worker, driver)
        waitingDrivers -= driver
      }
    }
  }
  startExecutorsOnWorkers()
}
4. Processing the Worker information
Random.shuffle randomly reorders the Master's cached information about all Workers in the cluster (the shuffledWorkers line in schedule() above). Internally, the algorithm walks the buffer and randomly swaps the workers' positions in the Master's cached data structure (a Fisher-Yates shuffle):
/** Returns a new collection of the same type in a randomly chosen order.
 *
 *  @return the shuffled collection
 */
def shuffle[T, CC[X] <: TraversableOnce[X]](xs: CC[T])(implicit bf: CanBuildFrom[CC[T], T, CC[T]]): CC[T] = {
  val buf = new ArrayBuffer[T] ++= xs

  def swap(i1: Int, i2: Int) {
    val tmp = buf(i1)
    buf(i1) = buf(i2)
    buf(i2) = tmp
  }

  for (n <- buf.length to 2 by -1) {
    val k = nextInt(n)
    swap(n - 1, k)
  }

  (bf(xs) ++= buf).result
}
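For illustration, here is a minimal, runnable sketch of what Random.shuffle does to a worker list; the worker names are hypothetical:

import scala.util.Random

object ShuffleDemo {
  def main(args: Array[String]): Unit = {
    // A stand-in for the Master's cached worker list; names are made up.
    val workers = Seq("worker-1", "worker-2", "worker-3", "worker-4", "worker-5")

    // Each call returns a new, randomly ordered collection of the same type,
    // so no single worker is systematically favored when placing Drivers.
    println(Random.shuffle(workers)) // e.g. List(worker-3, worker-1, worker-5, ...)
    println(Random.shuffle(workers)) // a (very likely) different order
  }
}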
5. Checking each Worker's state
Next, schedule() considers only the Workers that are in the ALIVE state; only ALIVE Workers may take part in resource allocation (the worker.state == WorkerState.ALIVE filter in schedule() above).
6. Submitting an application in Cluster mode
When spark-submit runs with the Driver in Cluster mode, the Driver is added to the waitingDrivers list. Each DriverInfo data structure carries a DriverDescription, which records, among other things, the memory and cores the Driver requires of a Worker.
private[deploy] class DriverInfo(
    val startTime: Long,
    val id: String,
    val desc: DriverDescription,
    val submitDate: Date)
  extends Serializable {
  ……
}

private[deploy] case class DriverDescription(
    jarUrl: String,
    mem: Int,
    cores: Int,
    supervise: Boolean,
    command: Command) {

  override def toString: String = s"DriverDescription (${command.mainClass})"
}
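As a minimal sketch of the fit check that schedule() applies to each waiting Driver, assuming simplified stand-ins for WorkerInfo and DriverDescription (these case classes are not the real Spark types):

// Simplified stand-ins for WorkerInfo and DriverDescription.
case class Worker(id: String, memoryFree: Int, coresFree: Int)
case class DriverReq(mem: Int, cores: Int)

object DriverPlacementDemo {
  def main(args: Array[String]): Unit = {
    val workers = Seq(
      Worker("worker-1", memoryFree = 512,  coresFree = 1),
      Worker("worker-2", memoryFree = 4096, coresFree = 4))
    val driver = DriverReq(mem = 1024, cores = 2)

    // Same predicate as in schedule(): the first (shuffled) ALIVE worker
    // with enough free memory and cores gets the Driver.
    val chosen = workers.find(w =>
      w.memoryFree >= driver.mem && w.coresFree >= driver.cores)

    println(chosen) // Some(Worker(worker-2,4096,4))
  }
}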
If one of the randomly shuffled Workers satisfies these requirements, it is chosen to launch the Driver (the waitingDrivers loop in schedule() above). The Master then sends a command to that remote Worker to start the Driver:
launchDriver(worker, driver)
private def launchDriver(worker: WorkerInfo, driver: DriverInfo) {
  logInfo("Launching driver " + driver.id + " on worker " + worker.id)
  worker.addDriver(driver)
  driver.worker = Some(worker)
  worker.endpoint.send(LaunchDriver(driver.id, driver.desc))
  driver.state = DriverState.RUNNING
}
7. Resource scheduling proper begins
Only once the Driver has been launched does all subsequent resource scheduling take place.
8. The resource scheduling mode (Standalone)
By default, Spark launches Executors for applications in FIFO order: all submitted applications sit in a scheduling queue, first in, first out, and resources are allocated to the next application only on top of the allocations already granted to the applications ahead of it in the queue. A toy sketch of this policy follows.
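This sketch is self-contained; the app names and core counts are made up, and the real scheduler allocates per Worker (as the code in section 9 shows) rather than from one global pool:

object FifoDemo {
  def main(args: Array[String]): Unit = {
    // (app name, cores requested), in submission order; values are illustrative.
    val queue = Seq(("app-1", 6), ("app-2", 4), ("app-3", 8))
    var free = 12 // total free cores in the cluster

    // FIFO: earlier apps are offered resources first; later apps get what is left.
    for ((name, requested) <- queue) {
      val granted = math.min(requested, free)
      free -= granted
      println(s"$name requested $requested, granted $granted, $free cores left")
    }
    // app-1 requested 6, granted 6, 6 cores left
    // app-2 requested 4, granted 4, 2 cores left
    // app-3 requested 8, granted 2, 0 cores left
  }
}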
9. The Executor and core allocation process
(1) Check whether the application's resource needs are already met
First, before allocating Executors to an application, the Master checks whether the application still needs cores; if it does not, no Executors are allocated to it.
(2) Allocation conditions
Before Executors are actually allocated, a Worker must be in the ALIVE state and must satisfy the Application's per-Executor memory and core requirements. The qualifying Workers are then sorted by free cores, from most to fewest, to produce the usableWorkers collection (Master.scala):
private def startExecutorsOnWorkers(): Unit = {
  // Right now this is a very simple FIFO scheduler. We keep trying to fit in the first app
  // in the queue, then the second app, etc.
  for (app <- waitingApps if app.coresLeft > 0) {
    val coresPerExecutor: Option[Int] = app.desc.coresPerExecutor
    val usableWorkers = workers.toArray.filter(_.state == WorkerState.ALIVE)
      .filter(worker => worker.memoryFree >= app.desc.memoryPerExecutorMB &&
        worker.coresFree >= coresPerExecutor.getOrElse(1))
      .sortBy(_.coresFree).reverse
    val assignedCores = scheduleExecutorsOnWorkers(app, usableWorkers, spreadOutApps)

    // Now that we've decided how many cores to allocate on each worker, let's allocate them
    for (pos <- 0 until usableWorkers.length if assignedCores(pos) > 0) {
      allocateWorkerResourceToExecutors(
        app, assignedCores(pos), coresPerExecutor, usableWorkers(pos))
    }
  }
}
In the FIFO case, spreadOutApps defaults to true, which makes an application run on as many of the nodes as possible:
// As a temporary workaround before better ways of configuring memory, we allow users to set
// a flag that will perform round-robin scheduling across the nodes (spreading out each app
// among all the nodes) instead of trying to consolidate each app onto a small # of nodes.
private val spreadOutApps = conf.getBoolean("spark.deploy.spreadOut", true)
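Since spreadOutApps is read from an ordinary configuration entry, the spread-out behavior can be turned off on the standalone Master, for example with this line in its spark-defaults.conf:

spark.deploy.spreadOut   false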
(3) Allocation strategies
There are two ways to allocate Executors to an application. The first spreads the Executors across as many of the cluster's Workers as possible; this tends to bring better potential data locality and favors concurrent computation within the application. (The second, per the comment above, consolidates the application onto a small number of nodes.)
When actually allocating cores on the cluster, Spark satisfies the request as far as it can, hence the minimum below:
var coresToAssign = math.min(app.coresLeft, usableWorkers.map(_.coresFree).sum)
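For example, if the application still needs 10 cores (app.coresLeft = 10) but the usable Workers together have only 3 + 4 = 7 cores free, then coresToAssign is 7; since the app stays in waitingApps while app.coresLeft > 0, the remaining 3 cores can be granted on a later schedule() pass once resources free up.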
If only one Executor can be allocated per Worker for the current application (oneExecutorPerWorker), then each iteration assigns one more core to that single Executor; otherwise each iteration assigns cores to a new Executor.
/**
 * Schedule executors to be launched on the workers.
 * Returns an array containing number of cores assigned to each worker……
 */
private def scheduleExecutorsOnWorkers(
    app: ApplicationInfo,
    usableWorkers: Array[WorkerInfo],
    spreadOutApps: Boolean): Array[Int] = {
  ……
  // If we are launching one executor per worker, then every iteration assigns 1 core
  // to the executor. Otherwise, every iteration assigns cores to a new executor.
  if (oneExecutorPerWorker) {
    assignedExecutors(pos) = 1
  } else {
    assignedExecutors(pos) += 1
  }
  ……
}
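The effect of spreadOutApps on the core assignment loop can be illustrated with a rough, self-contained sketch. This is a simplified re-implementation for illustration only (it ignores memory limits and per-executor core counts), not the real scheduleExecutorsOnWorkers:

object SpreadOutDemo {
  // freeCores: free cores per usable worker (sorted descending, as in Master.scala).
  // Returns the number of cores assigned on each worker.
  def assign(freeCores: Array[Int], coresNeeded: Int, spreadOut: Boolean): Array[Int] = {
    val assigned = Array.fill(freeCores.length)(0)
    var toAssign = math.min(coresNeeded, freeCores.sum)
    var pos = 0
    while (toAssign > 0) {
      if (freeCores(pos) - assigned(pos) > 0) { // this worker still has a free core
        assigned(pos) += 1
        toAssign -= 1
      }
      // spreadOut: move to the next worker after every core (round-robin);
      // otherwise keep filling the current worker until it is exhausted.
      if (spreadOut || freeCores(pos) - assigned(pos) == 0) {
        pos = (pos + 1) % freeCores.length
      }
    }
    assigned
  }

  def main(args: Array[String]): Unit = {
    val free = Array(4, 4, 4)
    println(assign(free, 6, spreadOut = true).mkString(","))  // 2,2,2
    println(assign(free, 6, spreadOut = false).mkString(",")) // 4,2,0
  }
}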
(4) The Master notifies the Worker
Once the Executor assignments for the current application are ready, the Master sends a command over the network to each chosen Worker, telling it to start the ExecutorBackend process that will in turn host the Executor:

private def startExecutorsOnWorkers(): Unit = {
  ……
  // Now that we've decided how many cores to allocate on each worker, let's allocate them
  for (pos <- 0 until usableWorkers.length if assignedCores(pos) > 0) {
    allocateWorkerResourceToExecutors(
      app, assignedCores(pos), coresPerExecutor, usableWorkers(pos))
  }
}

private def allocateWorkerResourceToExecutors(……) {
  ……
  for (i <- 1 to numExecutors) {
    val exec = app.addExecutor(worker, coresToAssign)
    launchExecutor(worker, exec)
    app.state = ApplicationState.RUNNING
  }
}

private def launchExecutor(worker: WorkerInfo, exec: ExecutorDesc): Unit = {
  logInfo("Launching executor " + exec.fullId + " on worker " + worker.id)
  worker.addExecutor(exec)
  worker.endpoint.send(LaunchExecutor(masterUrl,
    exec.application.id, exec.id, exec.application.desc, exec.cores, exec.memory))
  exec.application.driver.send(
    ExecutorAdded(exec.id, worker.id, worker.hostPort, exec.cores, exec.memory))
}
(5) Notifying the Driver
As the code above shows, the Master finally sends the application's Driver an ExecutorAdded message.
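On the Worker side, the LaunchExecutor message causes an ExecutorBackend process (CoarseGrainedExecutorBackend in the standalone deployment) to be started, and that process registers itself back with the Driver. Only then can the task scheduling described in section I begin dispatching Tasks to the new Executor.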