Lesson 17: Principles of Dynamic Resource Allocation and Dynamic Consumption-Rate Control in Spark Streaming

Source: Internet | Editor: 程序博客网 | Time: 2024/05/16 05:14

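Advanced features:

1. Dynamic resource allocation in Spark Streaming

2. Dynamic control of the consumption rate in Spark Streaming

This lesson analyzes the principles: there is a body of theory behind dynamic rate control, and another behind dynamic resource allocation.

The theory comes first, followed by discussion. As an aside, Heron may come to replace Storm.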

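Why do we need dynamic resource allocation and dynamic rate control?

By default Spark allocates resources first and computes afterwards. This is coarse-grained allocation: the resources are reserved up front, before the computing tasks run.

The drawback: from Spark Streaming's point of view, the load has peaks and troughs, and whether the allocation is sized with the peak or the trough in mind, a large amount of resources ends up wasted.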

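In fact, Spark Streaming originally borrowed design ideas from Storm and was built on that basis. The kernel of Spark Streaming 2.0.x has changed considerably, and the framework's greatest advantage is that it works hand in hand with its sibling frameworks. If we size Spark Streaming's resources for the peak, the pre-allocated resources are wasted, and the waste is especially large during troughs.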

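Spark Streaming is itself built on Spark Core, and the heart of Spark Core is the SparkContext object. Starting at line 556 of the SparkContext class, the code supports dynamic resource allocation.

The source is as follows: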

SparkContext

// Optionally scale number of executors dynamically based on workload. Exposed for testing.
val dynamicAllocationEnabled = Utils.isDynamicAllocationEnabled(_conf)
if (!dynamicAllocationEnabled && _conf.getBoolean("spark.dynamicAllocation.enabled", false)) {
  logWarning("Dynamic Allocation and num executors both set, thus dynamic allocation disabled.")
}

_executorAllocationManager =
  if (dynamicAllocationEnabled) {
    Some(new ExecutorAllocationManager(this, listenerBus, _conf))
  } else {
    None
  }
_executorAllocationManager.foreach(_.start())

// Utils.isDynamicAllocationEnabled, as called from SparkContext
def isDynamicAllocationEnabled(conf: SparkConf): Boolean = {
  conf.getBoolean("spark.dynamicAllocation.enabled", false) &&
    conf.getInt("spark.executor.instances", 0) == 0
}
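The enable check can be tried in isolation. The following is a minimal sketch, assuming a plain `Map[String, String]` stands in for `SparkConf`; the object `DynamicAllocationCheck` and its helper methods are hypothetical names used only for illustration, not part of Spark.

```scala
// Hypothetical stand-in for the check in Utils.isDynamicAllocationEnabled.
// A plain Map replaces SparkConf so the sketch runs without Spark on the classpath.
object DynamicAllocationCheck {
  private def getBoolean(conf: Map[String, String], key: String, default: Boolean): Boolean =
    conf.get(key).map(_.toBoolean).getOrElse(default)

  private def getInt(conf: Map[String, String], key: String, default: Int): Int =
    conf.get(key).map(_.toInt).getOrElse(default)

  // Dynamic allocation is active only if it is switched on AND
  // no fixed executor count has been requested.
  def isDynamicAllocationEnabled(conf: Map[String, String]): Boolean =
    getBoolean(conf, "spark.dynamicAllocation.enabled", default = false) &&
      getInt(conf, "spark.executor.instances", default = 0) == 0
}
```

This mirrors why the warning in SparkContext fires: setting both `spark.dynamicAllocation.enabled` and a non-zero `spark.executor.instances` leaves dynamic allocation disabled.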

ExecutorAllocationManager

// Listener for Spark events that impact the allocation policy
private val listener = new ExecutorAllocationListener

// Executor that handles the scheduling task.
private val executor =
  ThreadUtils.newDaemonSingleThreadScheduledExecutor("spark-dynamic-executor-allocation")

// Metric source for ExecutorAllocationManager to expose internal status to MetricsSystem.
val executorAllocationManagerSource = new ExecutorAllocationManagerSource

Master

case RegisterApplication(description, driver) => {
  // If this master's state is STANDBY, i.e. it is a standby master rather than the active one,
  // it does nothing when an application asks to register.
  if (state == RecoveryState.STANDBY) {
    // ignore, don't send response
  } else {
    logInfo("Registering app " + description.name)
    // Create an ApplicationInfo from the ApplicationDescription.
    val app = createApplication(description, driver)
    // Register the application: cache the ApplicationInfo and add the application
    // to waitingApps, the queue of applications waiting to be scheduled.
    registerApplication(app)
    logInfo("Registered app " + description.name + " with ID " + app.id)
    // Persist the ApplicationInfo through the persistence engine.
    persistenceEngine.addApplication(app)
    // Reply in the other direction: send RegisteredApplication to the ClientActor
    // of the AppClient inside SparkDeploySchedulerBackend.
    driver.send(RegisteredApplication(app.id, self))
    schedule()
  }
}

/**
 * Schedule the currently available resources among waiting apps. This method will be called
 * every time a new app joins or resource availability changes.
 */
private def schedule(): Unit = {
  // A master that is not ALIVE returns immediately: a standby master never schedules
  // applications or other resources.
  if (state != RecoveryState.ALIVE) { return }
  // Drivers take strict precedence over executors
  // Random.shuffle returns the elements of the collection passed in, in random order.
  // All previously registered workers are shuffled here; the loop below then keeps
  // only the workers whose state is ALIVE.
  val shuffledWorkers = Random.shuffle(workers) // Randomization helps balance drivers
  for (worker <- shuffledWorkers if worker.state == WorkerState.ALIVE) {
    for (driver <- waitingDrivers) {
      // Launch the driver if the worker's free memory is at least what the driver needs
      // and its free CPU cores are at least what the driver needs.
      if (worker.memoryFree >= driver.desc.mem && worker.coresFree >= driver.desc.cores) {
        launchDriver(worker, driver)
        // Remove the driver from the waitingDrivers queue.
        waitingDrivers -= driver
      }
    }
  }
  startExecutorsOnWorkers()
}
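The driver-placement loop can be sketched without a cluster. This is a simplified illustration, not the Master's actual bookkeeping: the `Worker` and `Driver` case classes and the `DriverPlacement` object are hypothetical, and deducting memory and cores directly stands in for whatever `launchDriver` records on the real worker.

```scala
import scala.util.Random

// Hypothetical, simplified models; the real Master tracks far more state.
final case class Worker(id: String, alive: Boolean, var memoryFree: Int, var coresFree: Int)
final case class Driver(id: String, mem: Int, cores: Int)

object DriverPlacement {
  // Returns driverId -> workerId for every waiting driver that could be placed.
  def place(workers: Seq[Worker], waiting: Seq[Driver], rng: Random = new Random): Map[String, String] = {
    var pending = waiting
    var placed = Map.empty[String, String]
    // Shuffle first so drivers spread across workers, then skip workers that are not alive.
    for (worker <- rng.shuffle(workers) if worker.alive) {
      for (driver <- pending) {
        if (worker.memoryFree >= driver.mem && worker.coresFree >= driver.cores) {
          // Stand-in for launchDriver: reserve the resources on the chosen worker.
          worker.memoryFree -= driver.mem
          worker.coresFree -= driver.cores
          placed += (driver.id -> worker.id)
          // Stand-in for `waitingDrivers -= driver`.
          pending = pending.filterNot(_ == driver)
        }
      }
    }
    placed
  }
}
```

A driver that fits on no live worker simply stays unplaced, which corresponds to it remaining in the waiting queue until the next call to `schedule()`.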

ExecutorAllocationManager

// Lower and upper bounds on the number of executors.
private val minNumExecutors = conf.getInt("spark.dynamicAllocation.minExecutors", 0)
private val maxNumExecutors = conf.getInt("spark.dynamicAllocation.maxExecutors",
  Integer.MAX_VALUE)

// How long there must be backlogged tasks for before an addition is triggered (seconds)
private val schedulerBacklogTimeoutS = conf.getTimeAsSeconds(
  "spark.dynamicAllocation.schedulerBacklogTimeout", "1s")

// Same as above, but used only after `schedulerBacklogTimeoutS` is exceeded
private val sustainedSchedulerBacklogTimeoutS = conf.getTimeAsSeconds(
  "spark.dynamicAllocation.sustainedSchedulerBacklogTimeout", s"${schedulerBacklogTimeoutS}s")

// How long an executor must be idle for before it is removed (seconds)
private val executorIdleTimeoutS = conf.getTimeAsSeconds(
  "spark.dynamicAllocation.executorIdleTimeout", "60s")

private val cachedExecutorIdleTimeoutS = conf.getTimeAsSeconds(
  "spark.dynamicAllocation.cachedExecutorIdleTimeout", s"${Integer.MAX_VALUE}s")

// During testing, the methods to actually kill and add executors are mocked out
private val testing = conf.getBoolean("spark.dynamicAllocation.testing", false)

// TODO: The default value of 1 for spark.executor.cores works right now because dynamic
// allocation is only supported for YARN and the default number of cores per executor in YARN is
// 1, but it might need to be attained differently for different cluster managers
private val tasksPerExecutor =
  conf.getInt("spark.executor.cores", 1) / conf.getInt("spark.task.cpus", 1)

validateSettings()
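To see how these settings could interact, here is a small sketch of turning a task backlog into an executor target. The excerpt does not show how the manager computes its target, so the ceiling-division formula and the names `ExecutorTarget`, `tasksPerExecutor` and `target` are assumptions made for illustration.

```scala
// Hypothetical helper; not the ExecutorAllocationManager implementation.
object ExecutorTarget {
  // Tasks that fit on one executor at once: executor cores divided by cores per task.
  def tasksPerExecutor(executorCores: Int, taskCpus: Int): Int = executorCores / taskCpus

  // Executors needed to run all pending and running tasks at the same time
  // (ceiling division), clamped to the configured [minExecutors, maxExecutors] range.
  def target(pendingOrRunningTasks: Int, perExecutor: Int, minExecutors: Int, maxExecutors: Int): Int = {
    val needed = (pendingOrRunningTasks + perExecutor - 1) / perExecutor
    math.max(minExecutors, math.min(maxExecutors, needed))
  }
}
```

For example, with 4 cores per executor and 2 CPUs per task, each executor runs 2 tasks at once, so 11 outstanding tasks call for 6 executors unless `spark.dynamicAllocation.maxExecutors` caps the number lower.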

start

/**
 * Register for scheduler callbacks to decide when to add and remove executors, and start
 * the scheduling task.
 */
def start(): Unit = {
  listenerBus.addListener(listener)

  val scheduleTask = new Runnable() {
    override def run(): Unit = {
      try {
        schedule()
      } catch {
        case ct: ControlThrowable =>
          throw ct
        case t: Throwable =>
          logWarning(s"Uncaught exception in thread ${Thread.currentThread().getName}", t)
      }
    }
  }
  executor.scheduleAtFixedRate(scheduleTask, 0, intervalMillis, TimeUnit.MILLISECONDS)
}
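The periodic-task pattern used by `start()` can be reproduced with the JDK alone. The sketch below is an illustration under stated assumptions: `FixedRateDemo` and `runTimes` are hypothetical names, a non-daemon JDK executor replaces Spark's `ThreadUtils` helper, and a latch is added only so the demo can stop after a fixed number of runs.

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}

object FixedRateDemo {
  // Runs `body` every `intervalMillis` ms until it has run `times` times.
  // Returns true if that happened within five seconds.
  def runTimes(times: Int, intervalMillis: Long)(body: () => Unit): Boolean = {
    val executor = Executors.newSingleThreadScheduledExecutor()
    val latch = new CountDownLatch(times)
    val task = new Runnable {
      override def run(): Unit =
        try body()
        catch {
          // Swallow and report: an exception escaping run() would cancel all later runs.
          case e: Exception => println(s"Uncaught exception in periodic task: $e")
        }
        finally latch.countDown()
    }
    executor.scheduleAtFixedRate(task, 0, intervalMillis, TimeUnit.MILLISECONDS)
    try latch.await(5, TimeUnit.SECONDS)
    finally executor.shutdownNow()
  }
}
```

The try/catch inside `run()` matters for the same reason it does in `start()`: `scheduleAtFixedRate` suppresses all subsequent executions once one execution throws, so a single failed `schedule()` call would otherwise stop the allocation loop for good.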

schedule

private def schedule(): Unit = synchronized {
  val now = clock.getTimeMillis

  updateAndSyncNumExecutorsTarget(now)

  removeTimes.retain { case (executorId, expireTime) =>
    val expired = now >= expireTime
    if (expired) {
      initializing = false
      removeExecutor(executorId)
    }
    !expired
  }
}
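The expiry sweep over `removeTimes` boils down to splitting a map by timestamp. Here is a minimal sketch, assuming an immutable `Map` instead of the mutable one that `retain` edits in place; `IdleExpiry` and `sweep` are hypothetical names.

```scala
// Hypothetical helper illustrating the idle-executor sweep; not Spark code.
object IdleExpiry {
  // removeTimes maps executorId -> time (ms) at which the executor counts as idle-expired.
  // Returns (ids to remove now, entries that are still being tracked).
  def sweep(removeTimes: Map[String, Long], now: Long): (Set[String], Map[String, Long]) = {
    val (expired, remaining) = removeTimes.partition { case (_, expireTime) => now >= expireTime }
    (expired.keySet, remaining)
  }
}
```

In the original method the same decision is made entry by entry: an expired executor is handed to `removeExecutor`, and `!expired` tells `retain` to drop it from the map.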

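Configuration

spark.streaming.backpressure.enabled controls the flow rate by balancing the speed at which data flows in against the time it takes to compute. Setting it to true is recommended.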
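As a configuration fragment, the setting could be applied when building the `SparkConf`. This is a sketch: the object name, the application name and the numeric cap are illustrative assumptions, and `spark.streaming.receiver.maxRate` is an optional companion setting that the text above does not mention.

```scala
import org.apache.spark.SparkConf

// Illustrative settings only; the app name and the rate value are assumptions.
object BackpressureConf {
  val conf: SparkConf = new SparkConf()
    .setAppName("BackpressureSketch")                      // hypothetical application name
    .set("spark.streaming.backpressure.enabled", "true")   // adapt the ingestion rate to processing speed
    .set("spark.streaming.receiver.maxRate", "1000")       // optional cap, records per second per receiver
}
```

The same keys can also be passed on the command line with `--conf key=value` when submitting the application.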

0 0