Day 30: The Master's Registration Mechanism and State Management Demystified

Source: Internet  Published by: 实体零售业大数据  Editor: 程序博客网  Time: 2024/05/23 22:23

The following notes are compiled from DT大数据梦工厂 (DT Big Data Dream Factory); Sina Weibo: www.weibo.com/ilovepains/

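1. The Master accepts Driver registration

2. The Master accepts Application registration

3. Inside how the Master accepts Worker registration

4. The Master handles Driver state changes

5. The Master handles Executor state changes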

I. How the Master handles registration from other components

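1. The Master mainly accepts registrations from three kinds of objects: Drivers, Applications, and Workers. Executors do not register with the Master; an Executor registers with the SchedulerBackend inside the Driver.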

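2. A Worker registers with the Master on its own initiative after it starts. So when a new Worker is added to a Spark cluster that is already running in production, the cluster does not need to be restarted: the new Worker can be used right away to increase processing capacity.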

 private val addressToWorker = new HashMap[RpcAddress, WorkerInfo]

case RegisterWorker(
    id, workerHost, workerPort, workerRef, cores, memory, workerUiPort, publicAddress) => {
  logInfo("Registering worker %s:%d with %d cores, %s RAM".format(
    workerHost, workerPort, cores, Utils.megabytesToString(memory)))
  if (state == RecoveryState.STANDBY) {
    context.reply(MasterInStandby)
  } else if (idToWorker.contains(id)) {
    context.reply(RegisterWorkerFailed("Duplicate worker ID"))
  } else {
    val worker = new WorkerInfo(id, workerHost, workerPort, cores, memory,
      workerRef, workerUiPort, publicAddress)
    if (registerWorker(worker)) {
      persistenceEngine.addWorker(worker)
      context.reply(RegisteredWorker(self, masterWebUiUrl))
      schedule()
    } else {
      val workerAddress = worker.endpoint.address
      logWarning("Worker registration failed. Attempted to re-register worker at same " +
        "address: " + workerAddress)
      context.reply(RegisterWorkerFailed("Attempted to re-register worker at same address: "
        + workerAddress))
    }
  }
}

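3. When the Master receives a Worker's registration request, it first checks whether the Master on the current node is in STANDBY mode. If it is, the request is not processed and the Master only replies MasterInStandby. Next, it checks whether its in-memory map idToWorker already holds a registration for this Worker ID. If it does, the Worker is not registered again and the Master replies RegisterWorkerFailed("Duplicate worker ID").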
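The two checks in step 3 can be reproduced in a small standalone model. This is only a sketch for illustration: `MiniMaster`, `handleRegisterWorker`, and the simplified `WorkerInfo` and reply types are hypothetical stand-ins, not Spark's classes, and the real handler also persists the worker and calls `schedule()`.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins for Spark's internal types (illustration only).
object RegistrationSketch {
  sealed trait MasterState
  case object Alive extends MasterState
  case object Standby extends MasterState

  sealed trait Reply
  case object MasterInStandby extends Reply
  final case class RegisterWorkerFailed(reason: String) extends Reply
  case object RegisteredWorker extends Reply

  final case class WorkerInfo(id: String, host: String, port: Int, cores: Int, memoryMb: Int)

  final class MiniMaster(var state: MasterState) {
    // In-memory index of registered workers, keyed by worker ID.
    val idToWorker = mutable.HashMap.empty[String, WorkerInfo]

    def handleRegisterWorker(w: WorkerInfo): Reply =
      if (state == Standby) {
        // A standby master does not process registrations.
        MasterInStandby
      } else if (idToWorker.contains(w.id)) {
        // The same worker ID is never registered twice.
        RegisterWorkerFailed("Duplicate worker ID")
      } else {
        idToWorker(w.id) = w
        RegisteredWorker
      }
  }
}
```

Sending the same `WorkerInfo` three times (once while the master is `Standby`, twice after it becomes `Alive`) yields `MasterInStandby`, `RegisteredWorker`, and `RegisterWorkerFailed("Duplicate worker ID")` in that order.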

private def registerWorker(worker: WorkerInfo): Boolean = {
  // There may be one or more refs to dead workers on this same node (w/ different ID's),
  // remove them.
  workers.filter { w =>
    (w.host == worker.host && w.port == worker.port) && (w.state == WorkerState.DEAD)
  }.foreach { w =>
    workers -= w
  }

  val workerAddress = worker.endpoint.address
  if (addressToWorker.contains(workerAddress)) {
    val oldWorker = addressToWorker(workerAddress)
    if (oldWorker.state == WorkerState.UNKNOWN) {
      // A worker registering from UNKNOWN implies that the worker was restarted during recovery.
      // The old worker must thus be dead, so we will remove it and accept the new worker.
      removeWorker(oldWorker)
    } else {
      logInfo("Attempted to re-register worker at same address: " + workerAddress)
      return false
    }
  }

  // Record the accepted worker in the Master's in-memory data structures.
  workers += worker
  idToWorker(worker.id) = worker
  addressToWorker(workerAddress) = worker
  true
}


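4. If the Master decides to accept the registering Worker, it first creates a WorkerInfo object to hold the Worker's registration information.

It then calls registerWorker to carry out the actual registration. Entries for the same host and port whose state is DEAD are filtered out directly. If the address is already registered and the old entry is in the UNKNOWN state, removeWorker is called to clean it up (including the Executors and Drivers under that Worker).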
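The bookkeeping in step 4 can be sketched without any Spark dependency. The model below is a hedged illustration: `Registry` and `Worker` are hypothetical names, addresses are plain `(host, port)` pairs rather than `RpcAddress`, and `removeWorker` here only updates the maps (cleanup of Executors and Drivers is left out).

```scala
import scala.collection.mutable

// Hypothetical, dependency-free model of the Master's worker bookkeeping.
object RegisterWorkerSketch {
  object WorkerState extends Enumeration {
    val ALIVE, DEAD, UNKNOWN = Value
  }

  final class Worker(val id: String, val host: String, val port: Int) {
    var state: WorkerState.Value = WorkerState.ALIVE
    def address: (String, Int) = (host, port)
  }

  final class Registry {
    val workers = mutable.HashSet.empty[Worker]
    val idToWorker = mutable.HashMap.empty[String, Worker]
    val addressToWorker = mutable.HashMap.empty[(String, Int), Worker]

    // Assumption: mark the worker DEAD and drop it from both lookup maps only.
    def removeWorker(w: Worker): Unit = {
      w.state = WorkerState.DEAD
      idToWorker -= w.id
      addressToWorker -= w.address
    }

    def registerWorker(w: Worker): Boolean = {
      // Filter out stale DEAD entries that share the new worker's host and port.
      workers.filter(o => o.address == w.address && o.state == WorkerState.DEAD)
        .foreach(workers -= _)

      val accepted = addressToWorker.get(w.address) match {
        case Some(old) if old.state == WorkerState.UNKNOWN =>
          // The old worker was restarted during recovery: replace it.
          removeWorker(old)
          true
        case Some(_) => false // address already taken by a known worker
        case None => true
      }
      if (accepted) {
        workers += w
        idToWorker(w.id) = w
        addressToWorker(w.address) = w
      }
      accepted
    }
  }
}
```

In this model, a second worker on an occupied address is rejected while the first is `ALIVE`, and accepted once the first has been marked `UNKNOWN`; the replaced entry lingers in `workers` as `DEAD` until the next registration from that address filters it out.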

5. During registration, the Driver is registered first, and then




Master -- Worker -- Thread -- RegisterWorker -- WorkerInfo

