ControlledShutdownRequest Analysis


As we saw earlier, when a broker crashes, the registered BrokerChangeListener watches for changes to the children of "/brokers/ids" in ZooKeeper and handles the broker-failure scenario.

In some scenarios, however, users want to proactively shut down a normally running broker, for example to replace hardware, upgrade the system, or change Kafka configuration. The crash-handling path above is a poor fit for this, so Kafka provides a Controlled Shutdown mechanism for stopping a broker instance. Its benefits are:

# The log files are fully synced to disk, so the broker does not need to run log recovery the next time it comes online

# Before the broker is shut down, the leader replicas hosted on it are migrated away, reducing the time partitions are unavailable

KafkaController.shutdownBroker is the core of ControlledShutdownRequest handling. It uses ControlledShutdownLeaderSelector to re-elect the leader replica and the ISR list, thereby migrating leadership off the broker. ControlledShutdownLeaderSelector essentially does the following: it removes the shutting-down replica from the current ISR, keeps the remaining replicas as the new ISR, elects a leader from that new ISR, and then sends LeaderAndIsrRequest to the available replicas.
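The selection rule described above can be sketched as a small standalone function. This is an illustrative sketch, not Kafka's actual ControlledShutdownLeaderSelector: the object and method names here are made up, and details such as the tie-breaking order are assumed (preferring the replica-assignment order, as Kafka's selectors generally do):

```scala
// Sketch of the controlled-shutdown leader selection idea: drop the
// shutting-down broker from the ISR, keep the rest as the new ISR, and pick
// a new leader from the surviving ISR (here: first match in assignment order).
object ControlledShutdownSelectorSketch {
  // Returns (newLeaderOpt, newIsr) after excluding the shutting-down broker.
  def selectNewLeaderAndIsr(assignedReplicas: Seq[Int],
                            currentIsr: Seq[Int],
                            shuttingDownBroker: Int): (Option[Int], Seq[Int]) = {
    val newIsr = currentIsr.filter(_ != shuttingDownBroker)
    // Prefer the replica order from the assignment, restricted to the new ISR.
    val newLeader = assignedReplicas.find(newIsr.contains)
    (newLeader, newIsr)
  }
}
```

For a partition assigned to brokers 1, 2, 3 with ISR = (1, 2, 3), shutting down broker 1 yields leader 2 and ISR (2, 3). If no replica survives in the new ISR, no leader can be elected and the partition becomes unavailable.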

def shutdownBroker(id: Int): Set[TopicAndPartition] = {
  if (!isActive()) {
    throw new ControllerMovedException("Controller moved to another broker. Aborting controlled shutdown")
  }

  controllerContext.brokerShutdownLock synchronized {
    info("Shutting down broker " + id)
    inLock(controllerContext.controllerLock) {
      if (!controllerContext.liveOrShuttingDownBrokerIds.contains(id))
        throw new BrokerNotAvailableException("Broker id %d does not exist.".format(id))
      controllerContext.shuttingDownBrokerIds.add(id)
      debug("All shutting down brokers: " + controllerContext.shuttingDownBrokerIds.mkString(","))
      debug("Live brokers: " + controllerContext.liveBrokerIds.mkString(","))
    }
    // Collect the partitions on the shutting-down broker, with their replication factors
    val allPartitionsAndReplicationFactorOnBroker: Set[(TopicAndPartition, Int)] =
      inLock(controllerContext.controllerLock) {
        controllerContext.partitionsOnBroker(id)
          .map(topicAndPartition => (topicAndPartition, controllerContext.partitionReplicaAssignment(topicAndPartition).size))
      }
    // Iterate over these partition/replica entries
    allPartitionsAndReplicationFactorOnBroker.foreach {
      case (topicAndPartition, replicationFactor) =>
        inLock(controllerContext.controllerLock) {
          // Look up the current leader of each partition
          controllerContext.partitionLeadershipInfo.get(topicAndPartition).foreach { currLeaderIsrAndControllerEpoch =>
            if (replicationFactor > 1) { // only partitions with replication enabled
              // Check whether the leader is on the shutting-down broker
              if (currLeaderIsrAndControllerEpoch.leaderAndIsr.leader == id) {
                // Transition the partition to OnlinePartition. This re-elects the
                // leader and ISR via controlledShutdownPartitionLeaderSelector,
                // writes the result to ZooKeeper, and then sends
                // LeaderAndIsrRequest/UpdateMetadataRequest
                partitionStateMachine.handleStateChanges(Set(topicAndPartition), OnlinePartition,
                  controlledShutdownPartitionLeaderSelector)
              } else {
                // Send StopReplicaRequest (without deleting the replica)
                try {
                  brokerRequestBatch.newBatch()
                  brokerRequestBatch.addStopReplicaRequestForBrokers(Seq(id), topicAndPartition.topic,
                    topicAndPartition.partition, deletePartition = false)
                  brokerRequestBatch.sendRequestsToBrokers(epoch)
                } catch {
                  case e: IllegalStateException =>
                    // Resign if the controller is in an illegal state
                    error("Forcing the controller to resign")
                    brokerRequestBatch.clear()
                    controllerElector.resign()
                    throw e
                }
                // Transition the replica to the OfflineReplica state
                replicaStateMachine.handleStateChanges(Set(PartitionAndReplica(topicAndPartition.topic,
                  topicAndPartition.partition, id)), OfflineReplica)
              }
            }
          }
        }
    }
    // Partitions (with replication factor > 1) that this broker still leads
    def replicatedPartitionsBrokerLeads() = inLock(controllerContext.controllerLock) {
      trace("All leaders = " + controllerContext.partitionLeadershipInfo.mkString(","))
      controllerContext.partitionLeadershipInfo.filter {
        case (topicAndPartition, leaderIsrAndControllerEpoch) =>
          leaderIsrAndControllerEpoch.leaderAndIsr.leader == id &&
            controllerContext.partitionReplicaAssignment(topicAndPartition).size > 1
      }.keys
    }
    replicatedPartitionsBrokerLeads().toSet
  }
}
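Note that shutdownBroker returns the set of replicated partitions the broker still leads. The broker side uses this return value to decide whether to retry: Kafka repeats the controlled-shutdown attempt (bounded by controlled.shutdown.max.retries, with a backoff between attempts) until the returned set is empty. The following is a hypothetical sketch of that retry shape; retryUntilDrained and attemptShutdown are illustrative stand-ins, not Kafka's actual method names:

```scala
// Sketch of the broker-side retry loop around controlled shutdown:
// keep asking the controller to drain leadership until no replicated
// partition is led by this broker, up to a maximum number of attempts.
object ControlledShutdownRetrySketch {
  // attemptShutdown stands in for sending a ControlledShutdownRequest and
  // receiving back the set of partitions this broker still leads.
  def retryUntilDrained(maxRetries: Int)(attemptShutdown: () => Set[String]): Boolean = {
    var remaining = attemptShutdown()
    var attempts = 1
    while (remaining.nonEmpty && attempts < maxRetries) {
      // A real implementation would sleep for a retry backoff here.
      remaining = attemptShutdown()
      attempts += 1
    }
    remaining.isEmpty // true => safe to proceed with the actual shutdown
  }
}
```

If the set never drains within the retry budget, the broker gives up on the controlled path and shuts down anyway, falling back to the ordinary failure-handling flow described at the top of this article.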

 
