DelayedFetch Analysis

Source: Internet · Editor: 程序博客网 · Date: 2024/06/05 18:02

DelayedFetch is the delayed operation corresponding to a FetchRequest, and it works much like DelayedProduce. The flow is: a request from a consumer or from another follower replica arrives and is handled by KafkaApis#handleFetchRequest, which calls ReplicaManager's fetchMessages method to read messages from the corresponding Log; if the request cannot be answered immediately, a DelayedFetch is created and added to delayedFetchPurgatory for later completion.
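Both DelayedFetch and DelayedProduce follow Kafka's delayed-operation (purgatory) pattern. Below is a minimal toy sketch in Java of that pattern: class and method names mirror Kafka's `tryCompleteElseWatch` / `checkAndComplete` semantics but are simplified stand-ins, not Kafka's actual API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A delayed operation is either completed immediately or parked and retried
// when one of its watched keys is poked by a later event (e.g. a log append).
abstract class DelayedOperation {
    private boolean completed = false;

    // Subclasses check whether the completion criteria are now satisfied.
    abstract boolean tryComplete();

    // Runs exactly once when the operation completes (e.g. send the response).
    abstract void onComplete();

    final boolean forceComplete() {
        if (completed) return false;
        completed = true;
        onComplete();
        return true;
    }

    final boolean isCompleted() { return completed; }
}

class Purgatory<K> {
    private final Map<K, List<DelayedOperation>> watchers = new HashMap<>();

    // Mirrors tryCompleteElseWatch: complete now if possible, otherwise
    // register the operation under every key and park it.
    boolean tryCompleteElseWatch(DelayedOperation op, List<K> keys) {
        if (op.tryComplete())
            return op.forceComplete();
        for (K key : keys)
            watchers.computeIfAbsent(key, k -> new ArrayList<>()).add(op);
        return false;
    }

    // Mirrors checkAndComplete: an event on `key` retries every operation
    // watching that key and completes the ones that are now satisfied.
    int checkAndComplete(K key) {
        List<DelayedOperation> ops = watchers.getOrDefault(key, new ArrayList<>());
        int done = 0;
        for (DelayedOperation op : ops)
            if (!op.isCompleted() && op.tryComplete() && op.forceComplete())
                done++;
        ops.removeIf(DelayedOperation::isCompleted);
        return done;
    }
}
```

In the sketch, a fetch that cannot yet read `fetchMinBytes` of data is parked under its `(topic, partition)` keys, and a later produce to one of those partitions calls `checkAndComplete` to wake it up.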

Let's first look at the logic of ReplicaManager#fetchMessages:

```scala
def fetchMessages(timeout: Long,
                  replicaId: Int,
                  fetchMinBytes: Int,
                  fetchMaxBytes: Int,
                  hardMaxBytesLimit: Boolean,
                  fetchInfos: Seq[(TopicAndPartition, PartitionFetchInfo)],
                  quota: ReplicaQuota = UnboundedQuota,
                  responseCallback: Seq[(TopicAndPartition, FetchResponsePartitionData)] => Unit) {
  val isFromFollower = replicaId >= 0
  val fetchOnlyFromLeader: Boolean = replicaId != Request.DebuggingConsumerId
  val fetchOnlyCommitted: Boolean = !Request.isValidBrokerId(replicaId)

  // Read messages from the local log
  val logReadResults = readFromLocalLog(
    replicaId = replicaId,
    fetchOnlyFromLeader = fetchOnlyFromLeader,
    readOnlyCommitted = fetchOnlyCommitted,
    fetchMaxBytes = fetchMaxBytes,
    hardMaxBytesLimit = hardMaxBytesLimit,
    readPartitionInfo = fetchInfos,
    quota = quota)

  // If the fetch request comes from a follower, update the follower's LEO
  if (Request.isValidBrokerId(replicaId))
    /*
     * Main logic:
     * 1. The leader maintains per-follower replica state; update this follower's state, e.g. its LEO
     * 2. Check whether the ISR needs to be expanded; if the ISR changes, record the change in ZooKeeper
     * 3. Check whether the HighWatermark can be advanced
     * 4. Check the DelayedProduce operations under the related keys in delayedProducePurgatory and complete those that are now satisfied
     */
    updateFollowerLogReadResults(replicaId, logReadResults)

  val logReadResultValues = logReadResults.map { case (_, v) => v }
  // Total number of bytes read from the log
  val bytesReadable = logReadResultValues.map(_.info.messageSet.sizeInBytes).sum
  // Check whether any read returned an error
  val errorReadingData = logReadResultValues.foldLeft(false) ((errorIncurred, readResult) =>
    errorIncurred || (readResult.errorCode != Errors.NONE.code))

  /*
   * The FetchResponse can be returned immediately if any of the following holds:
   * 1. The request does not want to wait (timeout <= 0)
   * 2. The FetchRequest specifies no partitions to read
   * 3. Enough data has already been read
   * 4. An error occurred while reading data (errorReadingData)
   */
  if (timeout <= 0 || fetchInfos.isEmpty || bytesReadable >= fetchMinBytes || errorReadingData) {
    val fetchPartitionData = logReadResults.map { case (tp, result) =>
      tp -> FetchResponsePartitionData(result.errorCode, result.hw, result.info.messageSet)
    }
    // Invoke the response callback directly
    responseCallback(fetchPartitionData)
  } else {
    // Wrap the read results into per-partition fetch status
    val fetchPartitionStatus = logReadResults.map { case (topicAndPartition, result) =>
      val fetchInfo = fetchInfos.collectFirst {
        case (tp, v) if tp == topicAndPartition => v
      }.getOrElse(sys.error(s"Partition $topicAndPartition not found in fetchInfos"))
      (topicAndPartition, FetchPartitionStatus(result.info.fetchOffsetMetadata, fetchInfo))
    }
    // Build the FetchMetadata object
    val fetchMetadata = FetchMetadata(fetchMinBytes, fetchMaxBytes, hardMaxBytesLimit, fetchOnlyFromLeader,
      fetchOnlyCommitted, isFromFollower, replicaId, fetchPartitionStatus)
    // Build a DelayedFetch operation
    val delayedFetch = new DelayedFetch(timeout, fetchMetadata, this, quota, responseCallback)

    // Create a list of (topic, partition) pairs as the keys of the delayed fetch operation
    val delayedFetchKeys = fetchPartitionStatus.map { case (tp, _) => new TopicPartitionOperationKey(tp) }
    // Try to complete the request immediately; otherwise put it into the purgatory
    delayedFetchPurgatory.tryCompleteElseWatch(delayedFetch, delayedFetchKeys)
  }
}
```

1. Relationship between DelayedProduce and DelayedFetch

While handling a ProduceRequest, appending data to the Log may advance the leader's log end offset (LEO), so follower replicas may now be able to read enough data; the broker therefore tries to complete pending DelayedFetch operations.

Conversely, handling a FetchRequest from a follower may advance the HW, so the broker tries to complete pending DelayedProduce operations.
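This mutual wake-up between the two purgatories can be sketched as a toy model. All names below (`handleProduce`, `handleFollowerFetch`, the offset fields) are illustrative assumptions, and each parked operation is reduced to just the offset it is waiting for:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model: offsets are plain longs; one leader replica, one follower.
class CrossCompletion {
    long leo = 0;  // leader's log end offset
    long hw = 0;   // high watermark

    // DelayedFetch operations waiting for the LEO to reach a target offset.
    final Deque<Long> delayedFetchMinLeo = new ArrayDeque<>();
    // DelayedProduce operations waiting for the HW to reach a target offset.
    final Deque<Long> delayedProduceMinHw = new ArrayDeque<>();

    int completedFetches = 0;
    int completedProduces = 0;

    // Handling a ProduceRequest appends to the log and advances the LEO,
    // so the broker re-checks pending DelayedFetch operations.
    void handleProduce(int messages) {
        leo += messages;
        delayedFetchMinLeo.removeIf(target -> {
            if (leo >= target) { completedFetches++; return true; }
            return false;
        });
    }

    // Handling a follower FetchRequest may advance the HW (the minimum LEO
    // across the ISR), so the broker re-checks pending DelayedProduce operations.
    void handleFollowerFetch(long followerLeo) {
        hw = Math.min(leo, followerLeo);
        delayedProduceMinHw.removeIf(target -> {
            if (hw >= target) { completedProduces++; return true; }
            return false;
        });
    }
}
```

A produce wakes parked fetches once the LEO passes their target; a follower fetch reporting its LEO advances the HW and wakes parked produces.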

 

2. Core fields

delayMs: the delay timeout of this delayed operation

fetchMetadata: FetchMetadata records the state of all partitions involved in the FetchRequest, and is mainly used to decide whether the DelayedFetch is ready to execute

responseCallback: the callback invoked in DelayedFetch#onComplete when the operation is satisfied or expires; its main job is to create the FetchResponse and add it to the RequestChannel's responseQueue
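What responseCallback does on completion can be sketched as follows. This is a hedged toy stand-in (all class names here are hypothetical, not Kafka's): wrap the per-partition read results into a response object and enqueue it for the network layer to send.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

// Hypothetical stand-in for Kafka's per-partition fetch result.
class PartitionData {
    final String topicPartition;
    final short errorCode;
    final long hw;       // high watermark returned to the fetcher
    final int bytes;     // number of bytes read for this partition

    PartitionData(String topicPartition, short errorCode, long hw, int bytes) {
        this.topicPartition = topicPartition;
        this.errorCode = errorCode;
        this.hw = hw;
        this.bytes = bytes;
    }
}

// Hypothetical stand-in for the FetchResponse.
class FetchResponseSketch {
    final List<PartitionData> partitions;
    FetchResponseSketch(List<PartitionData> partitions) { this.partitions = partitions; }
}

class ResponseQueueSketch {
    // Toy stand-in for the RequestChannel's responseQueue.
    final BlockingQueue<FetchResponseSketch> responseQueue = new LinkedBlockingQueue<>();

    // The callback handed to DelayedFetch: on completion (or expiration) it
    // wraps the per-partition results into a response and enqueues it for
    // the network layer to send back to the client.
    Consumer<List<PartitionData>> responseCallback() {
        return partitions -> responseQueue.offer(new FetchResponseSketch(partitions));
    }
}
```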

 

3. Key methods

DelayedFetch's tryComplete method checks the execution conditions of the DelayedFetch and forces completion when any of the following is met:

# Case A: leadership has migrated and this broker is no longer the leader of the partition

# Case B: this broker can no longer find the partition it needs to read from

# Case C: the fetch offset is no longer on the current activeSegment, which can happen after a log truncation or after a log roll created a new LogSegment

# Case D: the accumulated number of readable bytes exceeds fetchMinBytes

```scala
override def tryComplete(): Boolean = {
  var accumulatedSize = 0           // accumulated readable bytes
  var accumulatedThrottledSize = 0  // accumulated bytes on throttled partitions
  // Iterate over the status of every partition in FetchMetadata
  fetchMetadata.fetchPartitionStatus.foreach { case (topicAndPartition, fetchStatus) =>
    // The end position of the previous log read
    val fetchOffset = fetchStatus.startOffsetMetadata
    try {
      if (fetchOffset != LogOffsetMetadata.UnknownOffsetMetadata) {
        // Look up the local leader replica of the partition
        val replica = replicaManager.getLeaderReplicaIfLocal(topicAndPartition.topic, topicAndPartition.partition)
        // Choose endOffset (where the read may end) by the origin of the FetchRequest:
        // consumers read up to the HW, followers read up to the LEO
        val endOffset =
          if (fetchMetadata.fetchOnlyCommitted)
            replica.highWatermark
          else
            replica.logEndOffset
        // If endOffset has not moved since the last read, we still cannot read
        // enough data and the operation remains unsatisfied; if it has moved,
        // continue with the checks below
        if (endOffset.messageOffset != fetchOffset.messageOffset) {
          if (endOffset.onOlderSegment(fetchOffset)) {
            // endOffset moved backwards onto a segment with a smaller baseOffset,
            // which may mean the leader's log has been truncated
            debug("Satisfying fetch %s since it is fetching later segments of partition %s.".format(fetchMetadata, topicAndPartition))
            return forceComplete()
          } else if (fetchOffset.onOlderSegment(endOffset)) {
            // fetchOffset is on an older LogSegment while endOffset is on a newer
            // one, i.e. a new activeSegment has been rolled
            debug("Satisfying fetch %s immediately since it is fetching older segments.".format(fetchMetadata))
            // We will not force complete the fetch request if a replica should be throttled.
            if (!replicaManager.shouldLeaderThrottle(quota, topicAndPartition, fetchMetadata.replicaId))
              return forceComplete()
          } else if (fetchOffset.messageOffset < endOffset.messageOffset) {
            // fetchOffset and endOffset are on the same LogSegment and endOffset
            // has advanced, so accumulate the number of readable bytes
            val bytesAvailable = math.min(endOffset.positionDiff(fetchOffset), fetchStatus.fetchInfo.fetchSize)
            if (quota.isThrottled(topicAndPartition))
              accumulatedThrottledSize += bytesAvailable
            else
              accumulatedSize += bytesAvailable
          }
        }
      }
    } catch {
      case utpe: UnknownTopicOrPartitionException => // Case B
        debug("Broker no longer knows of %s, satisfy %s immediately".format(topicAndPartition, fetchMetadata))
        return forceComplete()
      case nle: NotLeaderForPartitionException =>   // Case A
        debug("Broker is no longer the leader of %s, satisfy %s immediately".format(topicAndPartition, fetchMetadata))
        return forceComplete()
    }
  }
  // Case D: enough bytes have accumulated, call forceComplete
  if (accumulatedSize >= fetchMetadata.fetchMinBytes
    || ((accumulatedSize + accumulatedThrottledSize) >= fetchMetadata.fetchMinBytes && !quota.isQuotaExceeded()))
    forceComplete()
  else
    false
}
```

 
