Kafka's Threading Model, Part 2
Source: Internet. Editor: 程序博客网. Time: 2024/06/06 03:24
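The previous article introduced four kinds of Kafka threads:
The acceptor thread, which accepts new TCP connections and hands them to the network threads.
The network threads, which handle network communication with clients and with other brokers.
The disk I/O threads, which write producer data to disk and read consumer data back from it.
The scheduler threads, which periodically flush to disk, merge data, and update index files.
This article introduces the ExpirationReaper threads and the executor-Fetch thread that works alongside them.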
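In the kafka.server package you can find four classes: DelayedFetch, DelayedProduce, DelayedCreateTopics, and DelayedDeleteTopics.
What are these classes for? Some operations in Kafka cannot, and need not, return synchronously, so they need a mechanism that fails them once a timeout expires. For example: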
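DelayedFetch: anyone who has used kafka-client knows that you configure a batch size when consuming a topic. Kafka tries to deliver messages in batches. Answering the subscribed consumer as soon as a producer writes any single record would be inappropriate and inefficient. So Kafka returns data to the consumer when either of two conditions is met: 1. the accumulated messages reach a certain size in bytes; 2. the FetchRequest has gone unanswered for longer than a time threshold. This keeps consumption efficient.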
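DelayedProduce: anyone who has used kafka-client also knows that when producing to a topic you choose whether an ack is required. There are three modes: no ack at all, an ack once the partition leader has stored the record, or an ack only after every replica in the ISR has stored it. In the last mode, the broker cannot know when the other brokers in the ISR will have stored the message, and cannot even guarantee that they will store it. So Kafka returns the ack to the producer when either of two conditions is met: 1. every replica in the ISR has stored the data; 2. some replica in the ISR has failed to store the data within a time limit.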
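DelayedCreateTopics: Kafka assigns a leader to each partition according to its rules, but the cluster leader (the controller) cannot guarantee that every broker designated as a partition leader actually becomes the leader as instructed.
DelayedDeleteTopics: the same reasoning applies. The cluster leader cannot guarantee that the partition leaders on the other brokers delete the data successfully.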
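The completion rules described above for DelayedFetch and DelayedProduce share one shape: answer as soon as a condition holds, give up once the timeout passes, otherwise keep waiting. The following is a minimal Java sketch of that shape, not Kafka's code. The class DelayedResponseSketch, its method names, and the sample numbers are invented for illustration. The byte threshold and the wait limit play the roles that the consumer settings fetch.min.bytes and fetch.max.wait.ms play in Kafka.

```java
import java.util.HashMap;
import java.util.Map;

final class DelayedResponseSketch {
    enum Outcome { COMPLETE, EXPIRE, KEEP_WAITING }

    /** Shared rule: complete when the condition holds, expire on timeout, otherwise wait. */
    static Outcome decide(boolean conditionMet, long waitedMs, long timeoutMs) {
        if (conditionMet) return Outcome.COMPLETE;
        if (waitedMs >= timeoutMs) return Outcome.EXPIRE;
        return Outcome.KEEP_WAITING;
    }

    /** Fetch: enough bytes have accumulated to make a worthwhile batch. */
    static boolean fetchReady(long accumulatedBytes, long minBytes) {
        return accumulatedBytes >= minBytes;
    }

    /** Produce with acks from the whole ISR: every in-sync replica has stored the record. */
    static boolean produceReady(Map<Integer, Long> lastStoredOffsetByReplica, long requiredOffset) {
        for (long stored : lastStoredOffsetByReplica.values()) {
            if (stored < requiredOffset) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Map<Integer, Long> isr = new HashMap<>();
        isr.put(1, 42L); // hypothetical broker 1 has stored offset 42
        isr.put(2, 41L); // hypothetical broker 2 is still one record behind
        // Fetch: only 100 bytes available after 10 ms, so the request stays parked.
        System.out.println(decide(fetchReady(100, 65536), 10, 500));
        // Produce: one replica never caught up and the time limit was reached.
        System.out.println(decide(produceReady(isr, 42), 30_000, 30_000));
    }
}
```

The point of separating decide from the two predicates is that the timeout handling is identical for every delayed operation; only the readiness check differs.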
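From these cases we can see that Kafka needs an efficient timer mechanism to handle such timed tasks. Fetch and produce are arguably the most frequent requests in Kafka, so a naive approach, such as one thread per request that calls sleep(timeout), would exhaust system resources very quickly. Kafka's solution to this problem is the timing wheel.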
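To make the idea concrete, here is a deliberately simplified single-level timing wheel in Java. It is a sketch under stated assumptions, not Kafka's implementation: the class SimpleTimingWheel and its methods are invented for illustration, delays are measured in abstract ticks, and delays longer than one rotation are simply rejected instead of being pushed to a coarser wheel. What it does show is why the structure is cheap: adding a timer and expiring a bucket are both constant-time array operations, and a single thread can drive every timer.

```java
import java.util.ArrayList;
import java.util.List;

final class SimpleTimingWheel {
    private final List<List<Runnable>> buckets = new ArrayList<>();
    private final int wheelSize;
    private long currentTick = 0;

    SimpleTimingWheel(int wheelSize) {
        this.wheelSize = wheelSize;
        for (int i = 0; i < wheelSize; i++) buckets.add(new ArrayList<>());
    }

    /** Schedules a task delayTicks from now; only delays of 1..wheelSize ticks fit on this wheel. */
    boolean add(Runnable task, int delayTicks) {
        if (delayTicks < 1 || delayTicks > wheelSize) return false;
        int slot = (int) ((currentTick + delayTicks) % wheelSize);
        buckets.get(slot).add(task);
        return true;
    }

    /** Advances the wheel by one tick, runs the tasks in the bucket now due, returns how many ran. */
    int advance() {
        currentTick++;
        List<Runnable> bucket = buckets.get((int) (currentTick % wheelSize));
        List<Runnable> due = new ArrayList<>(bucket); // copy so tasks may re-schedule safely
        bucket.clear();
        for (Runnable task : due) task.run();
        return due.size();
    }

    public static void main(String[] args) {
        SimpleTimingWheel wheel = new SimpleTimingWheel(8);
        wheel.add(() -> System.out.println("request timed out"), 3);
        for (int tick = 1; tick <= 3; tick++) {
            int ran = wheel.advance();
            System.out.println("tick " + tick + ": ran " + ran + " task(s)");
        }
    }
}
```

Compared with one sleeping thread per request, the cost per pending request here is one list entry.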
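Kafka's timing wheel is quite interesting. I had previously read the source of Netty's timing wheel, which is written in a very plain way, so Kafka's version struck me as impressive. I will not go into detail here. A later article will compare the two implementations; this one gives only a brief introduction.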
Kafka's timing wheel uses Java's DelayQueue as a helper. You can picture it as n ExpirationReaper threads blocked in poll on the DelayQueue:

    "ExpirationReaper-12" #53 prio=5 os_prio=0 tid=0x00007fadb4d30000 nid=0x11a7e waiting on condition [0x00007fac1ebee000]
       java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000c88a5e70> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078)
        at java.util.concurrent.DelayQueue.poll(DelayQueue.java:259)
        at kafka.utils.timer.SystemTimer.advanceClock(Timer.scala:106)
        at kafka.server.DelayedOperationPurgatory.advanceClock(DelayedOperation.scala:350)
        at kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper.doWork(DelayedOperation.scala:374)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)

Each time, the DelayQueue hands an expired task to one of these threads, and the thread that wakes up is responsible for:
1. Advancing the tick of the timing wheel.
2. Checking whether the task has been cancelled; if it has not, handing it to the executor-Fetch thread.
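The hand-off between a reaper and a worker can be sketched with plain JDK classes. The sketch below is an assumption-laden simplification, not Kafka's code: ReaperSketch, TimedTask, and reapOnce are invented names, each queue entry is a single task, and the timing-wheel tick is left out. It uses a fixed thread pool of size 1 as the worker. The reaper blocks in DelayQueue.poll, skips cancelled tasks, and passes live ones to the worker so that the reaper itself never runs completion logic.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class ReaperSketch {
    /** A task that becomes available from the DelayQueue once its deadline has passed. */
    static final class TimedTask implements Delayed {
        final long deadlineNanos;
        final Runnable action;
        final AtomicBoolean cancelled = new AtomicBoolean(false);

        TimedTask(long delayMs, Runnable action) {
            this.deadlineNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
            this.action = action;
        }

        @Override public long getDelay(TimeUnit unit) {
            return unit.convert(deadlineNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    /** One reaper step: wait up to timeoutMs for an expired task and hand it to the worker. */
    static boolean reapOnce(DelayQueue<TimedTask> queue, ExecutorService worker, long timeoutMs) {
        TimedTask task;
        try {
            task = queue.poll(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
        if (task == null || task.cancelled.get()) return false; // nothing expired, or cancelled
        worker.execute(task.action);
        return true;
    }

    public static void main(String[] args) {
        DelayQueue<TimedTask> queue = new DelayQueue<>();
        ExecutorService worker = Executors.newFixedThreadPool(1);
        queue.add(new TimedTask(50, () ->
                System.out.println("expired, handled on " + Thread.currentThread().getName())));
        reapOnce(queue, worker, 200);
        worker.shutdown(); // already submitted work still runs before the pool exits
    }
}
```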
executor-Fetch is an executor pool of size 1:

    Executors.newFixedThreadPool(1, new ThreadFactory() {
      def newThread(runnable: Runnable): Thread =
        Utils.newThread("executor-"+executorName, runnable, false)
    })

The executor-Fetch thread blocks on this thread pool's LinkedBlockingQueue, as the following stack trace shows.

    "executor-Fetch" #66 prio=5 os_prio=0 tid=0x00007fabe4002000 nid=0x12144 waiting on condition [0x00007fac1dae1000]
       java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000000c8341680> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1067)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

So what is the executor-Fetch thread responsible for? To answer that, we start with the abstract base class shared by the four classes mentioned above:
    abstract class DelayedOperation(override val delayMs: Long) extends TimerTask with Logging {

      private val completed = new AtomicBoolean(false)

      /*
       * Force completing the delayed operation, if not already completed.
       * This function can be triggered when
       *
       * 1. The operation has been verified to be completable inside tryComplete()
       * 2. The operation has expired and hence needs to be completed right now
       *
       * Return true iff the operation is completed by the caller: note that
       * concurrent threads can try to complete the same operation, but only
       * the first thread will succeed in completing the operation and return
       * true, others will still return false
       */
      def forceComplete(): Boolean = {
        if (completed.compareAndSet(false, true)) {
          // cancel the timeout timer
          cancel()
          onComplete()
          true
        } else {
          false
        }
      }

      /**
       * Check if the delayed operation is already completed
       */
      def isCompleted: Boolean = completed.get()

      /**
       * Call-back to execute when a delayed operation gets expired and hence forced to complete.
       */
      def onExpiration(): Unit

      /**
       * Process for completing an operation; This function needs to be defined
       * in subclasses and will be called exactly once in forceComplete()
       */
      def onComplete(): Unit

      /**
       * Try to complete the delayed operation by first checking if the operation
       * can be completed by now. If yes execute the completion logic by calling
       * forceComplete() and return true iff forceComplete returns true; otherwise return false
       *
       * This function needs to be defined in subclasses
       */
      def tryComplete(): Boolean

      /**
       * Thread-safe variant of tryComplete(). This can be overridden if the operation provides its
       * own synchronization.
       */
      def safeTryComplete(): Boolean = {
        synchronized {
          tryComplete()
        }
      }

      /*
       * run() method defines a task that is executed on timeout
       */
      override def run(): Unit = {
        if (forceComplete())
          onExpiration()
      }
    }

As the comments indicate, a subclass only needs to implement these methods of the parent class with its own logic. That logic boils down to: fill in the response data and put the response back on the network thread's queue. Sending it is then left to the network thread.
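The key guarantee in forceComplete() is that the completion logic runs exactly once even when the request path and the timeout path race. A small Java sketch of the same compare-and-set pattern follows. It is an illustration only: CompleteOnceSketch and its counters are invented here, and the counters stand in for the real onComplete() and onExpiration() callbacks.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

final class CompleteOnceSketch {
    private final AtomicBoolean completed = new AtomicBoolean(false);
    final AtomicInteger onCompleteCalls = new AtomicInteger();
    final AtomicInteger onExpirationCalls = new AtomicInteger();

    /** Only the caller that flips the flag from false to true runs the completion logic. */
    boolean forceComplete() {
        if (completed.compareAndSet(false, true)) {
            onCompleteCalls.incrementAndGet(); // stands in for onComplete()
            return true;
        }
        return false;
    }

    /** What the timer runs on timeout, mirroring run() in the class above. */
    void runOnTimeout() {
        if (forceComplete()) {
            onExpirationCalls.incrementAndGet(); // stands in for onExpiration()
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CompleteOnceSketch op = new CompleteOnceSketch();
        Thread requestPath = new Thread(op::forceComplete); // e.g. enough data arrived
        Thread timerPath = new Thread(op::runOnTimeout);    // e.g. the timeout fired
        requestPath.start();
        timerPath.start();
        requestPath.join();
        timerPath.join();
        // Whichever thread wins the race, the completion logic has run once.
        System.out.println("onComplete calls: " + op.onCompleteCalls.get()); // prints 1
    }
}
```

The expiration callback only fires when the timer wins the race, which matches the `if (forceComplete()) onExpiration()` shape of run().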
Take the most typical example, DelayedFetch, and look at its tryComplete logic:
    override def tryComplete() : Boolean = {
      var accumulatedSize = 0
      var accumulatedThrottledSize = 0
      fetchMetadata.fetchPartitionStatus.foreach {
        case (topicAndPartition, fetchStatus) =>
          val fetchOffset = fetchStatus.startOffsetMetadata
          try {
            if (fetchOffset != LogOffsetMetadata.UnknownOffsetMetadata) {
              val replica = replicaManager.getLeaderReplicaIfLocal(topicAndPartition.topic, topicAndPartition.partition)
              val endOffset =
                if (fetchMetadata.fetchOnlyCommitted)
                  replica.highWatermark
                else
                  replica.logEndOffset

              // Go directly to the check for Case D if the message offsets are the same. If the log segment
              // has just rolled, then the high watermark offset will remain the same but be on the old segment,
              // which would incorrectly be seen as an instance of Case C.
              if (endOffset.messageOffset != fetchOffset.messageOffset) {
                if (endOffset.onOlderSegment(fetchOffset)) {
                  // Case C, this can happen when the new fetch operation is on a truncated leader
                  debug("Satisfying fetch %s since it is fetching later segments of partition %s.".format(fetchMetadata, topicAndPartition))
                  return forceComplete()
                } else if (fetchOffset.onOlderSegment(endOffset)) {
                  // Case C, this can happen when the fetch operation is falling behind the current segment
                  // or the partition has just rolled a new segment
                  debug("Satisfying fetch %s immediately since it is fetching older segments.".format(fetchMetadata))
                  // We will not force complete the fetch request if a replica should be throttled.
                  if (!replicaManager.shouldLeaderThrottle(quota, topicAndPartition, fetchMetadata.replicaId))
                    return forceComplete()
                } else if (fetchOffset.messageOffset < endOffset.messageOffset) {
                  // we take the partition fetch size as upper bound when accumulating the bytes (skip if a throttled partition)
                  val bytesAvailable = math.min(endOffset.positionDiff(fetchOffset), fetchStatus.fetchInfo.fetchSize)
                  if (quota.isThrottled(topicAndPartition))
                    accumulatedThrottledSize += bytesAvailable
                  else
                    accumulatedSize += bytesAvailable
                }
              }
            }
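The completion logic is actually simple: it is enough for any one of four conditions to hold.
1. The broker is no longer the leader of this partition. This rarely happens.
2. The broker does not have the data the consumer requested.
3. The offset the consumer requested is not the latest offset. This is easy to understand: the goal of a delayed fetch is to transfer data in batches for efficiency, and when the consumer is not asking for the newest data, existing data can be returned as a batch right away, with no need to wait for producers to send enough data to reach a threshold.
4. The accumulated size of the available messages, in bytes, reaches a preset threshold.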
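The four conditions can be written as a single ordered check. The Java sketch below is a simplified reading of the list above for one partition, not a port of Kafka's tryComplete: the class, the parameters, and the use of segment numbers to model condition 3 are assumptions made for illustration.

```java
final class FetchCompletionSketch {
    /**
     * Returns the number (1 to 4) of the first condition from the list that holds,
     * or 0 if the fetch should stay parked until more data arrives or it times out.
     */
    static int firstSatisfiedCondition(boolean isLeader, boolean hasPartition,
                                       long fetchSegment, long latestSegment,
                                       long accumulatedBytes, long minBytes) {
        if (!isLeader) return 1;                    // leadership moved to another broker
        if (!hasPartition) return 2;                // broker has no such data to serve
        if (fetchSegment != latestSegment) return 3; // not reading at the newest segment
        if (accumulatedBytes >= minBytes) return 4; // enough bytes for a batch
        return 0;
    }

    public static void main(String[] args) {
        // A consumer reading from segment 3 while the log is already on segment 7:
        System.out.println(firstSatisfiedCondition(true, true, 3, 7, 0, 65536));
        // A consumer at the tail of the log with only 100 bytes available:
        System.out.println(firstSatisfiedCondition(true, true, 7, 7, 100, 65536));
    }
}
```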