Source Code Analysis of the Java Concurrency Framework Disruptor: RingBuffer


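  • Source Code Analysis of the Java Concurrency Framework Disruptor: RingBuffer
    • Introduction to Disruptor
    • Introduction to RingBuffer
    • RingBuffer Source Code Analysis
      • Initialization
      • Write Operations
      • Read Operations
      • Summary
    • References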


Introduction to Disruptor

The official documentation describes Disruptor as follows:

The LMAX Disruptor is a high performance inter-thread messaging library. It grew out of LMAX’s research into concurrency, performance and non-blocking algorithms and today forms a core part of their Exchange’s infrastructure.

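Disruptor owes its performance mainly to two things:
1. The lock-free data structure RingBuffer
2. Avoiding false sharing through cache-line padding

This article first introduces the ring buffer, then digs into the source code to see how Disruptor operates on its RingBuffer without locks.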

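Introduction to RingBuffer

A ring buffer (also called a circular buffer) is a fixed-size data structure whose head and tail are connected, which makes it well suited to buffering data streams. It is usually implemented on top of an array, which is friendlier to the CPU cache than a linked list and therefore generally performs better.

A ring buffer has four key parameters:
1. The memory address of the buffer.
2. The buffer length.
3. The start of the valid data in the buffer: the read pointer.
4. The end of the valid data in the buffer: the write pointer.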
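To make the four parameters concrete, here is a minimal single-threaded sketch. SimpleRingBuffer and its methods are hypothetical names made up for illustration and are not part of Disruptor. The sketch also tracks an element count, which is one common way to tell a full buffer from an empty one.

```java
// A minimal, single-threaded ring buffer sketch (hypothetical class, not Disruptor code).
final class SimpleRingBuffer {
    private final long[] data;   // backing storage (stands in for the "memory address")
    private final int capacity;  // buffer length
    private int readPos = 0;     // read pointer: start of the valid data
    private int writePos = 0;    // write pointer: end of the valid data
    private int size = 0;        // number of valid elements currently stored

    SimpleRingBuffer(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be at least 1");
        }
        this.capacity = capacity;
        this.data = new long[capacity];
    }

    // Stores a value at the write pointer; returns false when the buffer is full.
    boolean offer(long value) {
        if (size == capacity) {
            return false;
        }
        data[writePos] = value;
        writePos = (writePos + 1) % capacity; // wrap around to the start
        size++;
        return true;
    }

    // Removes the value at the read pointer; returns null when the buffer is empty.
    Long poll() {
        if (size == 0) {
            return null;
        }
        long value = data[readPos];
        readPos = (readPos + 1) % capacity;   // wrap around to the start
        size--;
        return value;
    }
}
```

Because both pointers wrap around, the same array slots are reused indefinitely and no allocation happens after construction.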

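Next we look at how Disruptor implements RingBuffer and how it achieves lock-free reads and writes.

RingBuffer Source Code Analysis

Before reading the source, it helps to know how Disruptor is used; see the Disruptor getting-started guide.

Initialization

Let's start with the constructor of the RingBuffer class: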

    public final class RingBuffer<E> extends RingBufferFields<E> implements Cursored, EventSequencer<E>, EventSink<E>
    {
        RingBuffer(EventFactory<E> eventFactory, Sequencer sequencer)
        {
            super(eventFactory, sequencer);
        }
    }

    abstract class RingBufferFields<E> extends RingBufferPad
    {
        private static final int BUFFER_PAD;
        private static final long REF_ARRAY_BASE;
        private static final int REF_ELEMENT_SHIFT;
        private static final Unsafe UNSAFE = Util.getUnsafe();

        static
        {
            // Scale factor for addressing array elements, i.e. the number of bytes each object reference occupies
            final int scale = UNSAFE.arrayIndexScale(Object[].class);
            if (4 == scale)
            {
                REF_ELEMENT_SHIFT = 2;
            }
            else if (8 == scale)
            {
                REF_ELEMENT_SHIFT = 3;
            }
            else
            {
                throw new IllegalStateException("Unknown pointer size");
            }
            BUFFER_PAD = 128 / scale;
            // Including the buffer pad in the array base offset
            REF_ARRAY_BASE = UNSAFE.arrayBaseOffset(Object[].class) + (BUFFER_PAD << REF_ELEMENT_SHIFT);
        }

        // (the declarations of the instance fields indexMask, entries, bufferSize and sequencer are omitted in this excerpt)

        RingBufferFields(EventFactory<E> eventFactory, Sequencer sequencer)
        {
            this.sequencer = sequencer;
            this.bufferSize = sequencer.getBufferSize();

            if (bufferSize < 1)
            {
                throw new IllegalArgumentException("bufferSize must not be less than 1");
            }
            if (Integer.bitCount(bufferSize) != 1)
            {
                // bufferSize must be a power of 2
                throw new IllegalArgumentException("bufferSize must be a power of 2");
            }

            this.indexMask = bufferSize - 1;
            this.entries = new Object[sequencer.getBufferSize() + 2 * BUFFER_PAD];
            fill(eventFactory);
        }

        // Only the bufferSize elements in the middle of the array are initialized
        private void fill(EventFactory<E> eventFactory)
        {
            for (int i = 0; i < bufferSize; i++)
            {
                entries[BUFFER_PAD + i] = eventFactory.newInstance();
            }
        }
    }

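A brief note on the two constructor parameters:
1. Sequencer: the controller through which producers access the buffer. It holds references to the consumers' sequences and, after a new event is published, notifies any waiting SequenceBarrier through the WaitStrategy.
2. EventFactory: the factory that pre-creates the elements stored in the RingBuffer.

The constructor enforces "bufferSize must be a power of 2". The point of this requirement is that the array position of a sequence can then be computed with a bitwise AND against indexMask (bufferSize - 1), which is cheaper than the remainder operator %. In this respect RingBuffer follows the same approach as kfifo in the Linux kernel.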
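The equivalence between the mask and the remainder can be checked with a small sketch. IndexMaskDemo and indexOf are hypothetical names for illustration; only the bufferSize - 1 mask and the power-of-2 check mirror the constructor shown above.

```java
// Hypothetical helper showing why a power-of-2 size allows masking instead of %.
final class IndexMaskDemo {
    // Maps an ever-increasing sequence number onto a slot index in [0, bufferSize).
    static int indexOf(long sequence, int bufferSize) {
        if (bufferSize < 1 || Integer.bitCount(bufferSize) != 1) {
            throw new IllegalArgumentException("bufferSize must be a power of 2");
        }
        long indexMask = bufferSize - 1;      // e.g. 8 -> binary 0111
        return (int) (sequence & indexMask);  // keeps only the low bits
    }
}
```

For non-negative sequences and a power-of-2 size, sequence & (bufferSize - 1) yields the same slot as sequence % bufferSize. For a size such as 6 the mask would skip slots, which is why the constructor rejects it.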

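The entries array is actually allocated with bufferSize + 2 * BUFFER_PAD elements. BUFFER_PAD elements occupy 128 bytes, so 128 bytes of padding sit before and after the usable slots, mainly to prevent false sharing.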
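As a worked example of the padding arithmetic, the sketch below redoes the calculation with the reference size passed in as a plain parameter instead of being read from Unsafe. PaddingMath and its methods are hypothetical helpers, not Disruptor code.

```java
// Hypothetical helper that repeats the padding arithmetic from the static initializer.
final class PaddingMath {
    // Number of array elements that together occupy 128 bytes.
    static int bufferPad(int scale) {
        if (scale != 4 && scale != 8) {
            throw new IllegalStateException("Unknown pointer size");
        }
        return 128 / scale;
    }

    // Total number of elements allocated: usable slots plus padding on both sides.
    static int arrayLength(int bufferSize, int scale) {
        return bufferSize + 2 * bufferPad(scale);
    }

    // Bytes of padding on each side of the usable slots.
    static int paddingBytes(int scale) {
        return bufferPad(scale) * scale;
    }
}
```

With 4-byte references, a 1024-slot buffer is allocated as 1024 + 2 * 32 = 1088 elements; with 8-byte references, as 1024 + 2 * 16 = 1056. In both cases each side carries 128 bytes of padding.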

Write Operations

The official documentation gives the following example of writing data:

    import com.lmax.disruptor.RingBuffer;
    import java.nio.ByteBuffer;

    // User-defined event stored in the RingBuffer
    public class LongEvent {
        private long value;

        public void set(long value) {
            this.value = value;
        }
    }

    public class LongEventProducer {
        private final RingBuffer<LongEvent> ringBuffer;

        public LongEventProducer(RingBuffer<LongEvent> ringBuffer) {
            this.ringBuffer = ringBuffer;
        }

        public void onData(ByteBuffer bb) {
            long sequence = ringBuffer.next();  // claim the sequence of the next writable slot
            try {
                LongEvent event = ringBuffer.get(sequence); // fetch the pre-allocated element for that sequence
                event.set(bb.getLong(0));  // write the data
            } finally {
                ringBuffer.publish(sequence);   // publish
            }
        }
    }

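As the code shows, writing to the RingBuffer takes three steps:
1. Claim the next slot.
2. Write the data.
3. Publish.

Claiming the sequence of the next writable slot is done by RingBuffer's next method, which delegates to the method of the same name on Sequencer. Sequencer has two implementations: the single-producer SingleProducerSequencer and the multi-producer MultiProducerSequencer. The difference is that multiple producers have to compete for the next slot, while a single producer faces no such competition. Let's look at the multi-producer version first: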

    public final class RingBuffer<E> extends RingBufferFields<E> implements Cursored, EventSequencer<E>, EventSink<E>
    {
        @Override
        public long next()
        {
            return sequencer.next();
        }
    }

    public final class MultiProducerSequencer extends AbstractSequencer
    {
        @Override
        public long next()
        {
            return next(1);
        }

        // Allows claiming several slots at once
        @Override
        public long next(int n)
        {
            if (n < 1)
            {
                throw new IllegalArgumentException("n must be > 0");
            }

            long current;
            long next;

            do
            {
                // cursor is the current write position
                current = cursor.get();
                next = current + n;

                long wrapPoint = next - bufferSize;
                // cachedGatingSequence is the cached position of the slowest consumer (read pointer)
                long cachedGatingSequence = gatingSequenceCache.get();

                // The cached value suggests the buffer may be full: re-read the consumers and wait if it really is
                if (wrapPoint > cachedGatingSequence || cachedGatingSequence > current)
                {
                    long gatingSequence = Util.getMinimumSequence(gatingSequences, current);

                    if (wrapPoint > gatingSequence)
                    {
                        waitStrategy.signalAllWhenBlocking();
                        LockSupport.parkNanos(1); // TODO, should we spin based on the wait strategy?
                        continue;
                    }

                    gatingSequenceCache.set(gatingSequence);
                }
                // Otherwise update cursor with a CAS operation
                else if (cursor.compareAndSet(current, next))
                {
                    break;
                }
            }
            while (true);

            return next;
        }
    }

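MultiProducerSequencer updates the write position with a CAS operation, which is the main difference from SingleProducerSequencer: with a single producer there is no write contention, so the position is simply assigned. The two cases are kept separate because CAS still has a cost, and when there is no contention a plain assignment is cheaper.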
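The difference between the two claiming styles can be sketched with plain JDK classes. ClaimSketch, claimWithCas, SingleProducerCursor and distinctClaims are hypothetical names; the sketch ignores the capacity check against consumers and only shows how the cursor advances.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of claiming sequences with and without contention.
final class ClaimSketch {
    // Multi-producer style: retry the CAS until this thread wins the race for the next n slots.
    static long claimWithCas(AtomicLong cursor, int n) {
        long current;
        long next;
        do {
            current = cursor.get();
            next = current + n;
        } while (!cursor.compareAndSet(current, next));
        return next; // highest sequence claimed by this call
    }

    // Single-producer style: only one thread ever writes, so a plain field is enough.
    static final class SingleProducerCursor {
        private long nextValue = -1L;

        long claim(int n) {
            nextValue += n;
            return nextValue;
        }
    }

    // Runs several producer threads and returns how many distinct sequences they claimed.
    static int distinctClaims(int threads, int perThread) {
        AtomicLong cursor = new AtomicLong(-1L);
        Set<Long> claimed = ConcurrentHashMap.newKeySet();
        Thread[] producers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            producers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    claimed.add(claimWithCas(cursor, 1));
                }
            });
            producers[t].start();
        }
        for (Thread producer : producers) {
            try {
                producer.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return claimed.size();
    }
}
```

If the CAS loop is correct, no two producers ever receive the same sequence, so distinctClaims(4, 1000) counts 4000 distinct values.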

Read Operations

To consume data, implement the EventHandler interface and register it with the Disruptor.

    public class LongEventHandler implements EventHandler<LongEvent> {
        public void onEvent(LongEvent event, long sequence, boolean endOfBatch) {
            System.out.println(Thread.currentThread().getName() + " Event: " + event);
        }
    }

    // factory and bufferSize are defined elsewhere in the original example
    Disruptor<LongEvent> disruptor = new Disruptor<>(factory, bufferSize, Executors.newFixedThreadPool(3));
    // Connect the handler
    disruptor.handleEventsWith(new LongEventHandler());

The code with which Disruptor wires up an EventHandler is as follows:

    public class Disruptor<T>
    {
        public EventHandlerGroup<T> handleEventsWith(final EventHandler<? super T>... handlers)
        {
            return createEventProcessors(new Sequence[0], handlers);
        }

        EventHandlerGroup<T> createEventProcessors(
            final Sequence[] barrierSequences,
            final EventHandler<? super T>[] eventHandlers)
        {
            checkNotStarted();

            // A Sequence stores the position its consumer has most recently read; once read, the slot may be reused by producers
            final Sequence[] processorSequences = new Sequence[eventHandlers.length];
            // Consumers obtain the next consumable data from the SequenceBarrier; all consumers created here share one SequenceBarrier
            final SequenceBarrier barrier = ringBuffer.newBarrier(barrierSequences);

            // Each eventHandler here represents its own consumer group; every event is delivered to all eventHandlers
            for (int i = 0, eventHandlersLength = eventHandlers.length; i < eventHandlersLength; i++)
            {
                final EventHandler<? super T> eventHandler = eventHandlers[i];

                final BatchEventProcessor<T> batchEventProcessor =
                    new BatchEventProcessor<T>(ringBuffer, barrier, eventHandler);

                if (exceptionHandler != null)
                {
                    batchEventProcessor.setExceptionHandler(exceptionHandler);
                }

                consumerRepository.add(batchEventProcessor, eventHandler, barrier);
                processorSequences[i] = batchEventProcessor.getSequence();
            }

            updateGatingSequencesForNextInChain(barrierSequences, processorSequences);

            return new EventHandlerGroup<T>(this, consumerRepository, processorSequences);
        }
    }

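The class that actually loops over and processes the data is BatchEventProcessor. Roughly, it:
1. Obtains the highest readable sequence.
2. Processes the events one by one.
3. Updates its own read position.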

    public final class BatchEventProcessor<T> implements EventProcessor {
        @Override
        public void run() {
            if (!running.compareAndSet(false, true)) {
                throw new IllegalStateException("Thread is already running");
            }
            sequenceBarrier.clearAlert();

            notifyStart();

            T event = null;
            long nextSequence = sequence.get() + 1L;
            try {
                while (true) {
                    try {
                        // Wait for the next batch of readable data
                        final long availableSequence = sequenceBarrier.waitFor(nextSequence);
                        if (batchStartAware != null) {
                            batchStartAware.onBatchStart(availableSequence - nextSequence + 1);
                        }

                        // Process the events one by one
                        while (nextSequence <= availableSequence) {
                            // Fetch the event for this sequence
                            event = dataProvider.get(nextSequence);
                            // Hand the event to the eventHandler
                            eventHandler.onEvent(event, nextSequence, nextSequence == availableSequence);
                            nextSequence++;
                        }

                        // Record the position up to which data has been consumed
                        sequence.set(availableSequence);
                    } catch (final TimeoutException e) {
                        notifyTimeout(sequence.get());
                    } catch (final AlertException ex) {
                        if (!running.get()) {
                            break;
                        }
                    } catch (final Throwable ex) {
                        exceptionHandler.handleEventException(ex, nextSequence, event);
                        sequence.set(nextSequence);
                        nextSequence++;
                    }
                }
            } finally {
                notifyShutdown();
                running.set(false);
            }
        }
    }

The key step here is obtaining the readable sequence, so let's look at the waitFor method of ProcessingSequenceBarrier:

    final class ProcessingSequenceBarrier implements SequenceBarrier
    {
        @Override
        public long waitFor(final long sequence)
            throws AlertException, InterruptedException, TimeoutException
        {
            checkAlert();

            // The default waitStrategy is BlockingWaitStrategy
            long availableSequence = waitStrategy.waitFor(sequence, cursorSequence, dependentSequence, this);

            if (availableSequence < sequence)
            {
                return availableSequence;
            }

            return sequencer.getHighestPublishedSequence(sequence, availableSequence);
        }
    }

This method mostly delegates to the waitFor method of the waitStrategy. Taking the default strategy as an example:

    public final class BlockingWaitStrategy implements WaitStrategy
    {
        private final Lock lock = new ReentrantLock();
        private final Condition processorNotifyCondition = lock.newCondition();

        @Override
        public long waitFor(long sequence, Sequence cursorSequence, Sequence dependentSequence, SequenceBarrier barrier)
            throws AlertException, InterruptedException
        {
            long availableSequence;
            // cursorSequence acts as the write pointer and sequence is the position the consumer wants to read;
            // if the former is smaller, the RingBuffer has nothing new and the consumer must wait
            if (cursorSequence.get() < sequence)
            {
                lock.lock();
                try
                {
                    while (cursorSequence.get() < sequence)
                    {
                        barrier.checkAlert();
                        processorNotifyCondition.await();
                    }
                }
                finally
                {
                    lock.unlock();
                }
            }

            // When consumers do not depend on each other, dependentSequence is simply cursorSequence.
            // When they do, dependentSequence wraps the group of Sequences depended on, and get() returns the position of the slowest one
            while ((availableSequence = dependentSequence.get()) < sequence)
            {
                barrier.checkAlert();
            }

            return availableSequence;
        }
    }
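The waiting pattern above can be reduced to a small JDK-only sketch. BlockingCursor is a hypothetical class, not the library's implementation: it folds the cursor, the lock and the condition into one object and leaves out alerts and dependent consumers.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of a consumer blocking until a producer publishes far enough.
final class BlockingCursor {
    private final Lock lock = new ReentrantLock();
    private final Condition published = lock.newCondition();
    private volatile long cursor = -1L; // highest published sequence (the write pointer)

    // Blocks until the cursor has reached the requested sequence, then returns the cursor.
    long waitFor(long sequence) throws InterruptedException {
        if (cursor < sequence) {          // fast path: skip the lock when data is already there
            lock.lock();
            try {
                while (cursor < sequence) {
                    published.await();    // releases the lock while waiting
                }
            } finally {
                lock.unlock();
            }
        }
        return cursor;
    }

    // Advances the cursor and wakes every waiting consumer.
    void publish(long sequence) {
        lock.lock();
        try {
            cursor = sequence;
            published.signalAll();
        } finally {
            lock.unlock();
        }
    }

    // Small demo: a producer thread publishes sequences 0..4 while the caller waits for 4.
    static long demo() {
        BlockingCursor shared = new BlockingCursor();
        Thread producer = new Thread(() -> {
            for (long s = 0; s <= 4; s++) {
                shared.publish(s);
            }
        });
        producer.start();
        try {
            long available = shared.waitFor(4L);
            producer.join();
            return available;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The condition is re-checked in a loop after every wake-up, so a spurious wake-up or a publish of a lower sequence simply sends the consumer back to waiting.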

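BatchEventProcessor covers the case where a consumer group contains a single consumer. When a group has several consumers, WorkerPool is used instead: a WorkerPool contains multiple WorkProcessor consumers, each of which polls for and consumes data. The corresponding Disruptor method for creating such a consumer group is: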

    public class Disruptor<T>
    {
        public EventHandlerGroup<T> handleEventsWithWorkerPool(final WorkHandler<T>... workHandlers)
        {
            return createWorkerPool(new Sequence[0], workHandlers);
        }

        EventHandlerGroup<T> createWorkerPool(
            final Sequence[] barrierSequences, final WorkHandler<? super T>[] workHandlers)
        {
            final SequenceBarrier sequenceBarrier = ringBuffer.newBarrier(barrierSequences);
            final WorkerPool<T> workerPool = new WorkerPool<T>(ringBuffer, sequenceBarrier, exceptionHandler, workHandlers);

            consumerRepository.add(workerPool, sequenceBarrier);

            final Sequence[] workerSequences = workerPool.getWorkerSequences();

            updateGatingSequencesForNextInChain(barrierSequences, workerSequences);

            return new EventHandlerGroup<T>(this, consumerRepository, workerSequences);
        }
    }

    public final class WorkerPool<T>
    {
        private final AtomicBoolean started = new AtomicBoolean(false);
        private final Sequence workSequence = new Sequence(Sequencer.INITIAL_CURSOR_VALUE);
        private final RingBuffer<T> ringBuffer;
        // WorkProcessors are created to wrap each of the provided WorkHandlers
        private final WorkProcessor<?>[] workProcessors;

        public WorkerPool(
            final RingBuffer<T> ringBuffer,
            final SequenceBarrier sequenceBarrier,
            final ExceptionHandler<? super T> exceptionHandler,
            final WorkHandler<? super T>... workHandlers)
        {
            this.ringBuffer = ringBuffer;
            final int numWorkers = workHandlers.length;
            workProcessors = new WorkProcessor[numWorkers];

            for (int i = 0; i < numWorkers; i++)
            {
                workProcessors[i] = new WorkProcessor<T>(
                    ringBuffer,
                    sequenceBarrier,
                    workHandlers[i],
                    exceptionHandler,
                    workSequence);
            }
        }
    }

Now let's look at the WorkProcessor class, which does the polling and processing:

    public final class WorkProcessor<T>
        implements EventProcessor
    {
        private final AtomicBoolean running = new AtomicBoolean(false);
        private final Sequence sequence = new Sequence(Sequencer.INITIAL_CURSOR_VALUE);
        private final RingBuffer<T> ringBuffer;
        private final SequenceBarrier sequenceBarrier;
        private final WorkHandler<? super T> workHandler;
        private final ExceptionHandler<? super T> exceptionHandler;
        private final Sequence workSequence;
        private final TimeoutHandler timeoutHandler;

        // (the eventReleaser field used below is omitted in this excerpt)

        public WorkProcessor(
            final RingBuffer<T> ringBuffer,
            final SequenceBarrier sequenceBarrier,
            final WorkHandler<? super T> workHandler,
            final ExceptionHandler<? super T> exceptionHandler,
            final Sequence workSequence)
        {
            this.ringBuffer = ringBuffer;
            this.sequenceBarrier = sequenceBarrier;
            this.workHandler = workHandler;
            this.exceptionHandler = exceptionHandler;
            this.workSequence = workSequence;

            if (this.workHandler instanceof EventReleaseAware)
            {
                ((EventReleaseAware) this.workHandler).setEventReleaser(eventReleaser);
            }

            timeoutHandler = (workHandler instanceof TimeoutHandler) ? (TimeoutHandler) workHandler : null;
        }

        @Override
        public void run()
        {
            // A processor may only be run by one thread
            if (!running.compareAndSet(false, true))
            {
                throw new IllegalStateException("Thread is already running");
            }
            sequenceBarrier.clearAlert();

            notifyStart();

            boolean processedSequence = true;
            long cachedAvailableSequence = Long.MIN_VALUE;
            long nextSequence = sequence.get();
            T event = null;
            while (true)
            {
                try
                {
                    if (processedSequence)
                    {
                        processedSequence = false;
                        do
                        {
                            nextSequence = workSequence.get() + 1L;
                            sequence.set(nextSequence - 1L);
                        }
                        // All consumers in the group share one workSequence and compete for the next sequence with CAS
                        while (!workSequence.compareAndSet(nextSequence - 1L, nextSequence));
                    }

                    // Process one event once the readable sequence cachedAvailableSequence has reached nextSequence
                    if (cachedAvailableSequence >= nextSequence)
                    {
                        event = ringBuffer.get(nextSequence);
                        workHandler.onEvent(event);
                        processedSequence = true;
                    }
                    else
                    {
                        // Wait for readable data
                        cachedAvailableSequence = sequenceBarrier.waitFor(nextSequence);
                    }
                }
                catch (final TimeoutException e)
                {
                    notifyTimeout(sequence.get());
                }
                catch (final AlertException ex)
                {
                    if (!running.get())
                    {
                        break;
                    }
                }
                catch (final Throwable ex)
                {
                    // handle, mark as processed, unless the exception handler threw an exception
                    exceptionHandler.handleEventException(ex, nextSequence, event);
                    processedSequence = true;
                }
            }

            notifyShutdown();

            running.set(false);
        }
    }

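Compared with the single-consumer BatchEventProcessor:
1. Besides asking the sequenceBarrier for readable sequences, consumers in the same group must not process the same event, which the shared workSequence guarantees.
2. BatchEventProcessor can handle a whole batch per request, whereas a WorkProcessor claims and handles one event at a time.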
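How a shared workSequence keeps workers from handling the same event can be sketched as follows. WorkSequenceSketch, claimNext and processAll are hypothetical names; the sketch assumes all events are already published, so it skips the SequenceBarrier entirely.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: workers in one group compete for sequences through a shared counter.
final class WorkSequenceSketch {
    // Claims the next unprocessed sequence for the calling worker.
    static long claimNext(AtomicLong workSequence) {
        long next;
        do {
            next = workSequence.get() + 1L;
        } while (!workSequence.compareAndSet(next - 1L, next)); // lost the race: retry
        return next;
    }

    // Lets several workers drain sequences 0..events-1 and reports how often each one was handled.
    static int[] processAll(int workers, int events) {
        AtomicLong workSequence = new AtomicLong(-1L);
        AtomicIntegerArray hits = new AtomicIntegerArray(events);
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            threads[w] = new Thread(() -> {
                while (true) {
                    long sequence = claimNext(workSequence);
                    if (sequence >= events) {
                        return;                       // nothing left to claim
                    }
                    hits.incrementAndGet((int) sequence); // stands in for workHandler.onEvent
                }
            });
            threads[w].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        int[] result = new int[events];
        for (int i = 0; i < events; i++) {
            result[i] = hits.get(i);
        }
        return result;
    }
}
```

Because only the worker whose CAS succeeds owns a given sequence, every entry in the returned array should be exactly 1, regardless of how the threads interleave.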

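Summary

On the producer side the Sequencer plays the role of the write pointer; on the consumer side each consumer's Sequence plays the role of the read pointer. SequenceBarrier establishes the dependencies among consumers and between consumers and the RingBuffer: it computes the next consumable position from the producer's write pointer and the read pointers of the consumers it depends on.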
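The two calculations involved, the slowest dependency on the consumer side and the wrap check on the producer side, can be sketched with plain long values. GatingSketch and its methods are hypothetical; minimumSequence is modelled on the Util.getMinimumSequence(gatingSequences, current) call seen in the source above, with assumed semantics: the smallest of the given sequences, capped by the fallback value.

```java
// Hypothetical sketch of how read and write pointers constrain each other.
final class GatingSketch {
    // Smallest of the given sequences, or the fallback when there are none or it is smaller.
    static long minimumSequence(long[] sequences, long fallback) {
        long minimum = fallback;
        for (long sequence : sequences) {
            minimum = Math.min(minimum, sequence);
        }
        return minimum;
    }

    // Consumer side: the highest sequence that may be read, given the write pointer
    // and the read pointers of the consumers this one depends on.
    static long availableFor(long cursor, long[] dependentConsumers) {
        return minimumSequence(dependentConsumers, cursor);
    }

    // Producer side: would claiming up to `next` overwrite a slot the slowest consumer has not read yet?
    static boolean wouldWrap(long next, int bufferSize, long[] consumerSequences, long current) {
        long wrapPoint = next - bufferSize;
        return wrapPoint > minimumSequence(consumerSequences, current);
    }
}
```

For example, with a buffer of 8 slots and sequences 0 to 7 written, claiming sequence 8 has to wait while the only consumer is still at -1 (nothing read), and becomes possible once that consumer has read sequence 0.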

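With multiple producers, thread safety is ensured by MultiProducerSequencer; with multiple consumers in one group, by WorkProcessor. Both resolve contention for slots with CAS operations, which are cheaper than heavyweight locks.

The WaitStrategy determines how consumers wait when the RingBuffer is empty. In the MultiProducerSequencer code shown above, a producer that finds the RingBuffer full does not wait through the strategy; it retries after LockSupport.parkNanos(1).

Both sides distinguish between the contended and uncontended cases: SingleProducerSequencer versus MultiProducerSequencer, and BatchEventProcessor versus WorkProcessor. This is mainly an optimization for the uncontended case: CAS is used only when there is contention, and without contention even CAS is unnecessary, which is faster.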

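References

  1. Disruptor getting-started guide (official documentation)
  2. Disruptor getting-started guide (official documentation, Chinese translation)
  3. Concurrency framework Disruptor (translated articles)
  4. Wiki: circular buffer
  5. Disruptor user guide
  6. Implementation details of Disruptor 3.0 (includes many class diagrams)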