Flume-ng Source Code Analysis: The Source Component


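If you have not yet read the earlier articles in this Flume-ng source code analysis series, on the startup flow, the Channel component, and the Sink component, they are:
Flume-ng Source Code Analysis: Startup Flow
Flume-ng Source Code Analysis: The Channel Component
Flume-ng Source Code Analysis: The Sink Component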

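The previous three articles gave a first look at Flume's startup flow, the Channel component, and the Sink component. This one covers the remaining one of the agent's three main components: the Source.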

1 Source

Source is the component that brings messages into an agent. Let's look at how it hands events to a channel and what characterizes it.

As before, we start with the code:

@InterfaceAudience.Public
@InterfaceStability.Stable
public interface Source extends LifecycleAware, NamedComponent {
  public void setChannelProcessor(ChannelProcessor channelProcessor);
  public ChannelProcessor getChannelProcessor();
}

The interface itself declares only two methods to implement, getChannelProcessor and setChannelProcessor. From this we can guess that a source passes events to channels through its ChannelProcessor.

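First, a word on the kinds of Source. Flume divides sources into two categories according to how the data arrives: sources such as Http, netcat, and exec are event-driven (EventDrivenSource), while sources such as kafka and Jms are polled (PollableSource).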
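The difference between the two categories can be sketched with simplified stand-in types. Everything below (MiniProcessor, PushStyleSource, PullStyleSource, and the use of plain strings as events) is hypothetical and only illustrates the push versus pull idea; none of it is Flume's actual API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for a channel processor: it only records events.
class MiniProcessor {
  final List<String> received = new ArrayList<>();

  void processEvent(String event) {
    received.add(event);
  }
}

// Event-driven style: something outside calls in, and the source forwards at once.
class PushStyleSource {
  private final MiniProcessor processor;

  PushStyleSource(MiniProcessor processor) {
    this.processor = processor;
  }

  // Invoked by whatever delivers the data (a hypothetical callback).
  void onData(String payload) {
    processor.processEvent(payload);
  }
}

// Polling style: a runner calls process() repeatedly and the source pulls from a backlog.
class PullStyleSource {
  enum Status { READY, BACKOFF }

  private final MiniProcessor processor;
  private final Queue<String> backlog = new ArrayDeque<>();

  PullStyleSource(MiniProcessor processor) {
    this.processor = processor;
  }

  // Simulates data waiting in an external system.
  void enqueue(String payload) {
    backlog.add(payload);
  }

  // Returns BACKOFF when there is nothing to pull, READY after forwarding one event.
  Status process() {
    String next = backlog.poll();
    if (next == null) {
      return Status.BACKOFF;
    }
    processor.processEvent(next);
    return Status.READY;
  }
}

class SourceStylesDemo {
  public static void main(String[] args) {
    MiniProcessor processor = new MiniProcessor();

    new PushStyleSource(processor).onData("pushed");

    PullStyleSource pull = new PullStyleSource(processor);
    pull.enqueue("pulled");
    System.out.println(pull.process());
    System.out.println(pull.process());
    System.out.println(processor.received);
  }
}
```

In the push style the source decides nothing about timing, whereas in the pull style the caller of process() controls the pace and can slow down when BACKOFF is returned.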

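As we saw in the startup flow article, Application first starts a SourceRunner, and the SourceRunner then starts the source. Because there are two kinds of source, there are also two kinds of SourceRunner: EventDrivenSourceRunner and PollableSourceRunner. Here are their start() methods: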

EventDrivenSourceRunner

public class EventDrivenSourceRunner extends SourceRunner {
  …
  @Override
  public void start() {
    Source source = getSource();
    ChannelProcessor cp = source.getChannelProcessor();
    cp.initialize();
    source.start();
    lifecycleState = LifecycleState.START;
  }
  …
}

PollableSourceRunner

public class PollableSourceRunner extends SourceRunner {
  …
  @Override
  public void start() {
    PollableSource source = (PollableSource) getSource();
    ChannelProcessor cp = source.getChannelProcessor();
    cp.initialize();
    source.start();

    runner = new PollingRunner();
    runner.source = source;
    runner.counterGroup = counterGroup;
    runner.shouldStop = shouldStop;

    runnerThread = new Thread(runner);
    runnerThread.setName(getClass().getSimpleName() + "-" +
        source.getClass().getSimpleName() + "-" + source.getName());
    runnerThread.start();

    lifecycleState = LifecycleState.START;
  }
  …
  public static class PollingRunner implements Runnable {
    private PollableSource source;
    private AtomicBoolean shouldStop;
    private CounterGroup counterGroup;

    @Override
    public void run() {
      logger.debug("Polling runner starting. Source:{}", source);

      while (!shouldStop.get()) {
        counterGroup.incrementAndGet("runner.polls");

        try {
          if (source.process().equals(PollableSource.Status.BACKOFF)) {
            counterGroup.incrementAndGet("runner.backoffs");

            Thread.sleep(Math.min(
                counterGroup.incrementAndGet("runner.backoffs.consecutive")
                * source.getBackOffSleepIncrement(), source.getMaxBackOffSleepInterval()));
          } else {
            counterGroup.set("runner.backoffs.consecutive", 0L);
          }
        } catch (InterruptedException e) {
          logger.info("Source runner interrupted. Exiting");
          counterGroup.incrementAndGet("runner.interruptions");
        } catch (EventDeliveryException e) {
          logger.error("Unable to deliver event. Exception follows.", e);
          counterGroup.incrementAndGet("runner.deliveryErrors");
        } catch (Exception e) {
          counterGroup.incrementAndGet("runner.errors");
          logger.error("Unhandled exception, logging and sleeping for " +
              source.getMaxBackOffSleepInterval() + "ms", e);
          try {
            Thread.sleep(source.getMaxBackOffSleepInterval());
          } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
          }
        }
      }

      logger.debug("Polling runner exiting. Metrics:{}", counterGroup);
    }
  }
}
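The sleep time in PollingRunner.run() grows linearly with the number of consecutive BACKOFF results and is capped at a maximum. A minimal sketch of that arithmetic follows. The class and method names are hypothetical, and the 1000 ms increment and 5000 ms cap are values assumed for illustration, not read from the listing above.

```java
class BackoffSketch {
  // Same expression as in PollingRunner.run():
  // min(consecutiveBackoffs * increment, maxInterval)
  static long sleepMillis(long consecutiveBackoffs, long incrementMillis, long maxMillis) {
    return Math.min(consecutiveBackoffs * incrementMillis, maxMillis);
  }

  public static void main(String[] args) {
    long increment = 1000L; // assumed value, for illustration only
    long max = 5000L;       // assumed value, for illustration only

    // Each consecutive BACKOFF lengthens the sleep until the cap is reached.
    for (long n = 1; n <= 7; n++) {
      System.out.println("consecutive=" + n + " sleep=" + sleepMillis(n, increment, max) + "ms");
    }
  }
}
```

Because the consecutive counter is reset to 0 as soon as process() returns anything other than BACKOFF, a source that becomes busy again is polled without delay.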

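Both PollableSourceRunner and EventDrivenSourceRunner call start() on the source they wrap. PollableSourceRunner additionally starts a PollingRunner thread that calls source.process() in a loop and sleeps whenever the source returns BACKOFF. Both runners also fetch a ChannelProcessor from the source, which raises a question: where does that ChannelProcessor come from? For the answer we go back to AbstractConfigurationProvider and look at loadSources(), where we find this code: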

ChannelSelectorConfiguration selectorConfig = config.getSelectorConfiguration();
ChannelSelector selector = ChannelSelectorFactory.create(sourceChannels, selectorConfig);
ChannelProcessor channelProcessor = new ChannelProcessor(selector);
Configurables.configure(channelProcessor, config);
source.setChannelProcessor(channelProcessor);
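The wiring order in that snippet (channels into a selector, the selector into a processor, the processor into the source) can be sketched with simplified stand-ins. All types below are hypothetical and much smaller than Flume's real classes; the selector here simply returns every channel for every event, which is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a channel: an in-memory list of events.
class ListChannel {
  final String name;
  final List<String> events = new ArrayList<>();

  ListChannel(String name) {
    this.name = name;
  }

  void put(String event) {
    events.add(event);
  }
}

// Hypothetical selector: picks every configured channel for every event.
class AllChannelsSelector {
  private final List<ListChannel> channels;

  AllChannelsSelector(List<ListChannel> channels) {
    this.channels = channels;
  }

  List<ListChannel> channelsFor(String event) {
    return channels;
  }
}

// Hypothetical processor: asks the selector where an event goes, then writes it there.
class FanOutProcessor {
  private final AllChannelsSelector selector;

  FanOutProcessor(AllChannelsSelector selector) {
    this.selector = selector;
  }

  void processEvent(String event) {
    for (ListChannel channel : selector.channelsFor(event)) {
      channel.put(event);
    }
  }
}

// Hypothetical source that only stores the processor injected into it.
class WiredSource {
  private FanOutProcessor processor;

  void setChannelProcessor(FanOutProcessor processor) {
    this.processor = processor;
  }

  FanOutProcessor getChannelProcessor() {
    return processor;
  }
}

class WiringDemo {
  // Same order as the loadSources() snippet: selector, then processor, then source.
  static WiredSource wire(List<ListChannel> sourceChannels) {
    AllChannelsSelector selector = new AllChannelsSelector(sourceChannels);
    FanOutProcessor processor = new FanOutProcessor(selector);
    WiredSource source = new WiredSource();
    source.setChannelProcessor(processor);
    return source;
  }

  public static void main(String[] args) {
    ListChannel c1 = new ListChannel("c1");
    ListChannel c2 = new ListChannel("c2");
    WiredSource source = wire(Arrays.asList(c1, c2));

    source.getChannelProcessor().processEvent("hello");
    System.out.println(c1.name + "=" + c1.events + " " + c2.name + "=" + c2.events);
  }
}
```

The point of the indirection is that the source never refers to a channel directly: it only knows its processor, and the selector decides which channels receive each event.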

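At this point we have a basic picture of how a Source is started. Next, taking AvroSource as an example, let's see where a source calls the ChannelProcessor methods that insert events.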

2 AvroSource

public class AvroSource extends AbstractSource implements EventDrivenSource,
    Configurable, AvroSourceProtocol {
  …
  @Override
  public Status append(AvroFlumeEvent avroEvent) {
    if (logger.isDebugEnabled()) {
      if (LogPrivacyUtil.allowLogRawData()) {
        logger.debug("Avro source {}: Received avro event: {}", getName(), avroEvent);
      } else {
        logger.debug("Avro source {}: Received avro event", getName());
      }
    }

    sourceCounter.incrementAppendReceivedCount();
    sourceCounter.incrementEventReceivedCount();

    Event event = EventBuilder.withBody(avroEvent.getBody().array(),
        toStringMap(avroEvent.getHeaders()));

    try {
      getChannelProcessor().processEvent(event);
    } catch (ChannelException ex) {
      logger.warn("Avro source " + getName() + ": Unable to process event. " +
          "Exception follows.", ex);
      return Status.FAILED;
    }

    sourceCounter.incrementAppendAcceptedCount();
    sourceCounter.incrementEventAcceptedCount();

    return Status.OK;
  }

  @Override
  public Status appendBatch(List<AvroFlumeEvent> events) {
    logger.debug("Avro source {}: Received avro event batch of {} events.",
        getName(), events.size());

    sourceCounter.incrementAppendBatchReceivedCount();
    sourceCounter.addToEventReceivedCount(events.size());

    List<Event> batch = new ArrayList<Event>();

    for (AvroFlumeEvent avroEvent : events) {
      Event event = EventBuilder.withBody(avroEvent.getBody().array(),
          toStringMap(avroEvent.getHeaders()));
      batch.add(event);
    }

    try {
      getChannelProcessor().processEventBatch(batch);
    } catch (Throwable t) {
      logger.error("Avro source " + getName() + ": Unable to process event " +
          "batch. Exception follows.", t);
      if (t instanceof Error) {
        throw (Error) t;
      }
      return Status.FAILED;
    }

    sourceCounter.incrementAppendBatchAcceptedCount();
    sourceCounter.addToEventAcceptedCount(events.size());

    return Status.OK;
  }
  …
}

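In append we can see the call getChannelProcessor().processEvent(event), and appendBatch calls processEventBatch(batch) in the same way. So each Source, according to its own trigger or polling mechanism, calls the ChannelProcessor at the appropriate moment to insert events.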
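The shape of append (count the event as received, try to hand it to the processor, report FAILED if that throws, otherwise count it as accepted and report OK) can be sketched as follows. BoundedProcessor and AppendStyleSource are hypothetical stand-ins, and IllegalStateException stands in for the channel error; this is a sketch of the pattern, not AvroSource itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical processor that rejects events once a fixed capacity is reached.
class BoundedProcessor {
  private final int capacity;
  final List<String> stored = new ArrayList<>();

  BoundedProcessor(int capacity) {
    this.capacity = capacity;
  }

  void processEvent(String event) {
    if (stored.size() >= capacity) {
      throw new IllegalStateException("channel full");
    }
    stored.add(event);
  }
}

// Hypothetical source whose append() mirrors the received/accepted bookkeeping above.
class AppendStyleSource {
  enum Status { OK, FAILED }

  private final BoundedProcessor processor;
  int receivedCount;
  int acceptedCount;

  AppendStyleSource(BoundedProcessor processor) {
    this.processor = processor;
  }

  Status append(String payload) {
    receivedCount++; // counted before delivery is attempted
    try {
      processor.processEvent(payload);
    } catch (IllegalStateException ex) {
      return Status.FAILED; // the caller learns that the event was not stored
    }
    acceptedCount++; // counted only after the processor accepted the event
    return Status.OK;
  }
}

class AppendDemo {
  public static void main(String[] args) {
    AppendStyleSource source = new AppendStyleSource(new BoundedProcessor(1));
    System.out.println(source.append("first"));
    System.out.println(source.append("second"));
    System.out.println(source.receivedCount + " received, " + source.acceptedCount + " accepted");
  }
}
```

Returning a status instead of propagating the exception lets the sender decide whether to retry, and the gap between the two counters shows how many events were rejected.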

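This completes our study of Flume's startup flow and its three main components. Some details were beyond what I could dig into here, and I hope to continue the analysis in more depth when time permits.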
