Notes on Flume Issues (3): a simple example using a Flume interceptor and channel selector

Example Flume configuration file:

# flume source: a syslog TCP source
producer.sources = syslogSource
# For each one of the sources, the type is defined
producer.sources.syslogSource.type = syslogtcp
producer.sources.syslogSource.bind = localhost
producer.sources.syslogSource.port = 5496
# The official documentation does not make it clear that this property is still
# required when a selector is configured, but the agent reports an error without it
producer.sources.syslogSource.channels = cbaidu csina cgoogle

# Interceptor of type regex_extractor: it extracts information from the event body
# and adds it to the headers, so the channel selector below can route the event
# the interceptors: domain name
producer.sources.syslogSource.interceptors = domainname
producer.sources.syslogSource.interceptors.domainname.type = regex_extractor
producer.sources.syslogSource.interceptors.domainname.regex = YM:(\\w+)
# Extract the domain name from the body and add it to the event as a new header
# whose key is domain_name and whose value is the matched domain
# only add one header
producer.sources.syslogSource.interceptors.domainname.serializers = s1
producer.sources.syslogSource.interceptors.domainname.serializers.s1.name = domain_name
producer.sources.syslogSource.interceptors.domainname.serializers.s1.type = default

# Selector of type multiplexing: routes each event to a different channel
# depending on the value of the specified header
# selector
producer.sources.syslogSource.selector.type = multiplexing
# which header to inspect
producer.sources.syslogSource.selector.header = domain_name
# different header values are routed to different channels
producer.sources.syslogSource.selector.mapping.baidu = cbaidu
producer.sources.syslogSource.selector.mapping.sina = csina
producer.sources.syslogSource.selector.mapping.google = cgoogle
producer.sources.syslogSource.selector.default = cbaidu

# Each channel's type is defined.
# all three channels are memory channels
producer.channels = cbaidu csina cgoogle
producer.channels.cbaidu.type = memory
producer.channels.csina.type = memory
producer.channels.cgoogle.type = memory
producer.channels.cbaidu.capacity = 1000
producer.channels.csina.capacity = 1000
producer.channels.cgoogle.capacity = 1000

# Each sink's type must be defined
# three different sinks: two Kafka sinks and one File Roll sink
producer.sinks = sbaidu ssina sgoogle
producer.sinks.sbaidu.type = org.apache.flume.sink.kafka.KafkaSink
producer.sinks.sbaidu.channel = cbaidu
producer.sinks.sbaidu.topic = baidu
producer.sinks.sbaidu.brokerList = localhost:9797
producer.sinks.sbaidu.requiredAcks = 1
producer.sinks.sbaidu.batchSize = 20
producer.sinks.sbaidu.metadata.broker.list = localhost:9092
producer.sinks.sbaidu.producer.type = sync
producer.sinks.sbaidu.serializer.class = kafka.serializer.DefaultEncoder

producer.sinks.ssina.type = file_roll
producer.sinks.ssina.channel = csina
producer.sinks.ssina.sink.directory = /usr/local/flume/result
producer.sinks.ssina.sink.rollInterval = 0
producer.sinks.ssina.sink.serializer = avro_event

producer.sinks.sgoogle.type = org.apache.flume.sink.kafka.KafkaSink
producer.sinks.sgoogle.channel = cgoogle
producer.sinks.sgoogle.topic = google
producer.sinks.sgoogle.brokerList = localhost:9898
producer.sinks.sgoogle.requiredAcks = 1
producer.sinks.sgoogle.batchSize = 20
producer.sinks.sgoogle.metadata.broker.list = localhost:9092
producer.sinks.sgoogle.producer.type = sync
producer.sinks.sgoogle.serializer.class = kafka.serializer.DefaultEncoder
Notes: the agent uses one syslogtcp source, three memory channels, and three sinks (one file_roll sink and two Kafka sinks).
    The regex_extractor interceptor pulls the domain-name field out of the event body and adds it as a new header (domain_name); the multiplexing selector then routes each event to a different channel, and therefore to a different sink, based on that header's value.
    Example event: YM:baidu YDIP:63.12.79.4
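To check the routing end to end, the agent can be started with the usual flume-ng command, e.g. flume-ng agent -n producer -c conf -f producer.conf (the -n value must match the agent name producer used above; producer.conf is just a placeholder name for the file holding this configuration), and then fed test events over TCP. Below is a minimal Python sketch of such a test sender, assuming the source is listening on localhost:5496 as configured; it writes newline-terminated syslog-style lines whose bodies contain the YM:<domain> token that the regex_extractor matches.

import socket

# Hypothetical test sender: pushes events into the syslogtcp source so that the
# regex_extractor / multiplexing-selector routing can be observed end to end.
# Host and port come from the source configuration above (localhost:5496).
HOST, PORT = "localhost", 5496

def send_event(domain, ip):
    # "<13>" is a syslog priority prefix (facility user, severity notice); the
    # rest of the line is the body that the interceptor's regex YM:(\w+) matches.
    line = "<13>YM:%s YDIP:%s\n" % (domain, ip)
    with socket.create_connection((HOST, PORT)) as sock:
        sock.sendall(line.encode("utf-8"))

if __name__ == "__main__":
    send_event("baidu", "63.12.79.4")   # expected route: channel cbaidu -> Kafka topic baidu
    send_event("sina", "63.12.79.4")    # expected route: channel csina  -> file_roll sink
    send_event("google", "63.12.79.4")  # expected route: channel cgoogle -> Kafka topic google

If the routing works, the sina event should end up as a file under /usr/local/flume/result, while the baidu and google events should appear on the baidu and google Kafka topics, provided the brokers referenced in the sink configuration are actually running.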
                                             