Flume configuration: uploading logs from a file directory to S3 in production


In production the collected logs are uploaded to S3. Multiple spoolDir sources and multiple HDFS sinks are used to raise the throughput of both reading the log files and uploading the data: each source and each sink runs on its own thread, so the two sources can feed the shared channel while the six sinks drain it to S3 in parallel, and the per-sink fileSuffix values (.1.lzo through .6.lzo) keep their output files from colliding.


# Agent "clog": two spooldir sources -> one memory channel -> six HDFS (S3) sinks
clog.sources = source_log1 source_log2
clog.channels = channel_log
clog.sinks = sink_log1 sink_log2 sink_log3 sink_log4 sink_log5 sink_log6

# Source 1: consumes completed files from /home/data/log1 and deletes them once
# ingested; the interceptor parses the file name (basename header) into the
# day/hour/minute/project/machine/region/module/service headers.
clog.sources.source_log1.type = spooldir
clog.sources.source_log1.spoolDir = /home/data/log1
clog.sources.source_log1.deletePolicy = immediate
clog.sources.source_log1.batchSize = 1000
clog.sources.source_log1.deserializer.maxLineLength = 999999
clog.sources.source_log1.basenameHeader = true
clog.sources.source_log1.ignorePattern = ^[^0-9].*
clog.sources.source_log1.decodeErrorPolicy = IGNORE
clog.sources.source_log1.interceptors = i1
clog.sources.source_log1.interceptors.i1.type = org.apache.flume.interceptor.RegexExtractorHeaderInterceptor$Builder
clog.sources.source_log1.interceptors.i1.regex = (\\d{8})(\\d{2})(\\d{2})-(.*)-(.*)-(.*)-(.*)-(.*)\\.log
clog.sources.source_log1.interceptors.i1.serializers = s1 s2 s3 s4 s5 s6 s7 s8
clog.sources.source_log1.interceptors.i1.serializers.s1.name = day
clog.sources.source_log1.interceptors.i1.serializers.s2.name = hour
clog.sources.source_log1.interceptors.i1.serializers.s3.name = minute
clog.sources.source_log1.interceptors.i1.serializers.s4.name = project
clog.sources.source_log1.interceptors.i1.serializers.s5.name = machine
clog.sources.source_log1.interceptors.i1.serializers.s6.name = region
clog.sources.source_log1.interceptors.i1.serializers.s7.name = module
clog.sources.source_log1.interceptors.i1.serializers.s8.name = service
clog.sources.source_log1.channels = channel_log

# Source 2: identical to source 1, but reads /home/data/log2.
clog.sources.source_log2.type = spooldir
clog.sources.source_log2.spoolDir = /home/data/log2
clog.sources.source_log2.deletePolicy = immediate
clog.sources.source_log2.batchSize = 1000
clog.sources.source_log2.deserializer.maxLineLength = 999999
clog.sources.source_log2.basenameHeader = true
clog.sources.source_log2.ignorePattern = ^[^0-9].*
clog.sources.source_log2.decodeErrorPolicy = IGNORE
clog.sources.source_log2.interceptors = i1
clog.sources.source_log2.interceptors.i1.type = org.apache.flume.interceptor.RegexExtractorHeaderInterceptor$Builder
clog.sources.source_log2.interceptors.i1.regex = (\\d{8})(\\d{2})(\\d{2})-(.*)-(.*)-(.*)-(.*)-(.*)\\.log
clog.sources.source_log2.interceptors.i1.serializers = s1 s2 s3 s4 s5 s6 s7 s8
clog.sources.source_log2.interceptors.i1.serializers.s1.name = day
clog.sources.source_log2.interceptors.i1.serializers.s2.name = hour
clog.sources.source_log2.interceptors.i1.serializers.s3.name = minute
clog.sources.source_log2.interceptors.i1.serializers.s4.name = project
clog.sources.source_log2.interceptors.i1.serializers.s5.name = machine
clog.sources.source_log2.interceptors.i1.serializers.s6.name = region
clog.sources.source_log2.interceptors.i1.serializers.s7.name = module
clog.sources.source_log2.interceptors.i1.serializers.s8.name = service
clog.sources.source_log2.channels = channel_log

# Sinks 1-6: identical except for fileSuffix, which keeps the six writers from
# producing colliding object names; rolling by size/count/interval is disabled,
# so a file is closed only after 180 s of inactivity (idleTimeout).
clog.sinks.sink_log1.type = hdfs
clog.sinks.sink_log1.hdfs.path = s3n://aws_access_key_id:aws_secret_access_key/%{service}/%{day}/%{hour}
clog.sinks.sink_log1.hdfs.filePrefix = %{minute}
clog.sinks.sink_log1.hdfs.fileSuffix = .1.lzo
clog.sinks.sink_log1.hdfs.rollSize = 0
clog.sinks.sink_log1.hdfs.rollCount = 0
clog.sinks.sink_log1.hdfs.rollInterval = 0
clog.sinks.sink_log1.hdfs.idleTimeout = 180
clog.sinks.sink_log1.hdfs.callTimeout = 600000
clog.sinks.sink_log1.hdfs.batchSize = 1000
clog.sinks.sink_log1.hdfs.codeC = lzop
clog.sinks.sink_log1.hdfs.fileType = CompressedStream
clog.sinks.sink_log1.hdfs.writeFormat = Text
clog.sinks.sink_log1.channel = channel_log

clog.sinks.sink_log2.type = hdfs
clog.sinks.sink_log2.hdfs.path = s3n://aws_access_key_id:aws_secret_access_key/%{service}/%{day}/%{hour}
clog.sinks.sink_log2.hdfs.filePrefix = %{minute}
clog.sinks.sink_log2.hdfs.fileSuffix = .2.lzo
clog.sinks.sink_log2.hdfs.rollSize = 0
clog.sinks.sink_log2.hdfs.rollCount = 0
clog.sinks.sink_log2.hdfs.rollInterval = 0
clog.sinks.sink_log2.hdfs.idleTimeout = 180
clog.sinks.sink_log2.hdfs.callTimeout = 600000
clog.sinks.sink_log2.hdfs.batchSize = 1000
clog.sinks.sink_log2.hdfs.codeC = lzop
clog.sinks.sink_log2.hdfs.fileType = CompressedStream
clog.sinks.sink_log2.hdfs.writeFormat = Text
clog.sinks.sink_log2.channel = channel_log

clog.sinks.sink_log3.type = hdfs
clog.sinks.sink_log3.hdfs.path = s3n://aws_access_key_id:aws_secret_access_key/%{service}/%{day}/%{hour}
clog.sinks.sink_log3.hdfs.filePrefix = %{minute}
clog.sinks.sink_log3.hdfs.fileSuffix = .3.lzo
clog.sinks.sink_log3.hdfs.rollSize = 0
clog.sinks.sink_log3.hdfs.rollCount = 0
clog.sinks.sink_log3.hdfs.rollInterval = 0
clog.sinks.sink_log3.hdfs.idleTimeout = 180
clog.sinks.sink_log3.hdfs.callTimeout = 600000
clog.sinks.sink_log3.hdfs.batchSize = 1000
clog.sinks.sink_log3.hdfs.codeC = lzop
clog.sinks.sink_log3.hdfs.fileType = CompressedStream
clog.sinks.sink_log3.hdfs.writeFormat = Text
clog.sinks.sink_log3.channel = channel_log

clog.sinks.sink_log4.type = hdfs
clog.sinks.sink_log4.hdfs.path = s3n://aws_access_key_id:aws_secret_access_key/%{service}/%{day}/%{hour}
clog.sinks.sink_log4.hdfs.filePrefix = %{minute}
clog.sinks.sink_log4.hdfs.fileSuffix = .4.lzo
clog.sinks.sink_log4.hdfs.rollSize = 0
clog.sinks.sink_log4.hdfs.rollCount = 0
clog.sinks.sink_log4.hdfs.rollInterval = 0
clog.sinks.sink_log4.hdfs.idleTimeout = 180
clog.sinks.sink_log4.hdfs.callTimeout = 600000
clog.sinks.sink_log4.hdfs.batchSize = 1000
clog.sinks.sink_log4.hdfs.codeC = lzop
clog.sinks.sink_log4.hdfs.fileType = CompressedStream
clog.sinks.sink_log4.hdfs.writeFormat = Text
clog.sinks.sink_log4.channel = channel_log

clog.sinks.sink_log5.type = hdfs
clog.sinks.sink_log5.hdfs.path = s3n://aws_access_key_id:aws_secret_access_key/%{service}/%{day}/%{hour}
clog.sinks.sink_log5.hdfs.filePrefix = %{minute}
clog.sinks.sink_log5.hdfs.fileSuffix = .5.lzo
clog.sinks.sink_log5.hdfs.rollSize = 0
clog.sinks.sink_log5.hdfs.rollCount = 0
clog.sinks.sink_log5.hdfs.rollInterval = 0
clog.sinks.sink_log5.hdfs.idleTimeout = 180
clog.sinks.sink_log5.hdfs.callTimeout = 600000
clog.sinks.sink_log5.hdfs.batchSize = 1000
clog.sinks.sink_log5.hdfs.codeC = lzop
clog.sinks.sink_log5.hdfs.fileType = CompressedStream
clog.sinks.sink_log5.hdfs.writeFormat = Text
clog.sinks.sink_log5.channel = channel_log

clog.sinks.sink_log6.type = hdfs
clog.sinks.sink_log6.hdfs.path = s3n://aws_access_key_id:aws_secret_access_key/%{service}/%{day}/%{hour}
clog.sinks.sink_log6.hdfs.filePrefix = %{minute}
clog.sinks.sink_log6.hdfs.fileSuffix = .6.lzo
clog.sinks.sink_log6.hdfs.rollSize = 0
clog.sinks.sink_log6.hdfs.rollCount = 0
clog.sinks.sink_log6.hdfs.rollInterval = 0
clog.sinks.sink_log6.hdfs.idleTimeout = 180
clog.sinks.sink_log6.hdfs.callTimeout = 600000
clog.sinks.sink_log6.hdfs.batchSize = 1000
clog.sinks.sink_log6.hdfs.codeC = lzop
clog.sinks.sink_log6.hdfs.fileType = CompressedStream
clog.sinks.sink_log6.hdfs.writeFormat = Text
clog.sinks.sink_log6.channel = channel_log

# Channel: in-memory buffer between the sources and the sinks.
clog.channels.channel_log.type = memory
clog.channels.channel_log.capacity = 100000
clog.channels.channel_log.transactionCapacity = 10000
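
The interceptor class referenced above, org.apache.flume.interceptor.RegexExtractorHeaderInterceptor, is not part of stock Apache Flume (the built-in regex_extractor interceptor matches the event body, not a header). It is presumably a small custom class that applies the configured regex to the basename header added by the spooldir source (basenameHeader = true) and copies the capture groups into the day/hour/minute/project/machine/region/module/service headers that the sinks interpolate into hdfs.path and hdfs.filePrefix. The following is only a hypothetical, simplified sketch of such an interceptor, with the header names hard-coded instead of read from the serializers.* properties; the file name in the comment is likewise an invented example.

package org.apache.flume.interceptor;

import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.apache.flume.Context;
import org.apache.flume.Event;

/**
 * Hypothetical sketch of the custom interceptor referenced in the config.
 * For a file named e.g. 202405290749-game-host01-useast-login-auth.log
 * (an invented example), the regex
 *   (\d{8})(\d{2})(\d{2})-(.*)-(.*)-(.*)-(.*)-(.*)\.log
 * would yield day=20240529, hour=07, minute=49, project=game,
 * machine=host01, region=useast, module=login, service=auth.
 */
public class RegexExtractorHeaderInterceptor implements Interceptor {

  // Header names in capture-group order; the real class presumably reads
  // these from the serializers.sN.name properties instead of hard-coding.
  private static final String[] HEADER_NAMES =
      {"day", "hour", "minute", "project", "machine", "region", "module", "service"};

  private final Pattern pattern;

  private RegexExtractorHeaderInterceptor(Pattern pattern) {
    this.pattern = pattern;
  }

  @Override
  public void initialize() {
    // nothing to do
  }

  @Override
  public Event intercept(Event event) {
    Map<String, String> headers = event.getHeaders();
    // "basename" is the default header set by the spooldir source
    // when basenameHeader = true.
    String basename = headers.get("basename");
    if (basename == null) {
      return event;
    }
    Matcher m = pattern.matcher(basename);
    if (m.matches()) {
      for (int i = 0; i < HEADER_NAMES.length && i < m.groupCount(); i++) {
        headers.put(HEADER_NAMES[i], m.group(i + 1));
      }
    }
    return event;
  }

  @Override
  public List<Event> intercept(List<Event> events) {
    for (Event e : events) {
      intercept(e);
    }
    return events;
  }

  @Override
  public void close() {
    // nothing to do
  }

  /** Builder class referenced by the ...interceptors.i1.type property. */
  public static class Builder implements Interceptor.Builder {
    private Pattern pattern;

    @Override
    public void configure(Context context) {
      // "regex" corresponds to the ...interceptors.i1.regex property above.
      pattern = Pattern.compile(context.getString("regex", "(.*)"));
    }

    @Override
    public Interceptor build() {
      return new RegexExtractorHeaderInterceptor(pattern);
    }
  }
}

Under that reading, an event from the example file would land under the S3 prefix auth/20240529/07/ in an object whose name starts with 49 and ends with the sink-specific suffix (.1.lzo through .6.lzo).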

 
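A few deployment notes (assumptions about a typical setup, not details given in the original post): the HDFS sink simply delegates to whatever Hadoop FileSystem handles the s3n:// scheme, so the agent's classpath needs the jars providing NativeS3FileSystem as well as hadoop-lzo for codeC = lzop; the usual s3n URI also embeds the bucket name, i.e. s3n://ACCESS_KEY:SECRET_KEY@bucket/path, and on Hadoop 3.x s3n has been removed in favour of s3a, where credentials are normally supplied through fs.s3a.access.key and fs.s3a.secret.key rather than embedded in the URI. Assuming the configuration is saved as clog.properties, the agent would be started with the standard launcher, with --name matching the clog prefix used throughout the file:

flume-ng agent --conf ./conf --conf-file clog.properties --name clog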
