Configuring Flume on Linux to Collect Logs and Publish Them to Kafka
Overall workflow
Log files on disk → Flume agent (TAILDIR source → memory channel → Kafka sink) → Kafka topic
Flume configuration
Flume download: https://pan.baidu.com/s/1slNuhad (extraction code: q81v)
1. Configure the Flume path (environment variables)
$ cd /xx/flume        # enter the directory where Flume lives
$ vi .bash_profile    # the file is created automatically if it does not exist

Add the following lines (replace /xx with the directory containing Flume), then reload the profile:

export FLUME_HOME=/xx/apache-flume-1.6.0-cdh5.8.4-bin
export PATH=$PATH:$FLUME_HOME/bin

$ source .bash_profile
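A quick sanity check (my addition, assuming the profile was sourced as above): the flume-ng launcher should now resolve from any directory and print the installed version.

$ flume-ng version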
2. Configure the flume-env.sh file
Set the Java path:

$ cd /xx/apache-flume-1.6.0-cdh5.8.4-bin/conf
$ vi flume-env.sh

export JAVA_HOME=/xx/jdk1.8.0_121.jdk/
export HADOOP_HOME=/xx/apache-flume-1.6.0-cdh5.8.4-bin
There are two ways to configure the JDK:
1. Point to a JDK already installed in the environment (a quick way to locate it is sketched below).
2. Upload a fresh JDK, extract it, and point to that JDK's path directly.
(JDK setup guides are easy to find online; I will also cover it in a separate blog post.)
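For option 1, one common way to find where the environment's JDK actually lives (my addition; the output path will differ per machine):

$ which java
$ readlink -f $(which java)   # follows symlinks to the real binary, e.g. /xx/jdk1.8.0_121/bin/java
# JAVA_HOME is then the directory above bin/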
3. Configure the xx.conf file
tier1.sources = source_ETE_SERV_SSPS
tier1.channels = channel_ETE_SERV_SSPS_kafka
tier1.sinks = sink_ETE_SERV_SSPS_kafka

# ETE_SERV_SSPS source
tier1.sources.source_ETE_SERV_SSPS.type = TAILDIR
tier1.sources.source_ETE_SERV_SSPS.positionFile = position/taildir_position_ETE_SERV_SSPS.json
tier1.sources.source_ETE_SERV_SSPS.filegroups = f1
# path pattern of the target log files to monitor
tier1.sources.source_ETE_SERV_SSPS.filegroups.f1 = /oss/ztracer/.*info*.*log
tier1.sources.source_ETE_SERV_SSPS.idleTimeout = 8000
# the channel's transactionCapacity should generally be >= the source's batchSize
tier1.sources.source_ETE_SERV_SSPS.batchSize = 2000
tier1.sources.source_ETE_SERV_SSPS.channels = channel_ETE_SERV_SSPS_kafka

# memory channel
tier1.channels.channel_ETE_SERV_SSPS_kafka.type = memory
tier1.channels.channel_ETE_SERV_SSPS_kafka.capacity = 100000
tier1.channels.channel_ETE_SERV_SSPS_kafka.transactionCapacity = 2000

# Kafka sink
tier1.sinks.sink_ETE_SERV_SSPS_kafka.type = org.apache.flume.sink.kafka.KafkaSink
tier1.sinks.sink_ETE_SERV_SSPS_kafka.channel = channel_ETE_SERV_SSPS_kafka
# Kafka topic (the monitored log lines are published to this topic)
tier1.sinks.sink_ETE_SERV_SSPS_kafka.topic = ETE_SERV_SSPS
# Kafka broker list; multiple brokers can be listed, comma-separated
tier1.sinks.sink_ETE_SERV_SSPS_kafka.brokerList = kafka01:9092,kafka01:9093,kafka01:9094,kafka02:9092,kafka02:9093,kafka02:9094
tier1.sinks.sink_ETE_SERV_SSPS_kafka.batchSize = 1000
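Two practical notes before starting the agent (my additions, not from the original steps). First, positionFile above is a relative path, so it resolves against whatever directory you launch the agent from; an absolute path is safer. Second, unless the brokers auto-create topics, the target topic must exist first. A sketch with the Kafka CLI, assuming a ZooKeeper node at zk01:2181 and illustrative partition/replication counts:

$ kafka-topics.sh --create --zookeeper zk01:2181 --topic ETE_SERV_SSPS --partitions 3 --replication-factor 2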
4. Start Flume
./flume-ng agent -n tier1 \
  -c /IBM/flume/apache-flume-1.6.0-cdh5.8.4-bin/conf \
  -f /IBM/flume/apache-flume-1.6.0-cdh5.8.4-bin/conf/xx.conf \
  -Dflume.root.logger=INFO,console
(My installation happens to live under this directory; adjust the paths to your own layout, then start the agent.)
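For anything beyond a quick test you would normally detach the agent from the terminal and capture its output; a minimal sketch (the log file name is my choice):

$ nohup ./flume-ng agent -n tier1 -c $FLUME_HOME/conf -f $FLUME_HOME/conf/xx.conf > flume-tier1.log 2>&1 &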
With that, Kafka receives the logs we ship.
When a monitored log file is appended to, Kafka picks up the new lines in real time.
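To verify end to end, watch the topic in one terminal and append to a monitored file in another (my addition; the file name below is a hypothetical example that matches the f1 pattern, and newer Kafka CLIs take --bootstrap-server while older ones take --zookeeper):

$ kafka-console-consumer.sh --bootstrap-server kafka01:9092 --topic ETE_SERV_SSPS
$ echo "test line $(date)" >> /oss/ztracer/app.info.log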
Done!