Log Collection Framework: Flume

Source: Internet | Editor: 程序博客网 | Time: 2024/06/05 09:35



web server (source side) -> Flume -> HDFS (destination)


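Core components of the Flume framework

source: where the log data comes from

channel: the pipe that buffers events between the source and the sink

sink: the storage destination (where the data lands)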

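Downloading and installing the JDK

Download: jdk-8-linux-x64.tar.gz

Upload: rz

Extract: tar -zvxf jdk-8-linux-x64.tar.gz -C ~/soft_install/

Edit the profile: vi ~/.bash_profile (there must be no spaces around = in an export line)

export JAVA_HOME=/root/soft_install/jdk1.8.0

export PATH=$JAVA_HOME/bin:$PATH

Apply the changes: source ~/.bash_profile

Verify: java -version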
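A quick check after editing the profile can catch a mistyped path before Flume is involved. The sketch below is only an illustration: check_java_home is a hypothetical helper name, not something shipped with the JDK, and it only tests that an executable bin/java exists under the given directory.

```shell
# check_java_home is a hypothetical helper, not part of the JDK.
# It reports whether the given directory contains an executable bin/java.
check_java_home() {
  if [ -z "$1" ]; then
    echo "JAVA_HOME is not set"
    return 1
  fi
  if [ ! -x "$1/bin/java" ]; then
    echo "no executable java under $1/bin"
    return 1
  fi
  echo "ok: $1/bin/java"
}

# Check the current environment; "|| true" keeps the script going on failure.
check_java_home "${JAVA_HOME:-}" || true
```

If this prints "JAVA_HOME is not set" right after running source ~/.bash_profile, the export line is the first thing to look at.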

Downloading and installing Flume

Step 1:

Download: http://archive.cloudera.com/cdh5/cdh/5/

Upload: rz

Extract: tar -zvxf flume-ng-1.6.0-cdh5.7.0.tar.gz -C ~/soft_install/

Edit the profile: vi ~/.bash_profile

export FLUME_HOME=/root/soft_install/apache-flume-1.6.0-cdh5.7.0-bin

export PATH=$FLUME_HOME/bin:$PATH

Apply the changes: source ~/.bash_profile

Step 2:

Set up the configuration file under $FLUME_HOME/conf:

cp flume-env.sh.template flume-env.sh

vi flume-env.sh, then add: export JAVA_HOME=/root/soft_install/jdk1.8.0

Verify: flume-ng version

Starting an agent from a configuration file

    flume-ng agent \
      --name avro-memory-logger \
      --conf $FLUME_HOME/conf \
      --conf-file $FLUME_HOME/conf/exampleB.conf \
      -Dflume.root.logger=INFO,console

Event

Event: { headers:{} body: 69 20 6C 6F 76 65 20 6C 69 66 08 6E 66 65 69 66 i love lif.nfeif }

An Event is the basic unit of data that Flume transfers.

Event = optional headers + byte array
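To make the byte-array part concrete, the hex dump in the logger output above can be decoded by hand. The following is a minimal sketch; hex_to_text is a hypothetical helper name, and replacing non-printable bytes with a dot simply mirrors what the sample log line shows.

```shell
# hex_to_text is a hypothetical helper: it decodes a space-separated hex dump
# into text. Bytes outside printable ASCII (32..126) are shown as '.'.
hex_to_text() {
  for byte in $1; do
    code=$((0x$byte))
    if [ "$code" -ge 32 ] && [ "$code" -le 126 ]; then
      # Convert the byte to a \0ooo octal escape and let %b expand it.
      printf '%b' "\\0$(printf '%03o' "$code")"
    else
      printf '.'
    fi
  done
  printf '\n'
}

# Body bytes copied from the sample Event above.
hex_to_text "69 20 6C 6F 76 65 20 6C 69 66 08 6E 66 65 69 66"
# prints: i love lif.nfeif
```

The byte 08 is a backspace character typed in the telnet session, which is why a dot appears in the middle of the text.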

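The heart of Flume is its configuration file: for each job you add a configuration file that declares the agent and its source, channel, and sink.

The key decision is which type of source, channel, and sink to use.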
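One detail that is easy to get wrong: the value passed to --name when starting flume-ng has to match the agent prefix used inside the properties file. As a rough sketch, the agent name can be read back from a file before launching; agent_names is a hypothetical helper, and the sample file content is only for demonstration.

```shell
# agent_names is a hypothetical helper: it prints the agent name(s) declared
# in a Flume properties file, taken from lines like "<agent>.sources = ...".
agent_names() {
  sed -n 's/^[[:space:]]*\([^.#[:space:]]\{1,\}\)\.sources[[:space:]]*=.*/\1/p' "$1" | sort -u
}

# Demonstration on a small temporary file.
demo_conf=$(mktemp)
cat > "$demo_conf" <<'EOF'
a1.sources = r1
a1.sinks = k1
a1.channels = c1
a1.sources.r1.type = netcat
EOF
agent_names "$demo_conf"   # prints: a1
rm -f "$demo_conf"
```

If the printed name differs from the --name argument, the agent starts but finds no components to run.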

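Practice 1: collect log data from a given network port and print it to the console

Technology choice: netcat source + memory channel + logger sink

Step 1: vi example1.conf (see the configuration files below)

Step 2: start the agent. The value of --name must match the agent name used in the configuration file (a1 here).

    flume-ng agent \
      --name a1 \
      --conf $FLUME_HOME/conf \
      --conf-file $FLUME_HOME/conf/example1.conf \
      -Dflume.root.logger=INFO,console

Step 3: test

Open another window and run: telnet 192.168.145.128 44444. Type a line, then check that the original window prints the corresponding log entry.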

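Practice 2: monitor the content appended to a file in real time

Technology choice: exec source + memory channel + logger sink

Step 1: vi example2.conf (see the configuration files below)

Step 2: start the agent. The last option prints INFO-level logs to the console.

    flume-ng agent \
      --name a1 \
      --conf $FLUME_HOME/conf \
      --conf-file $FLUME_HOME/conf/example2.conf \
      -Dflume.root.logger=INFO,console

Step 3: test

Open another window and append a line to the monitored file, for example: echo hello >> /root/data/example2.txt. Then check that the original window prints the corresponding log entry. Unlike Practice 1, this agent does not listen on a port, so telnet is not used here.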
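With an exec source running tail -F, new events appear only when lines are appended to the monitored file. A small generator makes that easier to try repeatedly. This is a sketch under assumptions: append_test_lines is a hypothetical helper name, and the line format is arbitrary.

```shell
# append_test_lines is a hypothetical helper: it appends COUNT numbered,
# timestamped lines to FILE so that a tail -F based source has data to pick up.
append_test_lines() {
  file=$1
  count=$2
  i=1
  while [ "$i" -le "$count" ]; do
    printf 'test line %s at %s\n' "$i" "$(date '+%Y-%m-%d %H:%M:%S')" >> "$file"
    i=$((i + 1))
  done
}

# Example call against the file monitored by example2.conf:
#   append_test_lines /root/data/example2.txt 3
```

Each appended line should show up as one Event in the window where the agent is running.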

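Practice 2, extended: offline processing

Save the collected log data to HDFS.

Technology choice: exec source + memory channel + hdfs sink

Configuration: example3.conf (see the configuration files below)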
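By default the HDFS sink writes SequenceFiles, so plain log text usually needs a few extra settings. The sketch below writes a sample sink section to a file for inspection. It is not the author's configuration: write_hdfs_sink_fragment is a hypothetical helper, the /flume/events directory and the 60-second roll interval are assumptions, and only the property names come from the Flume HDFS sink.

```shell
# write_hdfs_sink_fragment is a hypothetical helper that writes a sample
# HDFS sink section to the file given as $1.
write_hdfs_sink_fragment() {
  cat > "$1" <<'EOF'
# Sample HDFS sink settings for agent a1 / sink k1
a1.sinks.k1.type = hdfs
# /flume/events is an assumed target directory, not taken from the original post
a1.sinks.k1.hdfs.path = hdfs://192.168.145.128:8020/flume/events
# Write plain text instead of the default SequenceFile
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = Text
# Roll to a new file every 60 seconds; 0 disables size- and count-based rolling
a1.sinks.k1.hdfs.rollInterval = 60
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.rollCount = 0
EOF
}

# Write the fragment to a scratch file and show it.
out="${TMPDIR:-/tmp}/hdfs-sink-fragment.conf"
write_hdfs_sink_fragment "$out"
cat "$out"
```

After the agent has run for a while, hdfs dfs -ls on the target directory should list the files the sink has written.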

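Log collection process

Machine A monitors a file and sends the new content through an avro sink to another node (exampleA.conf).

Machine B receives the data sent by machine A's sink through an avro source (exampleB.conf).

Machine B can print the data to the console with a logger sink, store it, or forward it (for example to Kafka).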
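For the two-machine setup, the usual order is to start the receiving agent on machine B first, so that the avro sink on machine A has a listener to connect to. The sketch below only prints the two command lines rather than running them; flume_agent_cmd is a hypothetical helper, and /path/to/flume is a placeholder used when FLUME_HOME is not set.

```shell
# flume_agent_cmd is a hypothetical helper: it prints (does not run) the
# flume-ng command line for one agent name and configuration file.
flume_agent_cmd() {
  agent=$1
  conf_file=$2
  home=${FLUME_HOME:-/path/to/flume}
  printf 'flume-ng agent --name %s --conf %s/conf --conf-file %s/conf/%s -Dflume.root.logger=INFO,console\n' \
    "$agent" "$home" "$home" "$conf_file"
}

# Start order: the receiver (machine B) first, then the sender (machine A).
flume_agent_cmd avro-memory-logger exampleB.conf
flume_agent_cmd exec-memory-avro exampleA.conf
```

The agent names passed here are the prefixes used in exampleB.conf and exampleA.conf respectively.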

example1.conf

    a1.sources = r1
    a1.sinks = k1
    a1.channels = c1

    # Describe/configure the source
    a1.sources.r1.type = netcat
    a1.sources.r1.bind = hadoop01
    a1.sources.r1.port = 44444

    # Describe the sink
    a1.sinks.k1.type = logger

    # Use a channel which buffers events in memory
    a1.channels.c1.type = memory
    a1.channels.c1.capacity = 1000
    a1.channels.c1.transactionCapacity = 100

    # Bind the source and sink to the channel
    a1.sources.r1.channels = c1
    a1.sinks.k1.channel = c1

example2.conf

    a1.sources = r1
    a1.sinks = k1
    a1.channels = c1

    # Describe/configure the source
    a1.sources.r1.type = exec
    a1.sources.r1.command = tail -F /root/data/example2.txt
    a1.sources.r1.shell = /bin/sh -c

    # Describe the sink
    a1.sinks.k1.type = logger

    # Use a channel which buffers events in memory
    a1.channels.c1.type = memory

    # Bind the source and sink to the channel
    a1.sources.r1.channels = c1
    a1.sinks.k1.channel = c1

example3.conf

    a1.sources = r1
    a1.sinks = k1
    a1.channels = c1

    # Describe/configure the source
    a1.sources.r1.type = exec
    a1.sources.r1.command = tail -F /root/data/example2.txt
    a1.sources.r1.shell = /bin/sh -c

    # Describe the sink
    a1.sinks.k1.type = hdfs
    a1.sinks.k1.hdfs.path = hdfs://192.168.145.128:8020

    # Use a channel which buffers events in memory
    a1.channels.c1.type = memory

    # Bind the source and sink to the channel
    a1.sources.r1.channels = c1
    a1.sinks.k1.channel = c1

exampleA.conf

    # example exec-memory-avro
    exec-memory-avro.sources = exec-source
    exec-memory-avro.sinks = avro-sink
    exec-memory-avro.channels = memory-channel

    # Describe/configure the source
    exec-memory-avro.sources.exec-source.type = exec
    exec-memory-avro.sources.exec-source.command = tail -F /root/data/exampleA.txt
    exec-memory-avro.sources.exec-source.shell = /bin/sh -c

    # Describe the sink
    exec-memory-avro.sinks.avro-sink.type = avro
    exec-memory-avro.sinks.avro-sink.hostname = 192.168.145.128
    exec-memory-avro.sinks.avro-sink.port = 44444

    # Use a channel which buffers events in memory
    exec-memory-avro.channels.memory-channel.type = memory

    # Bind the source and sink to the channel
    exec-memory-avro.sources.exec-source.channels = memory-channel
    exec-memory-avro.sinks.avro-sink.channel = memory-channel

exampleB.conf

    # example avro-memory-logger
    avro-memory-logger.sources = avro-source
    avro-memory-logger.sinks = logger-sink
    avro-memory-logger.channels = memory-channel

    # Describe/configure the source
    avro-memory-logger.sources.avro-source.type = avro
    avro-memory-logger.sources.avro-source.bind = 192.168.145.128
    avro-memory-logger.sources.avro-source.port = 44444

    # Describe the sink
    avro-memory-logger.sinks.logger-sink.type = logger

    # Use a channel which buffers events in memory
    avro-memory-logger.channels.memory-channel.type = memory

    # Bind the source and sink to the channel
    avro-memory-logger.sources.avro-source.channels = memory-channel
    avro-memory-logger.sinks.logger-sink.channel = memory-channel