Log Collection Framework: Flume
Source: Internet  Editor: 程序博客网  Time: 2024/06/05 09:35
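Data flow: web server (source side) -> Flume -> HDFS (destination)
Core components of Flume
source: where the logs come from
channel: the pipeline that carries the data from the source to the sink
sink: the storage destination (where the data lands)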
Download and install the JDK
Download: jdk-8-linux-x64.tar.gz
Upload: rz
Extract: tar -zvxf jdk-8-linux-x64.tar.gz -C ~/soft_install/
Edit the profile: vi ~/.bash_profile (no spaces around the = sign)
export JAVA_HOME=/root/soft_install/jdk1.8.0
export PATH=$JAVA_HOME/bin:$PATH
Reload the profile: source ~/.bash_profile
Verify: java -version
Download and install Flume
Step 1:
Download: http://archive.cloudera.com/cdh5/cdh/5/
Upload: rz
Extract: tar -zvxf flume-ng-1.6.0-cdh5.7.0.tar.gz -C ~/soft_install/
Edit the profile: vi ~/.bash_profile
export FLUME_HOME=/root/soft_install/apache-flume-1.6.0-cdh5.7.0-bin
export PATH=$FLUME_HOME/bin:$PATH
Reload the profile: source ~/.bash_profile
Step 2:
Set up the configuration files under $FLUME_HOME/conf:
cp flume-env.sh.template flume-env.sh
vi flume-env.sh and add: export JAVA_HOME=/root/soft_install/jdk1.8.0
Verify:
flume-ng version
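Before moving on, it can help to confirm that both tools are actually reachable from the current shell. The following is a minimal sketch; the function name check_on_path is made up for this example and is not part of Flume or the JDK.

```shell
# Report whether each named command is on PATH.
# Returns non-zero if at least one of them is missing.
check_on_path() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool -> $(command -v "$tool")"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

Usage: check_on_path java flume-ng. If either one is reported as missing, the export lines in ~/.bash_profile are the first thing to recheck.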
Starting Flume with a configuration file
flume-ng agent \
--name avro-memory-logger \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/exampleB.conf \
-Dflume.root.logger=INFO,console
Event
Event: { headers:{} body: 69 20 6C 6F 76 65 20 6C 69 66 08 6E 66 65 69 66 i love lif.nfeif }
An Event is the basic unit of data transfer in Flume.
Event = optional headers + byte array
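The body in the log line above is simply the raw bytes of the event, printed in hex with a text rendering next to it. As an illustration, the sketch below decodes such a hex string back into text and prints non-printable bytes as a dot, the same way the log line does. The function name decode_body is made up for this example.

```shell
# Decode space-separated hex bytes (as printed by the logger sink) into text.
# Bytes outside the printable ASCII range 32..126 are shown as '.'.
decode_body() {
  out=""
  for h in $1; do
    code=$((0x$h))                      # hex byte -> decimal value
    if [ "$code" -ge 32 ] && [ "$code" -le 126 ]; then
      # Convert the value to a 3-digit octal escape and let printf emit the character
      out="$out$(printf "\\$(printf '%03o' "$code")")"
    else
      out="$out."
    fi
  done
  printf '%s\n' "$out"
}
```

Usage: decode_body "69 20 6C 6F 76 65 20 6C 69 66 08 6E 66 65 69 66" prints i love lif.nfeif, matching the text column in the sample event. The byte 08 is a backspace character, which is why it appears as a dot. As far as I know the logger sink prints only the first 16 bytes of the body by default, so longer lines appear truncated in the console.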
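The heart of using Flume is the configuration file: create a new one and define the agent, source, channel, and sink in it.
The key decision is which source, channel, and sink types to use.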
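Every configuration file follows the same three-part pattern: name the components, give each one a type, then wire them together. The sketch below writes such a skeleton for a given agent name. The function name write_agent_skeleton and the component names src1, ch1, and sink1 are placeholders chosen for this example, and localhost is an assumed bind address.

```shell
# Write a skeleton config with one source, one channel, and one sink.
# $1 = agent name (the value later passed to --name), $2 = output file.
write_agent_skeleton() {
  agent=$1
  cat > "$2" <<EOF
# 1. Name the components of agent "$agent"
$agent.sources = src1
$agent.channels = ch1
$agent.sinks = sink1

# 2. Choose a type for each component (plus type-specific settings)
$agent.sources.src1.type = netcat
$agent.sources.src1.bind = localhost
$agent.sources.src1.port = 44444
$agent.channels.ch1.type = memory
$agent.sinks.sink1.type = logger

# 3. Wire them together: note "channels" for a source, "channel" for a sink
$agent.sources.src1.channels = ch1
$agent.sinks.sink1.channel = ch1
EOF
}
```

Usage: write_agent_skeleton a1 /tmp/skeleton.conf. A source can feed several channels, which is why its property is plural, while a sink reads from exactly one channel.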
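Practice 1: collect log data from a given network port and print it to the console
Components: netcat source + memory channel + logger sink
Step 1: vi example.conf (see the configuration files at the end)
Step 2: start the agent
flume-ng agent \
--name a1 \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/example.conf \
-Dflume.root.logger=INFO,console
Step 3: test
Open another window, run telnet 192.168.145.128 44444, type a few lines, and check that the events are printed in the original window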
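Practice 2: monitor a file in real time for newly appended content
Components: exec source + memory channel + logger sink
Step 1: vi example2.conf (see the configuration files at the end)
Step 2: start the agent (the last option prints INFO-level logs to the console)
flume-ng agent \
--name a1 \
--conf $FLUME_HOME/conf \
--conf-file $FLUME_HOME/conf/example2.conf \
-Dflume.root.logger=INFO,console
Step 3: test
Open another window, append a line to the monitored file (for example echo hello >> /root/data/example2.txt), and check that the event is printed in the original window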
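Because the exec source in this practice runs tail -F on a file, the agent only emits events when new lines are appended to that file. A small helper like the one below can simulate an application writing log lines. The function name append_lines and the line format are made up for this sketch.

```shell
# Append COUNT timestamped lines to FILE, pausing DELAY seconds between lines.
# $1 = file, $2 = count, $3 = delay in seconds (defaults to 1).
append_lines() {
  file=$1
  count=$2
  delay=${3:-1}
  i=1
  while [ "$i" -le "$count" ]; do
    printf '%s line %d\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$i" >> "$file"
    i=$((i + 1))
    sleep "$delay"
  done
}
```

Usage: append_lines /root/data/example2.txt 10. With the agent running in the other window, one event should appear for each appended line.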
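Practice 2, advanced: offline processing
Save the collected log data to HDFS
Components: exec source + memory channel + hdfs sink
Configuration: example3.conf (see the configuration files at the end)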
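With only hdfs.path set, the HDFS sink uses its defaults: files are written in SequenceFile format and rolled frequently, which tends to produce many small files that are not readable as plain text. The sketch below appends a few optional settings to a copy of the configuration. The property names are standard HDFS sink options, but the values are assumptions for illustration, not part of the original example3.conf, and the function name is made up.

```shell
# Append optional HDFS sink settings for agent a1 / sink k1 to a config file ($1).
append_hdfs_sink_settings() {
  cat >> "$1" <<'EOF'
# Write plain text instead of the default SequenceFile container
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = Text
# Roll to a new file every 60 seconds; 0 disables size- and count-based rolling
a1.sinks.k1.hdfs.rollInterval = 60
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.rollCount = 0
EOF
}
```

Usage: copy example3.conf to a new file, run append_hdfs_sink_settings on the copy, and start the agent with it. The resulting files can then be listed with hdfs dfs -ls hdfs://192.168.145.128:8020/ and, since they are plain text, inspected with hdfs dfs -cat.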
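Log collection across two machines
Machine A monitors a file and sends the new lines through an avro sink to another node (exampleA.conf)
Machine B receives the data from machine A's sink through an avro source (exampleB.conf)
Machine B can print the data to the console with a logger sink, store it, or forward it (for example to Kafka)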
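When running two agents, the value passed to --name must match the prefix used in each configuration file, which is an easy thing to get wrong when copying commands. The helper below reads the agent name out of a configuration file. It is a sketch that assumes the file contains a line of the form <agent>.sources = ..., and the function name agent_name_of is made up.

```shell
# Print the agent name declared in a Flume config file ($1): the prefix of the
# first "<agent>.sources = ..." line. This is the value --name has to match.
agent_name_of() {
  sed -n 's/^[[:space:]]*\([A-Za-z0-9_-]*\)\.sources[[:space:]]*=.*/\1/p' "$1" | head -n 1
}
```

Usage: on machine B run flume-ng agent --name "$(agent_name_of $FLUME_HOME/conf/exampleB.conf)" --conf $FLUME_HOME/conf --conf-file $FLUME_HOME/conf/exampleB.conf -Dflume.root.logger=INFO,console, then do the same on machine A with exampleA.conf. Starting B first is the safer order, since the avro source should be listening before A's avro sink tries to connect.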
example.conf (Practice 1)
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = hadoop01
a1.sources.r1.port = 44444
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
example2.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /root/data/example2.txt
a1.sources.r1.shell = /bin/sh -c
# Describe the sink
a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
example3.conf
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /root/data/example2.txt
a1.sources.r1.shell = /bin/sh -c
# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://192.168.145.128:8020
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
exampleA.conf
# example exec-memory-avro
exec-memory-avro.sources = exec-source
exec-memory-avro.sinks = avro-sink
exec-memory-avro.channels = memory-channel
# Describe/configure the source
exec-memory-avro.sources.exec-source.type = exec
exec-memory-avro.sources.exec-source.command = tail -F /root/data/exampleA.txt
exec-memory-avro.sources.exec-source.shell = /bin/sh -c
# Describe the sink
exec-memory-avro.sinks.avro-sink.type = avro
exec-memory-avro.sinks.avro-sink.hostname = 192.168.145.128
exec-memory-avro.sinks.avro-sink.port = 44444
# Use a channel which buffers events in memory
exec-memory-avro.channels.memory-channel.type = memory
# Bind the source and sink to the channel
exec-memory-avro.sources.exec-source.channels = memory-channel
exec-memory-avro.sinks.avro-sink.channel = memory-channel
exampleB.conf
# example avro-memory-logger
avro-memory-logger.sources = avro-source
avro-memory-logger.sinks = logger-sink
avro-memory-logger.channels = memory-channel
# Describe/configure the source
avro-memory-logger.sources.avro-source.type = avro
avro-memory-logger.sources.avro-source.bind = 192.168.145.128
avro-memory-logger.sources.avro-source.port = 44444
# Describe the sink
avro-memory-logger.sinks.logger-sink.type = logger
# Use a channel which buffers events in memory
avro-memory-logger.channels.memory-channel.type = memory
# Bind the source and sink to the channel
avro-memory-logger.sources.avro-source.channels = memory-channel
avro-memory-logger.sinks.logger-sink.channel = memory-channel