5. Flume: monitor a log file in real time and store the data in HDFS


1. Collect the Hive runtime logs
Command:

tail -f  /opt/cdh-5.3.6/hive-0.13.1-cdh5.3.6/logs/hive.log

2. Configuration file

# The configuration file needs to define the sources,
# the channels and the sinks.

### define agent
a2.sources = r2
a2.channels = c2
a2.sinks = k2

### define sources
a2.sources.r2.type = exec
a2.sources.r2.command = tail -f /opt/cdh-5.3.6/hive-0.13.1-cdh5.3.6/logs/hive.log
a2.sources.r2.shell = /bin/bash -c

### define channels
a2.channels.c2.type = memory
a2.channels.c2.capacity = 1000
a2.channels.c2.transactionCapacity = 100

### define sink
a2.sinks.k2.type = hdfs
a2.sinks.k2.hdfs.path = hdfs://hadoop-CDH:8020/user/beifeng/flume/hive-logs/
# use the nameservice URI instead when HDFS HA is configured:
#a2.sinks.k2.hdfs.path = hdfs://ns1/user/flume/hive-logs/
a2.sinks.k2.hdfs.fileType = DataStream
a2.sinks.k2.hdfs.writeFormat = Text
a2.sinks.k2.hdfs.batchSize = 10

### bind the source and sink to the channel
a2.sources.r2.channels = c2
a2.sinks.k2.channel = c2
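The sink above relies on the HDFS sink's default rolling behavior (a new file every 30 seconds, every 1 KB, or every 10 events, whichever comes first), which tends to produce many small files on HDFS. If that matters, the standard roll properties can be tuned; the values below are illustrative, not from the original article:

```properties
### optional: control how often the HDFS sink rolls a new file
a2.sinks.k2.hdfs.rollInterval = 600        # roll every 600 seconds
a2.sinks.k2.hdfs.rollSize = 134217728      # or once the file reaches 128 MB
a2.sinks.k2.hdfs.rollCount = 0             # 0 disables event-count-based rolling
```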

Note: when HDFS is configured for high availability (HA), Hadoop's configuration files core-site.xml and hdfs-site.xml must be copied into Flume's conf directory so the sink can resolve the nameservice URI.
3. Run the agent

bin/flume-ng agent \
-c conf \
-n a2 \
-f conf/flume-tail.conf \
-Dflume.root.logger=DEBUG,console

A successful run produces log output like the following:

2017-05-03 09:37:54,283 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.hdfs.BucketWriter.open(BucketWriter.java:261)] Creating hdfs://hadoop-CDH:8020/user/beifeng/flume/hive-logs//FlumeData.1493818674204.tmp

The .tmp suffix is the default marker for a file that Flume is still writing; the file is renamed when it is closed, and the suffix can be changed.
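For example, the in-progress suffix and the output file name can be overridden with the HDFS sink's naming properties (the values here are illustrative):

```properties
### optional: customize output file naming
a2.sinks.k2.hdfs.filePrefix = hive-log     # default is FlumeData
a2.sinks.k2.hdfs.inUseSuffix = .writing    # default is .tmp while the file is open
a2.sinks.k2.hdfs.fileSuffix = .log         # appended once the file is closed
```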
