Flume: one source feeding multiple channels and multiple sinks
Source: Internet  Editor: 程序博客网  Time: 2024/06/05 20:58
Original link: http://www.tuicool.com/articles/Z73UZf6
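Data collected on Hadoop2 and Hadoop3 is sent to Hadoop1, and Hadoop1 then forwards it to several different destinations.
I. Overview
1. There are three machines: Hadoop1, Hadoop2 and Hadoop3. Hadoop1 is the log consolidation node.
2. While consolidating, Hadoop1 also writes the events to multiple destinations.
3. The mapping of one Flume source to multiple channels and multiple sinks is configured in the file consolidation-accepter.conf.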
II. Deploying Flume to collect and consolidate logs
1. Run the following on Hadoop1:
flume-ng agent --conf ./ -f consolidation-accepter.conf -n agent1 -Dflume.root.logger=INFO,console
The configuration file (consolidation-accepter.conf) is as follows:
# Finally, now that we've defined all of our components, tell
# agent1 which ones we want to activate.
agent1.channels = ch1 ch2
agent1.sources = source1
agent1.sinks = hdfssink1 sink2

# Copy every event from source1 into all of its channels.
# (The key must start with agent1.sources, not agent1.source.)
agent1.sources.source1.selector.type = replicating

# Define two memory channels, ch1 and ch2, on agent1.
agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 1000000
agent1.channels.ch1.transactionCapacity = 1000000
agent1.channels.ch1.keep-alive = 10

agent1.channels.ch2.type = memory
agent1.channels.ch2.capacity = 1000000
agent1.channels.ch2.transactionCapacity = 100000
agent1.channels.ch2.keep-alive = 10

# Define an Avro source called source1 on agent1, listening on port 44444,
# and connect it to both channels.
# The bind value "con" is kept as published; replace it with the address
# or hostname this agent should listen on.
agent1.sources.source1.channels = ch1 ch2
agent1.sources.source1.type = avro
agent1.sources.source1.bind = con
agent1.sources.source1.port = 44444
agent1.sources.source1.threads = 5

# Define an HDFS sink called hdfssink1 that drains ch1.
agent1.sinks.hdfssink1.channel = ch1
agent1.sinks.hdfssink1.type = hdfs
agent1.sinks.hdfssink1.hdfs.path = hdfs://mycluster/flume/%Y-%m-%d/%H%M
agent1.sinks.hdfssink1.hdfs.filePrefix = S1PA124-consolidation-accesslog-%H-%M-%S
agent1.sinks.hdfssink1.hdfs.useLocalTimeStamp = true
agent1.sinks.hdfssink1.hdfs.writeFormat = Text
agent1.sinks.hdfssink1.hdfs.fileType = DataStream
agent1.sinks.hdfssink1.hdfs.rollInterval = 1800
agent1.sinks.hdfssink1.hdfs.rollSize = 5073741824
agent1.sinks.hdfssink1.hdfs.batchSize = 10000
agent1.sinks.hdfssink1.hdfs.rollCount = 0
agent1.sinks.hdfssink1.hdfs.round = true
agent1.sinks.hdfssink1.hdfs.roundValue = 60
agent1.sinks.hdfssink1.hdfs.roundUnit = minute

# Define a logger sink called sink2 that drains ch2.
# The sink.* properties below look like file_roll sink settings;
# a logger sink does not use them. They are kept as published.
agent1.sinks.sink2.type = logger
agent1.sinks.sink2.sink.batchSize = 10000
agent1.sinks.sink2.sink.batchTimeout = 600000
agent1.sinks.sink2.sink.rollInterval = 1000
agent1.sinks.sink2.sink.directory = /root/data/flume-logs/
agent1.sinks.sink2.sink.fileName = accesslog
agent1.sinks.sink2.channel = ch2
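Because the fan-out depends on each sink being bound to its own channel, it can help to list those bindings before starting the agent. The following is only a rough sketch, not part of Flume: the function name list_sink_channels is made up for illustration, and it assumes the configuration file has one property per line. The demo runs against a small scratch file that repeats the sink lines from the configuration above.

```shell
# Print "sink -> channel" for every sink of AGENT found in a Flume
# properties file. list_sink_channels is a hypothetical helper.
list_sink_channels() {
  conf=$1
  agent=$2
  awk -v agent="$agent" '
    # Skip comment lines and blank lines.
    /^[ \t]*(#|$)/ { next }
    {
      # Split "key = value" at the first "=" and trim both sides.
      eq = index($0, "=")
      if (eq == 0) next
      key = substr($0, 1, eq - 1)
      val = substr($0, eq + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", key)
      gsub(/^[ \t]+|[ \t]+$/, "", val)
      n = split(key, part, ".")
      # Keep only keys of the form <agent>.sinks.<name>.channel
      if (n == 4 && part[1] == agent && part[2] == "sinks" && part[4] == "channel")
        print part[3] " -> " val
    }
  ' "$conf"
}

# Demo on a scratch copy of the sink lines from consolidation-accepter.conf.
demo_conf=$(mktemp)
cat > "$demo_conf" <<'EOF'
agent1.sinks = hdfssink1 sink2
agent1.sinks.hdfssink1.channel = ch1
agent1.sinks.hdfssink1.type = hdfs
agent1.sinks.sink2.type = logger
agent1.sinks.sink2.channel = ch2
EOF
list_sink_channels "$demo_conf" agent1
```

If two sinks show the same channel, they would share the events of that channel instead of each receiving a full copy, which is not the layout this article describes.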
2. Run the following command on Hadoop2 and on Hadoop3:
flume-ng agent --conf ./ --conf-file collect-send.conf --name agent2
The Flume sender configuration file collect-send.conf is as follows:
agent2.sources = source2
agent2.sinks = sink1
agent2.channels = ch2

agent2.sources.source2.type = exec
agent2.sources.source2.command = tail -F /root/data/flume.log
agent2.sources.source2.channels = ch2

#channels configuration
agent2.channels.ch2.type = memory
agent2.channels.ch2.capacity = 10000
agent2.channels.ch2.transactionCapacity = 10000
agent2.channels.ch2.keep-alive = 3

#sinks configuration
agent2.sinks.sink1.type = avro
agent2.sinks.sink1.hostname = consolidationIpAddress
agent2.sinks.sink1.port = 44444
agent2.sinks.sink1.channel = ch2
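Since the exec source simply follows a file with tail -F, one way to exercise the pipeline is to append a few recognizable lines to that file and look for them on the Hadoop1 side. The sketch below is an illustration only: emit_test_events is a hypothetical helper name, and the demo writes to a scratch file rather than to /root/data/flume.log, the path named in collect-send.conf.

```shell
# Append COUNT numbered test lines to FILE so that a running
# `tail -F FILE` picks them up. emit_test_events is a hypothetical helper.
emit_test_events() {
  file=$1
  count=$2
  i=1
  while [ "$i" -le "$count" ]; do
    printf 'test event %d\n' "$i" >> "$file"
    i=$(( i + 1 ))
  done
}

# Demo against a scratch file; on Hadoop2 or Hadoop3 the target would be
# the file that the exec source tails.
demo_log=$(mktemp)
emit_test_events "$demo_log" 3
cat "$demo_log"
```

With both agents running, the same lines should then show up in the console log of agent1 through the logger sink and, after the next roll, in a file under the configured HDFS path.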
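1. Start the Flume consolidation process:
flume-ng agent --conf ./ -f consolidation-accepter.conf -n agent1 -Dflume.root.logger=INFO,console
2. Start the Flume collection process:
flume-ng agent --conf ./ --conf-file collect-send.conf --name agent2
3. Notes on the configuration parameters (the two conditions below are OR-ed, so a new file is started as soon as either one is met):
(1) Every half hour (1800 seconds) the HDFS sink closes the file it is writing and starts a new one:
agent1.sinks.hdfssink1.hdfs.rollInterval = 1800
(2) When the current file reaches 5073741824 bytes (about 4.7 GiB), the HDFS sink starts a new one:
agent1.sinks.hdfssink1.hdfs.rollSize = 5073741824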
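To make the two roll thresholds easier to read, the raw values can be converted to minutes and MiB. This is a small arithmetic sketch using the values from the hdfssink1 settings; the variable names are chosen here for illustration.

```shell
# Values copied from the hdfssink1 settings in consolidation-accepter.conf.
roll_interval_s=1800
roll_size_bytes=5073741824

# Integer conversions using shell arithmetic.
roll_interval_min=$(( roll_interval_s / 60 ))
roll_size_mib=$(( roll_size_bytes / 1024 / 1024 ))

echo "rollInterval: ${roll_interval_min} minutes"
echo "rollSize: ${roll_size_mib} MiB"
# prints:
# rollInterval: 30 minutes
# rollSize: 4838 MiB
```

The size threshold is therefore a little under 5 GiB (which would be 5120 MiB). Because hdfs.rollCount is set to 0 in the configuration, the number of events does not trigger a roll.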
Installation reference: http://blog.csdn.net/panguoyuan/article/details/39555239
User guide reference: http://flume.apache.org/FlumeUserGuide.html