Configuring Flume on a Hadoop cluster
Source: Internet · 程序博客网 · 2024/05/17
1. Download Flume from the official site. We use the latest Apache Flume binary (tar.gz), version 1.8.0.
Download link: http://www.apache.org/dyn/closer.lua/flume/1.8.0/apache-flume-1.8.0-bin.tar.gz
2. Extract it on the Hadoop node:
tar -zxvf apache-flume-1.8.0-bin.tar.gz
Optionally, rename the extracted directory to something shorter (note that the target of mv is the extracted directory, not the tarball):
mv apache-flume-1.8.0-bin flume
3. Optionally, add Flume's bin directory to PATH in /etc/profile (not required, but it shortens the commands). The ZooKeeper entries below were already in this profile; the Flume-specific part is the appended /home/flume/bin:
export ZOOKEEPER_HOME=/home/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin:/home/flume/bin
Then run source /etc/profile so the change takes effect; flume-ng version should now work from anywhere.
4. Create a configuration file, e.g. example.conf:
# example.conf: A single-node Flume configuration

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = node1
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
Note: set a1.sources.r1.bind to your own host address (node1 in this example).
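If you prefer to create the file from the shell, a heredoc works; the file name and location are just conventions, and the grep at the end is a quick sanity check:

```shell
# Write the single-node netcat -> logger configuration to example.conf.
cat > example.conf <<'EOF'
# example.conf: A single-node Flume configuration
a1.sources = r1
a1.sinks = k1
a1.channels = c1

a1.sources.r1.type = netcat
a1.sources.r1.bind = node1
a1.sources.r1.port = 44444

a1.sinks.k1.type = logger

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
EOF

# Sanity check: count the agent property lines.
grep -c '^a1\.' example.conf   # prints 12
```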
5. Run the agent:
$ bin/flume-ng agent --conf conf --conf-file example.conf --name a1 -Dflume.root.logger=INFO,console
Note: since the source is netcat, it listens on TCP, so the input has to arrive over TCP. We can connect with telnet 192.168.56.101 44444 and type a line to check that events are being generated.
Because the sink is a logger (a1.sinks.k1.type = logger), each event is printed in the agent's console output on node1.
6. The netcat source above receives logs passively; we can also use a source that picks logs up actively:
# example.conf: A single-node Flume configuration
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source: ingest any file that appears in spoolDir
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /opt/flume
# Describe the sink: write events to HDFS
a1.sinks.k1.type = hdfs
# Time escapes in the path create one directory per 5-minute bucket (round* below)
a1.sinks.k1.hdfs.path = hdfs://192.168.56.101:8020/flume/%Y-%m-%d/%H%M
# Never roll on event count; roll every 60 s or at 10 KiB, whichever comes first
a1.sinks.k1.hdfs.rollCount = 0
a1.sinks.k1.hdfs.rollInterval = 60
a1.sinks.k1.hdfs.rollSize = 10240
# Close files that have been idle for 3 s
a1.sinks.k1.hdfs.idleTimeout = 3
# Write plain text rather than SequenceFiles
a1.sinks.k1.hdfs.fileType = DataStream
# Take timestamps from the agent's clock (needed for the time escapes above)
a1.sinks.k1.hdfs.useLocalTimeStamp = true
# Round timestamps down to the nearest 5 minutes
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 5
a1.sinks.k1.hdfs.roundUnit = minute
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
7. The spooldir source is an active ingestion mode: start the agent with this configuration (the file name spool.conf here is just an example):
bin/flume-ng agent --conf conf --conf-file spool.conf --name a1 -Dflume.root.logger=INFO,console
Then copy a file into /opt/flume; shortly afterwards the corresponding log files appear on the Hadoop cluster under /flume.
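The hand-off can be sketched as follows; /opt/flume matches spoolDir in the config above, everything else (file names, contents) is an example:

```shell
# Feed the spooling-directory source with a test file.
SPOOL_DIR=${SPOOL_DIR:-/opt/flume}
mkdir -p "$SPOOL_DIR"

# Use plain ASCII/UTF-8 content; bytes the source cannot decode crash it (see the error below).
printf 'first event\nsecond event\n' > /tmp/events.log
mv /tmp/events.log "$SPOOL_DIR/events.log"

# Once Flume has ingested a file it renames it to events.log.COMPLETED.
ls "$SPOOL_DIR"

# With the hadoop client on PATH, the rolled files appear under /flume:
# hdfs dfs -ls -R /flume
```

Write files elsewhere and mv them in, as above: mv is atomic on the same filesystem, so Flume never sees a half-written file.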
If you run into this error:
2017-11-22 21:22:26,047 (pool-3-thread-1) [ERROR - org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:280)] FATAL: Spool Directory source r1: { spoolDir: /opt/flume }: Uncaught exception in SpoolDirectorySource thread. Restart or reconfigure Flume to continue processing.
java.nio.charset.MalformedInputException: Input length = 1
at java.nio.charset.CoderResult.throwException(CoderResult.java:281)
at org.apache.flume.serialization.ResettableFileInputStream.readChar(ResettableFileInputStream.java:283)
at org.apache.flume.serialization.LineDeserializer.readLine(LineDeserializer.java:132)
at org.apache.flume.serialization.LineDeserializer.readEvent(LineDeserializer.java:70)
at org.apache.flume.serialization.LineDeserializer.readEvents(LineDeserializer.java:89)
at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readDeserializerEvents(ReliableSpoolingFileEventReader.java:343)
at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:318)
at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:250)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
it is because the file you dropped in contains bytes that are invalid in the source's input charset. Remove or re-encode the offending characters and the error goes away. With that, the single-node Flume setup is complete.
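Alternatively, instead of cleaning up the input files, the spooling-directory source can be told how to handle undecodable bytes (these properties exist since Flume 1.4; REPLACE substitutes U+FFFD, IGNORE drops the bytes, FAIL is the default behaviour shown in the stack trace):

```
a1.sources.r1.inputCharset = UTF-8
a1.sources.r1.decodeErrorPolicy = REPLACE
```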