Flume Installation and Configuration


Flume 1.5.0 (Flume NG) installation package download: http://download.csdn.net/detail/vinsuan1993/9836334

Installation environment: CentOS 6.5, 64-bit

1. Goal: deploy Flume on a single machine, have it collect data, and write that data into HDFS.
2. Install Flume (Flume does not depend on the Hadoop framework itself; it needs only a JDK plus a few Hadoop jars)
2.1. Upload the Flume tarball to the CentOS machine
2.2. Unpack Flume
tar -zxvf apache-flume... -C /heres/
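Later steps refer to the Flume home as /heres/flume, so the unpacked directory was presumably renamed to match; a sketch, assuming the standard apache-flume-1.5.0-bin directory name from the binary distribution:
mv /heres/apache-flume-1.5.0-bin /heres/flume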
2.3. Configure Flume
2.3.1. Enter the conf directory and rename the environment template
mv flume-env.sh.template flume-env.sh
2.3.2. Set JAVA_HOME in flume-env.sh
JAVA_HOME=/usr/java/jdk...
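Before writing the path into flume-env.sh, it may be worth confirming where the JDK actually lives (the path below mirrors the elided one above):
ls /usr/java/
/usr/java/jdk.../bin/java -version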
2.3.3. Configure the sources, channels, and sinks, and wire them together (this example uses a spooling-directory source; for the available source types see http://blog.csdn.net/vinsuan1993/article/details/71374383)

# Define the agent name and the names of its source, channel, and sink
a4.sources = r1
a4.channels = c1
a4.sinks = k1

# Define the source: Flume reads this property and instantiates the spooldir implementation class via reflection
a4.sources.r1.type = spooldir
# Watch this directory
a4.sources.r1.spoolDir = /root/logs

# Define the channel
a4.channels.c1.type = memory
# Capacity
a4.channels.c1.capacity = 10000
a4.channels.c1.transactionCapacity = 100

# Define an interceptor that adds a timestamp to each event
# (the %Y%m%d escapes in hdfs.path below are resolved from this timestamp header)
a4.sources.r1.interceptors = i1
a4.sources.r1.interceptors.i1.type = org.apache.flume.interceptor.TimestampInterceptor$Builder

# Define the sink
a4.sinks.k1.type = hdfs
a4.sinks.k1.hdfs.path = hdfs://ns1/flume/%Y%m%d
# Prefix for the generated log files
a4.sinks.k1.hdfs.filePrefix = events-
# Plain text
a4.sinks.k1.hdfs.fileType = DataStream
# Do not roll files by event count
a4.sinks.k1.hdfs.rollCount = 0
# Roll to a new file on HDFS once the current one reaches 128 MB (134217728 bytes)
a4.sinks.k1.hdfs.rollSize = 134217728
# ... or once it has been open for 60 seconds; meeting either condition triggers the roll
a4.sinks.k1.hdfs.rollInterval = 60

# Wire the source, channel, and sink together
a4.sources.r1.channels = c1
a4.sinks.k1.channel = c1
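With these settings the sink starts a new HDFS file every 60 seconds or every 128 MB, whichever comes first, under one %Y%m%d directory per day. A sketch for inspecting the result once the agent has run (the date directory shown is purely illustrative):
hadoop fs -ls /flume/
hadoop fs -ls /flume/20170506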


2.3.4. Copy the configuration file above into Flume's conf directory
cp /root/a4.conf /heres/flume/conf
2.3.5. Copy the required Hadoop jars into Flume's lib directory
scp hadoop-common-2.2.0.jar commons-configuration-1.6.jar hadoop-auth-2.2.0.jar hadoop-hdfs-2.2.0.jar 192.168.2.113:/heres/apache-flume.../lib
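These jars all ship with Hadoop itself; assuming the standard layout of the hadoop-2.2.0 install referenced in the next step, they sit at paths like:
/heres/hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar
/heres/hadoop-2.2.0/share/hadoop/common/lib/commons-configuration-1.6.jar
/heres/hadoop-2.2.0/share/hadoop/common/lib/hadoop-auth-2.2.0.jar
/heres/hadoop-2.2.0/share/hadoop/hdfs/hadoop-hdfs-2.2.0.jar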
2.3.6. Copy the HDFS configuration files into Flume's conf directory
scp /heres/hadoop-2.2.0/etc/hadoop/{core-site.xml,hdfs-site.xml} 192.168.2.113:/heres/apache-flume.../conf
2.3.7. Configure IP-to-hostname mappings
vim /etc/hosts
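The hosts file needs entries for the NameNode hosts behind the ns1 nameservice used in hdfs://ns1 above; a hypothetical example (the hostnames and IPs are made up; use the real ones from your hdfs-site.xml):
192.168.2.111 hadoop01
192.168.2.112 hadoop02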
2.3.8. Start Flume: -n names the agent, -c points at the configuration directory, -f selects the file that defines the three components (source, channel, sink), and -Dflume.root.logger prints the log to the console
bin/flume-ng agent -n a4 -c conf -f conf/a4.conf -Dflume.root.logger=INFO,console
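Launched this way the agent logs to the current terminal and stops when the session ends; one common alternative, if the agent should keep running in the background, is to start it with nohup:
nohup bin/flume-ng agent -n a4 -c conf -f conf/a4.conf &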




3. Drop data into the spooled directory
cp access... /root/logs
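To confirm the flow end to end: the spooldir source renames every fully ingested file with a .COMPLETED suffix, and a date-named directory should appear under /flume in HDFS:
ls /root/logs
hadoop fs -ls /flume/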

4. Appendix (exec source):

# bin/flume-ng agent -n a2 -f /home/hadoop/a2.conf -c conf -Dflume.root.logger=INFO,console
# Define the agent name and the names of its source, channel, and sink
a2.sources = r1
a2.channels = c1
a2.sinks = k1

# Define the source: run a command and emit each output line as an event
a2.sources.r1.type = exec
a2.sources.r1.command = tail -F /home/hadoop/a.log

# Define the channel
a2.channels.c1.type = memory
a2.channels.c1.capacity = 1000
a2.channels.c1.transactionCapacity = 100

# Define the sink: print events to the log
a2.sinks.k1.type = logger

# Wire the source, channel, and sink together
a2.sources.r1.channels = c1
a2.sinks.k1.channel = c1
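To try this agent out, start it with the command in the first comment line above, then append lines to the tailed file from a second terminal; each line should be printed on the agent's console by the logger sink:
echo "hello flume" >> /home/hadoop/a.log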








