Integrating log4j + Flume + Kafka + Storm
(Integration flow diagram; image not included.)
Step 1: Integrating log4j with Flume (hostname hadoop0)
(1) Install Flume under /usr/softinstall/.
(2) Create the file lfks.conf in /usr/softinstall/flume/conf:
a1.channels = c1
a1.sources = s1
a1.sinks = k1
# Define the channel
a1.channels.c1.type = memory
# Define the source
a1.sources.s1.channels = c1
a1.sources.s1.type = avro
# Accept logs from any client IP.
a1.sources.s1.bind = 0.0.0.0
# Port the source listens on.
a1.sources.s1.port = 41414
# Define the sink
a1.sinks.k1.channel = c1
a1.sinks.k1.type = logger
(3) Create a Maven project and add the required dependencies to pom.xml:
<dependencies>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>3.8.1</version>
    <scope>test</scope>
  </dependency>
  <dependency>
    <groupId>log4j</groupId>
    <artifactId>log4j</artifactId>
    <version>1.2.17</version>
  </dependency>
  <dependency>
    <groupId>org.apache.flume.flume-ng-clients</groupId>
    <artifactId>flume-ng-log4jappender</artifactId>
    <version>1.5.2</version>
  </dependency>
  <dependency>
    <groupId>org.apache.flume</groupId>
    <artifactId>flume-ng-core</artifactId>
    <version>1.5.2</version>
  </dependency>
  <dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka_2.10</artifactId>
    <version>0.8.2.1</version>
  </dependency>
  <dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>storm-core</artifactId>
    <version>0.9.3</version>
  </dependency>
  <dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>storm-kafka</artifactId>
    <version>0.9.3</version>
  </dependency>
</dependencies>
Then create a class that generates log messages:

package com.east.lfks;

import org.apache.log4j.Logger;

public class LogsFlume {
    private static Logger logger = Logger.getLogger(LogsFlume.class);

    public static void main(String[] args) {
        while (true) {
            // Log a debug-level message
            logger.debug("This is debug message.");
            // Log an info-level message
            logger.info("my name is liu xiang ke");
            // Log an error-level message
            logger.error("my work is it i love my work.");
            try {
                Thread.sleep(6000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}
(4) Configure log4j.properties on the project's classpath so the log4j appender ships events to the Flume avro source:

log4j.rootLogger=INFO,flume
log4j.appender.flume = org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname = hadoop0
log4j.appender.flume.Port = 41414
log4j.appender.flume.UnsafeMode = true
(5) Start the Flume agent:

bin/flume-ng agent -c conf -f conf/lfks.conf -n a1 -Dflume.root.logger=INFO,console
(6) Run the LogsFlume class. Sample output from the Flume agent:

15/12/17 22:19:29 INFO sink.LoggerSink: Event: { headers:{flume.client.log4j.log.level=40000, flume.client.log4j.message.encoding=UTF8, flume.client.log4j.logger.name=com.east.testlfks.TestLogsFlume, flume.client.log4j.timestamp=1450419569909} body: 6D 79 20 77 6F 72 6B 20 69 73 20 69 74 20 20 69 my work is it i }
15/12/17 22:19:35 INFO sink.LoggerSink: Event: { headers:{flume.client.log4j.log.level=20000, flume.client.log4j.message.encoding=UTF8, flume.client.log4j.logger.name=com.east.testlfks.TestLogsFlume, flume.client.log4j.timestamp=1450419575911} body: 6D 79 20 6E 61 6D 65 20 69 73 20 6C 69 75 20 78 my name is liu x }
15/12/17 22:19:35 INFO sink.LoggerSink: Event: { headers:{flume.client.log4j.log.level=40000, flume.client.log4j.message.encoding=UTF8, flume.client.log4j.logger.name=com.east.testlfks.TestLogsFlume, flume.client.log4j.timestamp=1450419575925} body: 6D 79 20 77 6F 72 6B 20 69 73 20 69 74 20 20 69 my work is it i }
15/12/17 22:19:41 INFO sink.LoggerSink: Event: { headers:{flume.client.log4j.log.level=20000, flume.client.log4j.message.encoding=UTF8, flume.client.log4j.logger.name=com.east.testlfks.TestLogsFlume, flume.client.log4j.timestamp=1450419581929} body: 6D 79 20 6E 61 6D 65 20 69 73 20 6C 69 75 20 78 my name is liu x }
15/12/17 22:19:41 INFO sink.LoggerSink: Event: { headers:{flume.client.log4j.log.level=40000, flume.client.log4j.message.encoding=UTF8, flume.client.log4j.logger.name=com.east.testlfks.TestLogsFlume, flume.client.log4j.timestamp=1450419581934} body: 6D 79 20 77 6F 72 6B 20 69 73 20 69 74 20 20 69 my work is it i }
15/12/17 22:19:47 INFO sink.LoggerSink: Event: { headers:{flume.client.log4j.log.level=20000, flume.client.log4j.message.encoding=UTF8, flume.client.log4j.logger.name=com.east.testlfks.TestLogsFlume, flume.client.log4j.timestamp=1450419587940} body: 6D 79 20 6E 61 6D 65 20 69 73 20 6C 69 75 20 78 my name is liu x }
15/12/17 22:19:47 INFO sink.LoggerSink: Event: { headers:{flume.client.log4j.log.level=40000, flume.client.log4j.message.encoding=UTF8, flume.client.log4j.logger.name=com.east.testlfks.TestLogsFlume, flume.client.log4j.timestamp=1450419587943} body: 6D 79 20 77 6F 72 6B 20 69 73 20 69 74 20 20 69 my work is it i }
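The logger sink prints each event body as hex bytes plus a truncated text preview, and the flume.client.log4j.log.level header carries log4j's integer level codes (20000 = INFO, 40000 = ERROR). As a quick sanity check, the hex body can be decoded back to text with a small helper (illustrative names, plain Java, not part of Flume):

```java
public class EventBodyDecoder {
    // Decode a space-separated hex string, as printed by Flume's logger sink,
    // back into the original text.
    static String decodeHex(String hex) {
        StringBuilder sb = new StringBuilder();
        for (String b : hex.trim().split("\\s+")) {
            sb.append((char) Integer.parseInt(b, 16));
        }
        return sb.toString();
    }

    // Map log4j 1.x integer level codes (from the flume.client.log4j.log.level
    // header) to level names.
    static String levelName(int code) {
        switch (code) {
            case 10000: return "DEBUG";
            case 20000: return "INFO";
            case 30000: return "WARN";
            case 40000: return "ERROR";
            default:    return "UNKNOWN";
        }
    }

    public static void main(String[] args) {
        // The body of the second event above
        System.out.println(decodeHex("6D 79 20 6E 61 6D 65 20 69 73 20 6C 69 75 20 78"));
        System.out.println(levelName(20000));
    }
}
```

Note the header level 20000 on the "my name is liu x" events matches the logger.info() call, and 40000 matches logger.error().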
log4j and Flume are now integrated successfully.

Step 2: Integrating Flume with Kafka
(1) I am using Flume 1.6.0. Starting with version 1.6, the Flume sink can deliver messages to Kafka through configuration alone; there is no need to write a custom KafkaSink.
[root@hadoop0 bin]# flume-ng version
Flume 1.6.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: 2561a23240a71ba20bf288c7c2cda88f443c2080
Compiled by hshreedharan on Mon May 11 11:15:44 PDT 2015
From source with checksum b29e416802ce9ece3269d34233baf43f
(2) Update lfks.conf so the sink writes to Kafka:

a1.channels = c1
a1.sources = s1
a1.sinks = k1
# Define the channel
a1.channels.c1.type = memory
# Define the source
a1.sources.s1.channels = c1
a1.sources.s1.type = avro
a1.sources.s1.bind = 0.0.0.0
a1.sources.s1.port = 41414
# Define the sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic = mytest
a1.sinks.k1.brokerList = hadoop0:9092,hadoop1:9092,hadoop2:9092
a1.sinks.k1.requiredAcks = 1
a1.sinks.k1.batchSize = 20
a1.sinks.k1.channel = c1
(3) Install the Kafka cluster. Start the ZooKeeper cluster first, then start Kafka, and confirm the cluster is healthy. On each node (hadoop0, hadoop1, hadoop2) start the broker:
bin/kafka-server-start.sh config/server.properties &
(4) Create the topic:

bin/kafka-topics.sh --create --zookeeper hadoop0:2181,hadoop1:2181,hadoop2:2181 --replication-factor 3 --partitions 6 --topic mytest

The replication factor cannot exceed the number of brokers.
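The reason for the constraint is that each replica of a partition must live on a distinct broker. As a sketch, the pre-flight check amounts to (hypothetical helper, not part of Kafka's CLI):

```java
import java.util.Arrays;
import java.util.List;

public class TopicCheck {
    // A topic's replication factor must not exceed the number of brokers,
    // because each replica of a partition is placed on a different broker.
    static boolean isValidReplicationFactor(int replicationFactor, List<String> brokers) {
        return replicationFactor >= 1 && replicationFactor <= brokers.size();
    }

    public static void main(String[] args) {
        List<String> brokers = Arrays.asList("hadoop0:9092", "hadoop1:9092", "hadoop2:9092");
        System.out.println(isValidReplicationFactor(3, brokers)); // factor 3 on 3 brokers is allowed
        System.out.println(isValidReplicationFactor(4, brokers)); // factor 4 would be rejected
    }
}
```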
(5) List the topics with the following command:
[root@hadoop1 kafka]# bin/kafka-topics.sh --list --zookeeper hadoop0:2181,hadoop1:2181,hadoop2:2181
mytest
(6) Show the topic details with:
bin/kafka-topics.sh --describe --zookeeper hadoop0:2181,hadoop1:2181,hadoop2:2181 --topic mytest
Topic:mytest PartitionCount:6 ReplicationFactor:3 Configs:
Topic: mytest Partition: 0 Leader: 0 Replicas: 0,1,2 Isr: 0,1,2
Topic: mytest Partition: 1 Leader: 1 Replicas: 1,2,0 Isr: 1,2,0
Topic: mytest Partition: 2 Leader: 2 Replicas: 2,0,1 Isr: 2,0,1
Topic: mytest Partition: 3 Leader: 0 Replicas: 0,2,1 Isr: 0,2,1
Topic: mytest Partition: 4 Leader: 1 Replicas: 1,0,2 Isr: 1,0,2
Topic: mytest Partition: 5 Leader: 2 Replicas: 2,1,0 Isr: 2,1,0
A note on this output: the first line is a summary across all partitions, and each subsequent line describes one partition, so with six partitions there are six detail lines.
- leader: the broker responsible for all reads and writes for that partition.
- replicas: every broker holding a copy of the partition, whether or not it is currently alive.
- isr: the in-sync replicas, i.e. the subset of replicas that are alive and caught up with the leader.
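One invariant worth eyeballing in the describe output is that every broker listed under Isr also appears under Replicas. A minimal illustrative parse of those two fields (hypothetical helper):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class DescribeCheck {
    // Given the comma-separated "Replicas" and "Isr" fields from
    // kafka-topics.sh --describe, verify that the in-sync replica
    // set is a subset of the full replica set.
    static boolean isrWithinReplicas(String replicas, String isr) {
        Set<String> replicaSet = new HashSet<>(Arrays.asList(replicas.split(",")));
        return replicaSet.containsAll(Arrays.asList(isr.split(",")));
    }

    public static void main(String[] args) {
        // Partition 0 from the output above: Replicas: 0,1,2  Isr: 0,1,2
        System.out.println(isrWithinReplicas("0,1,2", "0,1,2"));
    }
}
```

If a broker drops out of the ISR while staying in the replica list, that partition is under-replicated rather than misconfigured.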
(7) Produce and consume messages.
Start a console producer on hadoop0 and send a couple of messages (any of the three machines would do):
bin/kafka-console-producer.sh --broker-list hadoop1:9092 --topic mytest
this is a message
this is the second message
Start a console consumer on hadoop2 (the producer's messages appear in this terminal):
[root@hadoop2 kafka]# bin/kafka-console-consumer.sh --zookeeper hadoop1:2181 --topic mytest --from-beginning
this is a message
this is the second message
The Kafka cluster is installed and running correctly.
(8) With both Flume and Kafka running, run the LogsFlume class to generate logs, then check whether the consumer window on hadoop2 receives the log messages.
The messages arrive; success!
Step 3: Integrating Kafka with Storm

First, the Storm dependencies must be declared in pom.xml (the pom.xml above already includes all of them).
(1) Create the Logprocess class:
package com.east.testlfks;

import java.util.UUID;

import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.StormSubmitter;
import backtype.storm.generated.AlreadyAliveException;
import backtype.storm.generated.InvalidTopologyException;
import backtype.storm.topology.TopologyBuilder;
import storm.kafka.BrokerHosts;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.ZkHosts;

public class Logprocess {
    public static void main(String[] args) throws AlreadyAliveException, InvalidTopologyException {
        TopologyBuilder topologyBuilder = new TopologyBuilder();
        // The ZooKeeper ensemble used by Kafka
        BrokerHosts hosts = new ZkHosts("hadoop0:2181,hadoop1:2181,hadoop2:2181");
        // The Kafka topic the spout reads from
        String topic = "mytest";
        // A ZooKeeper root path where the KafkaSpout stores its read offsets
        String zkRoot = "/kafkaspout";
        String id = UUID.randomUUID().toString();
        SpoutConfig spoutConf = new SpoutConfig(hosts, topic, zkRoot, id);

        String SPOUT_ID = KafkaSpout.class.getSimpleName();
        String BOLT_ID = LogFilterBolt.class.getSimpleName();
        topologyBuilder.setSpout(SPOUT_ID, new KafkaSpout(spoutConf));
        topologyBuilder.setBolt(BOLT_ID, new LogFilterBolt()).shuffleGrouping(SPOUT_ID);

        if (args != null && args.length > 0) {
            // With arguments, submit the topology to a real Storm cluster
            StormSubmitter.submitTopology(Logprocess.class.getSimpleName(), new Config(),
                    topologyBuilder.createTopology());
        } else {
            // Otherwise run it in an in-process local cluster for testing
            LocalCluster localCluster = new LocalCluster();
            localCluster.submitTopology(Logprocess.class.getSimpleName(), new Config(),
                    topologyBuilder.createTopology());
        }
    }
}
(2) Create the LogFilterBolt class:
package com.east.testlfks;

import java.util.Map;

import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichBolt;
import backtype.storm.tuple.Tuple;

public class LogFilterBolt extends BaseRichBolt {
    private OutputCollector collector;

    public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }

    public void execute(Tuple input) {
        try {
            // The KafkaSpout emits each Kafka message as a raw byte[] in the "bytes" field
            byte[] binaryByField = input.getBinaryByField("bytes");
            String value = new String(binaryByField);
            System.out.println(value);
            this.collector.ack(input);
        } catch (Exception e) {
            this.collector.fail(input);
        }
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // Terminal bolt: nothing is emitted downstream
    }
}
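Since the bolt's only real work is turning the tuple's raw bytes back into a string, that step can be exercised without a Storm cluster by isolating it in a plain helper (illustrative sketch mirroring the execute() body above; the name is hypothetical):

```java
import java.nio.charset.StandardCharsets;

public class LogBytesDecoder {
    // Mirrors what LogFilterBolt.execute() does with the "bytes" field:
    // the Kafka message payload arrives as a raw byte[] and is decoded
    // back into the original log line. Using an explicit charset avoids
    // depending on the JVM's platform default.
    static String decode(byte[] payload) {
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] payload = "my name is liu xiang ke".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(payload));
    }
}
```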
Run the log-generating LogsFlume class first, then run Logprocess; the log messages appear on the console:
Task [1/1] Refreshing partition manager connections
Read partition info from zookeeper: GlobalPartitionInformation{partitionMap={0=hadoop1:9092, 1=hadoop2:9092, 2=hadoop0:9092, 3=hadoop1:9092, 4=hadoop2:9092, 5=hadoop0:9092}}
Task [1/1] assigned [Partition{host=hadoop1:9092, partition=0}, Partition{host=hadoop2:9092, partition=1}, Partition{host=hadoop0:9092, partition=2}, Partition{host=hadoop1:9092, partition=3}, Partition{host=hadoop2:9092, partition=4}, Partition{host=hadoop0:9092, partition=5}]
Task [1/1] Deleted partition managers: []
Task [1/1] New partition managers: []
Task [1/1] Finished refreshing
my name is liu xiang ke
my work is it i love my work.
my name is liu xiang ke
my work is it i love my work.
my name is liu xiang ke
my work is it i love my work.
my name is liu xiang ke
my work is it i love my work.
my name is liu xiang ke
my work is it i love my work.
my name is liu xiang ke
my work is it i love my work.
The log4j + Flume + Kafka + Storm integration is complete.
Reference: http://blog.csdn.net/jianghuxiaojin/article/details/51725347