Integrating Kafka with Spring in Practice
Source: Internet. Editor: 程序博客网. Time: 2024/04/28 15:49
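- Installing ZooKeeper
- Installing Kafka
- Creating a Spring project
- Sending messages to Kafka with the Producer API
- Receiving messages with the Kafka High Level API
- Sending messages with spring-integration-kafka
- Receiving messages with spring-integration-kafka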
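This article uses a single-machine environment to demonstrate how to integrate Kafka with Spring.
A single-machine environment is the easiest to set up. It runs on your own PC, needs little hardware, and is convenient for learning. Besides, the goal here is not to build a ZooKeeper cluster but to focus on using Kafka with Spring.
The software environment is as follows:
OS: CentOS 6.4
ZooKeeper: zookeeper-3.4.6
Kafka: kafka_2.9.1-0.8.2-beta
Java: JDK 1.7.0_45-b18
Spring: 4.0.6
The example runs correctly in this environment, and the full code can be downloaded from GitHub.
All operations in this article are performed as the root OS user. In a real product, security standards may require dedicated users such as zookeeper and kafka.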
Installing ZooKeeper
First download and extract ZooKeeper, choosing a suitable mirror site to speed up the download.
We can register ZooKeeper as a system service by adding an /etc/init.d/zookeeper file.
cd /opt
wget http://apache.fayea.com/apache-mirror/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
tar zxvf zookeeper-3.4.6.tar.gz
vi /etc/init.d/zookeeper
Copy the contents of https://raw.githubusercontent.com/apache/zookeeper/trunk/src/packages/rpm/init.d/zookeeper into this file, then change the user that runs ZooKeeper and the location of the ZooKeeper directory.
......
start() {
  echo -n $"Starting $desc (zookeeper): "
  daemon --user root /opt/zookeeper-3.4.6/bin/zkServer.sh start
  RETVAL=$?
  echo
  [ $RETVAL -eq 0 ] && touch /var/lock/subsys/zookeeper
  return $RETVAL
}
stop() {
  echo -n $"Stopping $desc (zookeeper): "
  daemon --user root /opt/zookeeper-3.4.6/bin/zkServer.sh stop
  RETVAL=$?
  sleep 5
  echo
  [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/zookeeper $PIDFILE
}
......
chmod 755 /etc/init.d/zookeeper
service zookeeper start
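If you do not want to register a service, you can also run ZooKeeper directly. Either way, ZooKeeper reads conf/zoo.cfg by default; the distribution ships conf/zoo_sample.cfg, which can be copied to conf/zoo.cfg for a standalone setup.
/opt/zookeeper-3.4.6/bin/zkServer.sh start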
Installing Kafka
Download the latest Kafka from a suitable mirror site and extract it.
wget http://apache.01link.hk/kafka/0.8.2-beta/kafka_2.9.1-0.8.2-beta.tgz
tar zxvf kafka_2.9.1-0.8.2-beta.tgz
cd kafka_2.9.1-0.8.2-beta
Start Kafka:
bin/kafka-server-start.sh config/server.properties
Create a topic named test:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
You can try things out by starting a producer and a consumer with Kafka's own commands:
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
This is a message
This is another message
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
This is a message
This is another message
For more details, see the Kafka quick start guide that I translated and organized.
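Creating a Spring project
With the environment prepared, let's start creating a project.
I previously wrote a short introduction: Integrating Spring with Kafka.
I will not introduce the official spring-integration-kafka framework here; we mainly use it to do the integration.
First, let's look at examples that send and receive messages with Kafka's own Producer/Consumer API.
Sending messages to Kafka with the Producer API
Here is an example that sends messages with Kafka's own producer API: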
import java.util.Properties;
import java.util.Random;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class NativeProducer {
    public static void main(String[] args) {
        String topic = "test";
        long events = 100;
        Random rand = new Random();

        // Broker address, message serializer, and acknowledgement level.
        Properties props = new Properties();
        props.put("metadata.broker.list", "localhost:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("request.required.acks", "1");
        ProducerConfig config = new ProducerConfig(props);

        Producer<String, String> producer = new Producer<String, String>(config);
        for (long nEvents = 0; nEvents < events; nEvents++) {
            String msg = "NativeMessage-" + rand.nextInt();
            // The loop counter is used as the message key.
            KeyedMessage<String, String> data = new KeyedMessage<String, String>(topic, nEvents + "", msg);
            producer.send(data);
        }
        producer.close();
    }
}
This example first initializes the Producer object with the broker and serializer to use, then sends 100 string messages to Kafka.
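The broker, serializer, and acknowledgement settings can be collected in one place so they are easy to change. The following is only a minimal sketch: ProducerProps and forBroker are hypothetical names that are not part of the original demo project, and the class only builds the java.util.Properties object that would be handed to ProducerConfig.

```java
import java.util.Properties;

// Hypothetical helper, not part of the original demo project.
final class ProducerProps {
    private ProducerProps() {}

    /** Builds the same three settings that NativeProducer passes to ProducerConfig. */
    static Properties forBroker(String brokerList) {
        Properties props = new Properties();
        props.put("metadata.broker.list", brokerList);
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("request.required.acks", "1");
        return props;
    }

    public static void main(String[] args) {
        // Print the settings for a broker on the local machine.
        forBroker("localhost:9092").list(System.out);
    }
}
```

With such a helper, NativeProducer could call new ProducerConfig(ProducerProps.forBroker("localhost:9092")) instead of filling in the properties inline.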
Run mvn package to build the code, then run it and check the result:
java -cp target/lib/*:target/spring-kafka-demo-0.2.0-SNAPSHOT.jar com.colobu.spring_kafka_demo.NativeProducer
The consumer console window started above prints the received messages:
......
NativeMessage--1645592376
NativeMessage-534168193
NativeMessage--1899432197
NativeMessage-1642480773
NativeMessage--911267171
NativeMessage-251458151
NativeMessage--55710397
NativeMessage-455515562
NativeMessage-1108982916
NativeMessage--1710296834
NativeMessage-2102648373
NativeMessage-499979365
NativeMessage--1200107003
NativeMessage-1184836299
NativeMessage--1161123005
NativeMessage-912582115
NativeMessage--1557863408
NativeMessage--1036456356
......
Receiving messages with the Kafka High Level API
Receive messages with the High Level Consumer API:
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class NativeConsumer {
    private final ConsumerConnector consumer;
    private final String topic;
    private ExecutorService executor;

    public NativeConsumer(String a_zookeeper, String a_groupId, String a_topic) {
        consumer = kafka.consumer.Consumer.createJavaConsumerConnector(createConsumerConfig(a_zookeeper, a_groupId));
        this.topic = a_topic;
    }

    public void shutdown() {
        if (consumer != null)
            consumer.shutdown();
        if (executor != null)
            executor.shutdown();
    }

    public void run(int a_numThreads) {
        // Ask for a_numThreads streams for the topic.
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put(topic, Integer.valueOf(a_numThreads));
        Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumer.createMessageStreams(topicCountMap);
        List<KafkaStream<byte[], byte[]>> streams = consumerMap.get(topic);

        // now launch all the threads
        executor = Executors.newFixedThreadPool(a_numThreads);

        // now create an object to consume the messages
        int threadNumber = 0;
        for (final KafkaStream<byte[], byte[]> stream : streams) {
            executor.submit(new ConsumerTest(stream, threadNumber));
            threadNumber++;
        }
    }

    private static ConsumerConfig createConsumerConfig(String a_zookeeper, String a_groupId) {
        Properties props = new Properties();
        props.put("zookeeper.connect", a_zookeeper);
        props.put("group.id", a_groupId);
        props.put("zookeeper.session.timeout.ms", "400");
        props.put("zookeeper.sync.time.ms", "200");
        props.put("auto.commit.interval.ms", "1000");
        return new ConsumerConfig(props);
    }

    public static void main(String[] args) {
        String zooKeeper = "localhost:2181";
        String groupId = "mygroup";
        String topic = "test";
        int threads = 1;
        NativeConsumer example = new NativeConsumer(zooKeeper, groupId, topic);
        example.run(threads);
        try {
            Thread.sleep(10000);
        } catch (InterruptedException ie) {
        }
        //example.shutdown();
    }
}

class ConsumerTest implements Runnable {
    private KafkaStream<byte[], byte[]> m_stream;
    private int m_threadNumber;

    public ConsumerTest(KafkaStream<byte[], byte[]> a_stream, int a_threadNumber) {
        m_threadNumber = a_threadNumber;
        m_stream = a_stream;
    }

    public void run() {
        // Blocks on the stream and prints each message as it arrives.
        ConsumerIterator<byte[], byte[]> it = m_stream.iterator();
        while (it.hasNext())
            System.out.println("Thread " + m_threadNumber + ": " + new String(it.next().message()));
        System.out.println("Shutting down Thread: " + m_threadNumber);
    }
}
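Type a few messages into the producer console, and the console running this example prints them.
The tutorial code also includes an example that receives messages with the Simple Consumer API. Because spring-integration-kafka does not support that API, the code is not listed here for comparison.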
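NativeConsumer leaves its shutdown() call commented out, so the worker threads are never stopped explicitly. The following is a minimal sketch of an orderly executor shutdown in plain Java: GracefulShutdown and shutdownAndWait are hypothetical names, and the sketch covers only the ExecutorService side. It assumes the Kafka connector has already been shut down, so that the stream iterators return and the worker tasks can finish.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the original demo project.
final class GracefulShutdown {
    private GracefulShutdown() {}

    /** Returns true if all submitted tasks finished within the timeout. */
    static boolean shutdownAndWait(ExecutorService executor, long timeoutMs) {
        executor.shutdown(); // stop accepting new tasks; running tasks continue
        try {
            if (executor.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS)) {
                return true;
            }
            executor.shutdownNow(); // interrupt tasks that are still running
            return false;
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }
}
```

In NativeConsumer.shutdown(), such a helper could replace the bare executor.shutdown() call, which returns immediately without waiting for the consumer threads.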
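Sending messages with spring-integration-kafka
The Outbound Channel Adapter is used to send messages to Kafka. Messages are read from a Spring Integration channel, which you can specify in the Spring application context.
Once the channel is configured, you can use it to send messages to Kafka. Spring Integration messages are sent to the adapter and converted internally into Kafka messages before being sent. The current version requires the message key and the topic to be supplied as headers, with the message itself as the payload.
For example: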
final MessageChannel channel = ctx.getBean("inputToKafka", MessageChannel.class);
channel.send(
    MessageBuilder.withPayload(payload)     // set the payload
        .setHeader("messageKey", "key")     // set the key
        .setHeader("topic", "test")         // set the topic
        .build());
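Since every outgoing message needs both headers, it can help to build and validate them in one place. The following is a minimal sketch in plain Java: OutboundHeaders and forMessage are hypothetical names, and only the header names messageKey and topic come from the snippet above. The resulting map could presumably be handed to MessageBuilder.copyHeaders(...) instead of calling setHeader twice, but that wiring is an assumption and is not shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper, not part of the original demo project.
final class OutboundHeaders {
    private OutboundHeaders() {}

    /** Builds the two headers the outbound adapter expects; rejects missing values early. */
    static Map<String, Object> forMessage(String messageKey, String topic) {
        Objects.requireNonNull(messageKey, "messageKey");
        Objects.requireNonNull(topic, "topic");
        Map<String, Object> headers = new LinkedHashMap<>();
        headers.put("messageKey", messageKey);
        headers.put("topic", topic);
        return headers;
    }
}
```

Failing fast on a missing key or topic surfaces the mistake at the call site rather than inside the adapter.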
The actual code is as follows:
import java.util.Random;

import org.springframework.context.support.ClassPathXmlApplicationContext;
import org.springframework.integration.support.MessageBuilder;
import org.springframework.messaging.MessageChannel;

public class Producer {
    private static final String CONFIG = "/context.xml";
    private static Random rand = new Random();

    public static void main(String[] args) {
        final ClassPathXmlApplicationContext ctx = new ClassPathXmlApplicationContext(CONFIG, Producer.class);
        ctx.start();
        final MessageChannel channel = ctx.getBean("inputToKafka", MessageChannel.class);
        for (int i = 0; i < 100; i++) {
            channel.send(MessageBuilder.withPayload("Message-" + rand.nextInt())
                    .setHeader("messageKey", String.valueOf(i))
                    .setHeader("topic", "test")
                    .build());
        }
        try {
            Thread.sleep(100000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        ctx.close();
    }
}
The Spring configuration file:
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:int="http://www.springframework.org/schema/integration"
       xmlns:int-kafka="http://www.springframework.org/schema/integration/kafka"
       xmlns:task="http://www.springframework.org/schema/task"
       xsi:schemaLocation="http://www.springframework.org/schema/integration/kafka http://www.springframework.org/schema/integration/kafka/spring-integration-kafka.xsd
           http://www.springframework.org/schema/integration http://www.springframework.org/schema/integration/spring-integration.xsd
           http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
           http://www.springframework.org/schema/task http://www.springframework.org/schema/task/spring-task.xsd">

    <int:channel id="inputToKafka">
        <int:queue/>
    </int:channel>

    <int-kafka:outbound-channel-adapter id="kafkaOutboundChannelAdapter"
                                        kafka-producer-context-ref="kafkaProducerContext"
                                        auto-startup="false"
                                        channel="inputToKafka"
                                        order="3">
        <int:poller fixed-delay="1000" time-unit="MILLISECONDS" receive-timeout="0" task-executor="taskExecutor"/>
    </int-kafka:outbound-channel-adapter>

    <task:executor id="taskExecutor" pool-size="5" keep-alive="120" queue-capacity="500"/>

    <bean id="producerProperties" class="org.springframework.beans.factory.config.PropertiesFactoryBean">
        <property name="properties">
            <props>
                <prop key="topic.metadata.refresh.interval.ms">3600000</prop>
                <prop key="message.send.max.retries">5</prop>
                <prop key="serializer.class">kafka.serializer.StringEncoder</prop>
                <prop key="request.required.acks">1</prop>
            </props>
        </property>
    </bean>

    <int-kafka:producer-context id="kafkaProducerContext" producer-properties="producerProperties">
        <int-kafka:producer-configurations>
            <int-kafka:producer-configuration broker-list="localhost:9092" topic="test" compression-codec="default"/>
        </int-kafka:producer-configurations>
    </int-kafka:producer-context>
</beans>
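int:channel configures the Spring Integration channel; this channel is queue-based.
int-kafka:outbound-channel-adapter is the outbound channel adapter, which uses a thread pool internally to process messages. The key attribute is kafka-producer-context-ref.
int-kafka:producer-context configures the list of producers and the topics they handle. These producers are ultimately turned into Kafka Producers.
The producer configuration parameters are as follows: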
broker-list          List of comma separated brokers that this producer connects to
topic                Topic name or Java regex pattern of topic name
compression-codec    Compression method to be used. Default is no compression. Supported compression codec are gzip and snappy. Anything else would result in no compression
value-encoder        Serializer to be used for encoding messages.
key-encoder          Serializer to be used for encoding the partition key
key-class-type       Type of the key class. This will be ignored if no key-encoder is provided
value-class-type     Type of the value class. This will be ignored if no value-encoder is provided.
partitioner          Custom implementation of a Kafka Partitioner interface.
async                True/False - default is false. Setting this to true would make the Kafka producer to use an async producer
batch-num-messages   Number of messages to batch at the producer. If async is false, then this has no effect.
value-encoder and key-encoder can be other beans that implement Kafka's Encoder interface. Likewise, partitioner is a bean that implements Kafka's Partitioner interface.
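Whatever form a custom partitioner bean takes, its core job is to map a message key to a partition index between 0 and numPartitions - 1. The following sketch shows only that mapping in plain Java. It deliberately does not implement Kafka's Partitioner interface, and KeyHashPartitioning and partitionFor are hypothetical names chosen for illustration.

```java
// Hypothetical illustration of key-to-partition mapping; not a Kafka Partitioner implementation.
final class KeyHashPartitioning {
    private KeyHashPartitioning() {}

    /** Maps a key to an index in [0, numPartitions) based on its hash code. */
    static int partitionFor(Object key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        if (key == null) {
            return 0; // arbitrary choice for this sketch
        }
        // Mask the sign bit rather than using Math.abs, which stays negative for Integer.MIN_VALUE.
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }
}
```

The same key always lands on the same index, which is what keeps messages with one key in order within a partition.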
An Encoder example:
<bean id="kafkaEncoder" class="org.springframework.integration.kafka.serializer.avro.AvroSpecificDatumBackedKafkaEncoder">
    <constructor-arg value="com.company.AvroGeneratedSpecificRecord" />
</bean>
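Spring Integration Kafka also provides an Avro-based Encoder. Avro is another Apache project and a commonly used serialization framework in big data processing.
If no Encoder is specified, Kafka's default Encoder is used (kafka.serializer.DefaultEncoder, byte[] -> same byte[]).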
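Because the default Encoder passes byte arrays through unchanged, any conversion between strings and bytes has to agree on a charset at both ends. The following is a minimal sketch in plain Java with an explicit UTF-8 charset. Utf8Payloads is a hypothetical name, and the choice of UTF-8 is an assumption made for this illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original demo project.
final class Utf8Payloads {
    private Utf8Payloads() {}

    /** Converts a string to the bytes that would be sent as the message payload. */
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** Converts received payload bytes back to a string with the same charset. */
    static String decode(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

Calling new String(bytes) without a charset, as the consumer examples in this article do, uses the platform default, which may differ between the producer and consumer machines.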
producerProperties can be used to set configuration properties for tuning. For the list of properties, see http://kafka.apache.org/documentation.html#producerconfigs
Receiving messages with spring-integration-kafka
A consumer is implemented on the same principle:
package com.colobu.spring_kafka_demo;

import java.util.Collection;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

import org.slf4j.LoggerFactory;
import org.springframework.context.support.ClassPathXmlApplicationContext;
import org.springframework.integration.channel.QueueChannel;
import org.springframework.messaging.Message;

import ch.qos.logback.classic.Level;

public class Consumer {
    private static final String CONFIG = "/consumer_context.xml";

    @SuppressWarnings({ "unchecked", "rawtypes" })
    public static void main(String[] args) {
        ch.qos.logback.classic.Logger rootLogger = (ch.qos.logback.classic.Logger) LoggerFactory.getLogger(ch.qos.logback.classic.Logger.ROOT_LOGGER_NAME);
        rootLogger.setLevel(Level.toLevel("info"));

        final ClassPathXmlApplicationContext ctx = new ClassPathXmlApplicationContext(CONFIG, Consumer.class);
        ctx.start();
        final QueueChannel channel = ctx.getBean("inputFromKafka", QueueChannel.class);

        Message msg;
        // receive() blocks until a message is available.
        while ((msg = channel.receive()) != null) {
            // The payload maps each topic to its received messages.
            HashMap map = (HashMap) msg.getPayload();
            Set<Map.Entry> set = map.entrySet();
            for (Map.Entry entry : set) {
                String topic = (String) entry.getKey();
                System.out.println("Topic:" + topic);
                ConcurrentHashMap<Integer, List<byte[]>> messages = (ConcurrentHashMap<Integer, List<byte[]>>) entry.getValue();
                Collection<List<byte[]>> values = messages.values();
                for (Iterator<List<byte[]>> iterator = values.iterator(); iterator.hasNext();) {
                    List<byte[]> list = iterator.next();
                    for (byte[] object : list) {
                        String message = new String(object);
                        System.out.println("\tMessage: " + message);
                    }
                }
            }
        }
        try {
            Thread.sleep(100000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        ctx.close();
    }
}
The Spring configuration file is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:int="http://www.springframework.org/schema/integration"
       xmlns:int-kafka="http://www.springframework.org/schema/integration/kafka"
       xmlns:task="http://www.springframework.org/schema/task"
       xsi:schemaLocation="http://www.springframework.org/schema/integration/kafka http://www.springframework.org/schema/integration/kafka/spring-integration-kafka.xsd
           http://www.springframework.org/schema/integration http://www.springframework.org/schema/integration/spring-integration.xsd
           http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
           http://www.springframework.org/schema/task http://www.springframework.org/schema/task/spring-task.xsd">

    <int:channel id="inputFromKafka">
        <int:queue/>
    </int:channel>

    <int-kafka:inbound-channel-adapter id="kafkaInboundChannelAdapter"
                                       kafka-consumer-context-ref="consumerContext"
                                       auto-startup="false"
                                       channel="inputFromKafka">
        <int:poller fixed-delay="10" time-unit="MILLISECONDS" max-messages-per-poll="5" />
    </int-kafka:inbound-channel-adapter>

    <bean id="consumerProperties" class="org.springframework.beans.factory.config.PropertiesFactoryBean">
        <property name="properties">
            <props>
                <prop key="auto.offset.reset">smallest</prop>
                <prop key="socket.receive.buffer.bytes">10485760</prop> <!-- 10M -->
                <prop key="fetch.message.max.bytes">5242880</prop>
                <prop key="auto.commit.interval.ms">1000</prop>
            </props>
        </property>
    </bean>

    <int-kafka:consumer-context id="consumerContext" consumer-timeout="4000" zookeeper-connect="zookeeperConnect" consumer-properties="consumerProperties">
        <int-kafka:consumer-configurations>
            <int-kafka:consumer-configuration group-id="mygroup" max-messages="5000">
                <int-kafka:topic id="test" streams="4" />
            </int-kafka:consumer-configuration>
            <!-- <int-kafka:consumer-configuration group-id="default3" value-decoder="kafkaSpecificDecoder"
                 key-decoder="kafkaReflectionDecoder" max-messages="10">
                 <int-kafka:topic-filter pattern="regextopic.*" streams="4" exclude="false" />
                 </int-kafka:consumer-configuration> -->
        </int-kafka:consumer-configurations>
    </int-kafka:consumer-context>

    <int-kafka:zookeeper-connect id="zookeeperConnect" zk-connect="localhost:2181" zk-connection-timeout="6000" zk-session-timeout="400" zk-sync-time="200" />
</beans>
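This configuration is similar to the Producer's. It also declares a channel and defines an inbound-channel-adapter, which references the consumer context bean through kafka-consumer-context-ref. The consumer context defines the list of consumers. consumer-configuration also offers topic-filter, which uses a regular expression to build a whitelist or a blacklist (the exclude attribute).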
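The whitelist and blacklist behaviour of a regex topic filter can be illustrated in plain Java. The sketch below is not the spring-integration-kafka implementation: TopicFilterSketch and select are hypothetical names, and it assumes the pattern must match the whole topic name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of whitelist/blacklist semantics; not the library's own code.
final class TopicFilterSketch {
    private TopicFilterSketch() {}

    /**
     * With exclude=false, keeps topics that match the pattern (whitelist).
     * With exclude=true, keeps topics that do not match (blacklist).
     */
    static List<String> select(List<String> topics, String regex, boolean exclude) {
        Pattern pattern = Pattern.compile(regex);
        List<String> selected = new ArrayList<>();
        for (String topic : topics) {
            boolean matches = pattern.matcher(topic).matches(); // full-name match (assumption)
            if (matches != exclude) {
                selected.add(topic);
            }
        }
        return selected;
    }
}
```

For instance, with the pattern regextopic.* from the commented-out configuration, exclude=false would keep regextopic1 and drop test, and exclude=true would do the opposite.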
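The consumer context also needs zookeeper-connect.
Because spring-integration-kafka only implements the high level Consumer API, you cannot rewind and re-read earlier messages, since the high level API does not provide offset management.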
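Note that the payload obtained from the channel is of type:
Map
As the consumer code above shows, this map is keyed by topic name, and each value is itself a map from an Integer key to a list of byte[] messages.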
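The nested payload can be flattened into printable lines with a small helper. The following is a minimal sketch in plain Java based on the casts used in the Consumer class: PayloadFlattener and flatten are hypothetical names, the Integer key is presumably the partition id (an assumption), and UTF-8 is assumed for decoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the original demo project.
final class PayloadFlattener {
    private PayloadFlattener() {}

    /** Turns topic -> (Integer key -> raw messages) into lines like "topic[key]: text". */
    static List<String> flatten(Map<String, Map<Integer, List<byte[]>>> payload) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Map<Integer, List<byte[]>>> topicEntry : payload.entrySet()) {
            for (Map.Entry<Integer, List<byte[]>> inner : topicEntry.getValue().entrySet()) {
                for (byte[] raw : inner.getValue()) {
                    String text = new String(raw, StandardCharsets.UTF_8); // charset is an assumption
                    lines.add(topicEntry.getKey() + "[" + inner.getKey() + "]: " + text);
                }
            }
        }
        return lines;
    }
}
```

The Consumer loop could then cast msg.getPayload() once and print each returned line, instead of walking the nested maps inline.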