12. Kafka Producer Example

Source: Internet. Published by: 阿里云网站备案. Editor: 程序博客网. Time: 2024/05/22 14:28

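Producer

The Producer class is used to create new messages for a specific topic and, optionally, a specific partition.

If you are using Java, you need to import a few packages for the Producer and its supporting classes: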

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

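The first step in your code is to define properties that tell the Producer how to find the cluster, how to serialize the messages and, where appropriate, how to direct a message to a specific partition.

These properties are defined in a standard Java Properties object: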

Properties props = new Properties();
 
props.put("metadata.broker.list", "broker1:9092,broker2:9092");
props.put("serializer.class", "kafka.serializer.StringEncoder");
props.put("partitioner.class", "example.producer.SimplePartitioner");
props.put("request.required.acks", "1");
 
ProducerConfig config = new ProducerConfig(props);

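The first property, "metadata.broker.list", defines where the Producer can find one or more Brokers to determine the leader for each topic. This does not need to be the full set of Brokers in your cluster, but it should include at least two in case the first Broker is not available. There is no need to work out which Broker is the leader for the topic (and partition): the Producer knows how to connect to a Broker, ask for the metadata, and then connect to the correct Broker.

The second property, "serializer.class", defines which Serializer to use when preparing the message for transmission to the Broker. In our example we use a simple String encoder that ships with Kafka. Note that the encoder must accept the same type as the one declared for the KeyedMessage object in the next step.

The serializer for the message key can be changed by defining "key.serializer.class" appropriately. By default it is set to the same value as "serializer.class".

The third property, "partitioner.class", defines which class is used to decide which partition of the topic a message is sent to. This is optional, but for any non-trivial implementation you will want to implement a partitioning scheme. The implementation of this class is covered later, under "Partitioning Code". If you include a key but have not defined a partitioner.class, Kafka will use the default partitioner. If the key is null, the Producer will assign the message to a random partition.

The last property, "request.required.acks", tells Kafka that you want your Producer to require an acknowledgement from the Broker that the message was received. Without this setting the Producer will "fire and forget", possibly leading to data loss. The Kafka producer configuration documentation describes the other values this property accepts.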
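The four properties above do not have to be hard-coded. The following is a minimal sketch, using only the JDK, of loading them from .properties-style text and rejecting a configuration that lacks a broker list. The class name ProducerPropsSketch and its load method are hypothetical helpers for illustration and are not part of Kafka.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of Kafka: reads producer settings from text.
class ProducerPropsSketch {

    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Properties.load keeps trailing whitespace in values, so trim each one.
        for (String name : props.stringPropertyNames()) {
            props.setProperty(name, props.getProperty(name).trim());
        }
        // The broker list is how the producer finds the cluster,
        // so this sketch refuses to continue without it.
        String brokers = props.getProperty("metadata.broker.list");
        if (brokers == null || brokers.isEmpty()) {
            throw new IllegalArgumentException("metadata.broker.list is not set");
        }
        return props;
    }

    public static void main(String[] args) {
        String text =
              "metadata.broker.list=broker1:9092,broker2:9092\n"
            + "serializer.class=kafka.serializer.StringEncoder\n"
            + "partitioner.class=example.producer.SimplePartitioner\n"
            + "request.required.acks=1\n";
        Properties props = load(text);
        System.out.println(props.getProperty("metadata.broker.list"));
    }
}
```

The returned Properties object can be handed to new ProducerConfig(props) in the same way as the hard-coded one.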

Next you define the Producer object itself:

Producer<String, String> producer = new Producer<String, String>(config);

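Note that Producer is a Java generic, so you need to tell it the types of two parameters. The first is the type of the partition key, the second is the type of the message. In this example both are Strings, which also matches what we defined in the properties above.

Now build your message: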

Random rnd = new Random();
 
long runtime = new Date().getTime();
 
String ip = "192.168.2." + rnd.nextInt(255);
 
String msg = runtime + ",www.example.com," + ip;

In this example we are faking a message for a website visit by IP address. The first part of the comma-separated message is the timestamp of the event, the second is the website and the third is the IP address of the requester. We use the Java Random class here to vary the last octet of the IP so we can see how partitioning works.
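To make the message layout concrete, here is a small sketch of how a reader of these messages might split the three comma-separated fields apart again. It uses only the JDK; the class PageVisitSketch and its field names are hypothetical and do not come from Kafka or from the original example.

```java
// Hypothetical value class for the "timestamp,website,ip" message format.
class PageVisitSketch {
    final long timestamp;
    final String site;
    final String ip;

    PageVisitSketch(long timestamp, String site, String ip) {
        this.timestamp = timestamp;
        this.site = site;
        this.ip = ip;
    }

    // Builds the message string the same way the producer code does.
    String toMessage() {
        return timestamp + "," + site + "," + ip;
    }

    // Splits a message back into its three fields.
    static PageVisitSketch parse(String msg) {
        String[] parts = msg.split(",", 3);
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected 3 fields: " + msg);
        }
        return new PageVisitSketch(Long.parseLong(parts[0]), parts[1], parts[2]);
    }

    public static void main(String[] args) {
        PageVisitSketch visit = parse("1400000000000,www.example.com,192.168.2.17");
        System.out.println(visit.site + " visited from " + visit.ip);
    }
}
```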

Finally write the message to the Broker:

KeyedMessage<String, String> data = new KeyedMessage<String, String>("page_visits", ip, msg);
 
producer.send(data);

Here "page_visits" is the topic to write to, and we are passing the IP as the partition key. Note that if you do not include a key, Kafka will assign the message to a random partition even if you have defined a partitioner class.

Full Source:

import java.util.*;
 
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;
 
public class TestProducer {
    public static void main(String[] args) {
        long events = Long.parseLong(args[0]);
        Random rnd = new Random();
 
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092,broker2:9092");
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        props.put("partitioner.class", "example.producer.SimplePartitioner");
        props.put("request.required.acks", "1");
 
        ProducerConfig config = new ProducerConfig(props);
 
        Producer<String, String> producer = new Producer<String, String>(config);
 
        for (long nEvents = 0; nEvents < events; nEvents++) {
            long runtime = new Date().getTime();
            String ip = "192.168.2." + rnd.nextInt(255);
            String msg = runtime + ",www.example.com," + ip;
            KeyedMessage<String, String> data = new KeyedMessage<String, String>("page_visits", ip, msg);
            producer.send(data);
        }
        producer.close();
    }
}

Partitioning Code:

import kafka.producer.Partitioner;
import kafka.utils.VerifiableProperties;
 
public class SimplePartitioner implements Partitioner {
    public SimplePartitioner (VerifiableProperties props) {
 
    }
 
    public int partition(Object key, int a_numPartitions) {
        int partition = 0;
        String stringKey = (String) key;
        // Take the text after the last '.', i.e. the last octet of the IP address
        int offset = stringKey.lastIndexOf('.');
        if (offset > 0) {
            partition = Integer.parseInt(stringKey.substring(offset + 1)) % a_numPartitions;
        }
        return partition;
    }
 
}

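The logic takes the key, which we expect to be the IP address, finds the last octet and performs a modulo operation on the number of partitions defined within Kafka for the topic. The benefit of this partitioning logic is that all web visits from the same source IP end up in the same partition. Of course, visits from other IPs land in that partition too, so your consumer logic will need to know how to handle that.

Before running this, make sure you have created the topic page_visits. From the command line: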
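The mapping can be tried out without a broker. The sketch below repeats the same arithmetic as SimplePartitioner in a plain static method, assuming a topic with 5 partitions; the class name PartitionSketch and the sample IPs are made up for illustration.

```java
// Stand-alone copy of the SimplePartitioner arithmetic, without the Kafka interface.
class PartitionSketch {

    static int partitionFor(String key, int numPartitions) {
        int partition = 0;
        // Everything after the last '.' is taken to be the last octet.
        int offset = key.lastIndexOf('.');
        if (offset > 0) {
            partition = Integer.parseInt(key.substring(offset + 1)) % numPartitions;
        }
        return partition;
    }

    public static void main(String[] args) {
        int numPartitions = 5; // assumed partition count for the topic
        String[] ips = {"192.168.2.7", "192.168.2.12", "192.168.2.254"};
        for (String ip : ips) {
            System.out.println(ip + " -> partition " + partitionFor(ip, numPartitions));
        }
    }
}
```

With 5 partitions, 192.168.2.7 and 192.168.2.12 both map to partition 2, which is the sharing between different IPs mentioned above, while 192.168.2.254 maps to partition 4. A key without a '.' falls through to partition 0.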

bin/kafka-create-topic.sh --topic page_visits --replica 3 --zookeeper localhost:2181 --partition 5

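Make sure you include a --partition option so that more than one partition is created. In Kafka 0.8.1 and later, topics are created with bin/kafka-topics.sh --create, which takes --replication-factor and --partitions instead.

Now compile and run your Producer, and data will be written to Kafka.

To confirm you have data, use the command-line tool to see what was written: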

bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic page_visits --from-beginning

Maven

<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka_2.9.2</artifactId>
  <version>0.8.1.1</version>
  <scope>compile</scope>
  <exclusions>
    <exclusion>
      <artifactId>jmxri</artifactId>
      <groupId>com.sun.jmx</groupId>
    </exclusion>
    <exclusion>
      <artifactId>jms</artifactId>
      <groupId>javax.jms</groupId>
    </exclusion>
    <exclusion>
      <artifactId>jmxtools</artifactId>
      <groupId>com.sun.jdmk</groupId>
    </exclusion>
  </exclusions>
</dependency>

0 0