Deploying Kafka on CentOS 5.5 64-bit


1. Prerequisites

    A CentOS 5.5 64-bit VMware virtual machine.

    Download the release from the Kafka website: http://kafka.apache.org/. I downloaded kafka_2.9.2-0.8.1.tgz.

    Kafka depends on a ZooKeeper environment, so deploy ZooKeeper in advance; there are plenty of articles covering its installation.
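
    If you plan to use the ZooKeeper that ships with the Kafka release instead (see 2.2 below), its configuration lives in config/zookeeper.properties. A minimal sketch of the settings you may want to adjust (the values are illustrative, not requirements):

    # config/zookeeper.properties
    dataDir=/tmp/zookeeper     # where ZooKeeper keeps its data; change it for anything beyond a quick test
    clientPort=2181            # the port Kafka and other clients connect to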

2. Single-node installation and verification

   2.1 Extract the release package

    > tar -xzf kafka_2.9.2-0.8.1.tgz

    > cd kafka_2.9.2-0.8.1

 

   2.2 Start the server

     Kafka uses ZooKeeper, so a ZooKeeper server has to be started first. If you have not deployed ZooKeeper separately, you can use the one packaged with the Kafka release, as follows:

      > bin/zookeeper-server-start.sh config/zookeeper.properties
      [2013-04-22 15:01:37,495] INFO Reading configuration from: config/zookeeper.properties (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
      ...

 

      Now start the Kafka server:

   > bin/kafka-server-start.sh config/server.properties
   [2013-04-22 15:01:47,028] INFO Verifying properties (kafka.utils.VerifiableProperties)
   [2013-04-22 15:01:47,051] INFO Property socket.send.buffer.bytes is overridden to 1048576 (kafka.utils.VerifiableProperties)
   ...
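
      The defaults in config/server.properties are fine for this single-node test; the settings you are most likely to revisit later are sketched below (the values shown are illustrative, not requirements):

      # config/server.properties
      broker.id=0                          # unique id of this broker within the cluster
      port=9092                            # port the broker listens on
      log.dirs=/tmp/kafka-logs             # where the broker stores topic data
      zookeeper.connect=localhost:2181     # ZooKeeper connection string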

    2.3 Create a topic

     Now create a topic named "test", with a single partition and only one replica:

   > bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test

 

      The topic we just created can now be seen with the following command:

   > bin/kafka-topics.sh --list --zookeeper localhost:2181
   test

      Of course, instead of creating topics manually you can also configure the brokers to auto-create a topic the first time a non-existent topic is published to.
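
      A minimal sketch of the broker-side setting that controls this, in config/server.properties (to my knowledge it already defaults to enabled in this release; it is shown here only to make the switch explicit):

      auto.create.topics.enable=true   # let the broker create a topic the first time it is published to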

 

       Note: the Kafka release used above is missing slf4j-log4j12-1.7.2.jar, which causes the command above to fail, so you need to place that jar under <kafka root>/libs.
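
       A sketch of that fix, assuming the missing jar has already been downloaded into the Kafka root directory we extracted above:

    > cp slf4j-log4j12-1.7.2.jar libs/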

      2.4 Send messages

       Kafka ships with a command-line client that takes input from standard input or from a file and sends it to the Kafka cluster. By default each line is sent as a separate message.

       Run the client, then type the following lines to send them to the server:

       > bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test

      This is a message

       This is another message
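
       Because the producer reads from standard input, a file can be piped in just as easily; a minimal sketch, assuming a local file named messages.txt with one message per line:

       > cat messages.txt | bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test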

     2.5 Consume messages

      Kafka also ships with a command-line consumer that dumps messages to standard output.

   > bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
   This is a message
   This is another message
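
      The --from-beginning flag replays the topic from its start; leave it out and the consumer only prints messages produced after it starts, for example:

   > bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test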

 

     If you run these two programs (producer and consumer) on different machines, you can type messages into the producer terminal and see them appear in the consumer terminal, as sketched below.
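
     A sketch of that two-machine setup, assuming the broker and ZooKeeper run on a host reachable as kafka-host (a hypothetical name; the broker must also advertise a host name the other machines can resolve):

     # on the producer machine
     > bin/kafka-console-producer.sh --broker-list kafka-host:9092 --topic test
     # on the consumer machine
     > bin/kafka-console-consumer.sh --zookeeper kafka-host:2181 --topic test --from-beginning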

 

     All of these command-line tools have additional options; running a command with no arguments prints usage information documenting them.

3. Pseudo-distributed setup (multiple brokers on one machine)

     Create a config file for each of the additional brokers:

    > cp config/server.properties config/server-1.properties

    > cp config/server.properties config/server-2.properties

     Then edit the new files as follows:

    config/server-1.properties:
        broker.id=1
        port=9093
        log.dir=/tmp/kafka-logs-1

    config/server-2.properties:
        broker.id=2
        port=9094
        log.dir=/tmp/kafka-logs-2

     The broker.id must be unique across the cluster; it is the permanent name of each node and should not change. We override the port and the log directory only because we are running all of these brokers on the same machine and need to keep them from colliding on ports or overwriting each other's data.

     One broker is already running from the steps above; now start the other two:

   > bin/kafka-server-start.sh config/server-1.properties &
   ...
   > bin/kafka-server-start.sh config/server-2.properties &
   ...
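
      To confirm that all three brokers have registered themselves in ZooKeeper, one option is the zookeeper-shell.sh tool that ships with Kafka; a quick sketch (the listing order shown is illustrative):

   > bin/zookeeper-shell.sh localhost:2181
   ls /brokers/ids
   [2, 1, 0]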

 

      Now create a new topic, this time with a replication factor of 3:

   > bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic my-replicated-topic

 

      To see what each of these brokers is doing, use the following command:

   > bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
   Topic:my-replicated-topic    PartitionCount:1    ReplicationFactor:3    Configs:
       Topic: my-replicated-topic    Partition: 0    Leader: 1    Replicas: 1,2,0    Isr: 1,2,0

Here is an explanation of the output. The first line gives a summary of all the partitions; each additional line gives information about one partition. Since we have only one partition for this topic, there is only one such line.

  •   "leader" is the node responsible for all reads and writes for the given partition. Each node will be the leader for a randomly selected portion of the partitions.
  •   "replicas" is the list of nodes that replicate the log for this partition, regardless of whether they are the leader or even whether they are currently alive.
  •   "isr" is the set of "in-sync" replicas. This is the subset of the replicas list that is currently alive and caught up to the leader.

     

     Note that in my example node 1 is the leader for the only partition of the topic.

    We can run the same command on the original topic we created to see where it is:

    > bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic test
    Topic:test    PartitionCount:1    ReplicationFactor:1    Configs:
        Topic: test    Partition: 0    Leader: 0    Replicas: 0    Isr: 0

    So there is no surprise there—the original topic has no replicas and is on server 0, the only server in our cluster when we created it.

    Let's publish a few messages to our new topic:

    > bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic
    ...
    my test message 1
    my test message 2
    ^C

    Now let's consume these messages:

    > bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic my-replicated-topic
    ...
    my test message 1
    my test message 2
    ^C

    Now let's test out fault-tolerance. Broker 1 was acting as the leader so let's kill it:

    > ps -ef | grep server-1.properties
    7564 ttys002    0:15.91 /System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home/bin/java...
    > kill -9 7564
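
     On the CentOS VM used in this walkthrough the Java path and ps output will of course look different from the example above; an equivalent way to kill the broker by matching its config file name (my own shortcut, not from the original steps):

    > pkill -9 -f server-1.properties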

    Leadership has switched to one of the slaves and node 1 is no longer in the in-sync replica set:

    > bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
    Topic:my-replicated-topic    PartitionCount:1    ReplicationFactor:3    Configs:
        Topic: my-replicated-topic    Partition: 0    Leader: 2    Replicas: 1,2,0    Isr: 2,0

    But the messages are still available for consumption even though the leader that took the writes originally is down:

    > bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic my-replicated-topic
    ...
    my test message 1
    my test message 2
    ^C

     

     

     

      
