Kafka Essentials


Quickstart

Kafka 0.8.2.0

Install:

tar -xzf kafka_2.10-0.8.2.0.tgz -C /usr/local
ln -s /usr/local/kafka_2.10-0.8.2.0 /usr/local/kafka

Config:

vi /usr/local/kafka/config/server.properties
advertised.host.name=<public network IP; do not use the private network IP>
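
The same change can be scripted instead of made by hand in vi. The following is a minimal sketch, not part of the original notes: `set_property` is a hypothetical helper name, and 203.0.113.10 in the usage comment is a placeholder documentation address standing in for the public IP.

```shell
#!/bin/sh
# set_property FILE KEY VALUE  (hypothetical helper, not shipped with Kafka)
# Removes any existing assignment of KEY, including a commented-out one,
# and appends KEY=VALUE at the end of FILE.
set_property() {
    file=$1; key=$2; value=$3
    tmp=$(mktemp) || return 1
    awk -v k="$key" '
        { line = $0; sub(/^#[ \t]*/, "", line) }   # look past a leading hash mark
        index(line, k "=") == 1 { next }            # drop old assignments of KEY
        { print }
    ' "$file" > "$tmp" || { rm -f "$tmp"; return 1; }
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    # Copy the content back rather than mv, so FILE keeps its owner and mode.
    cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example (placeholder address):
# set_property /usr/local/kafka/config/server.properties advertised.host.name 203.0.113.10
```

The broker reads server.properties at startup, so a running broker has to be restarted before the new value takes effect.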

Clean old data if necessary:

rm -rf /tmp/zookeeper
rm -rf /tmp/kafka-logs
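
The two paths removed above are the defaults in the shipped configuration files: `dataDir` in zookeeper.properties and `log.dirs` in server.properties. If either was changed, the directories to clean are different. A small sketch, assuming a hypothetical helper named `get_property`, that reads the current values before anything is deleted:

```shell
#!/bin/sh
# get_property FILE KEY  (hypothetical helper)
# Prints the value of the last uncommented KEY=VALUE line in FILE;
# prints nothing if KEY is not set.
get_property() {
    awk -F= -v k="$2" '
        $1 == k { v = substr($0, length(k) + 2) }
        END { if (v != "") print v }
    ' "$1"
}

# Example:
# get_property /usr/local/kafka/config/zookeeper.properties dataDir
# get_property /usr/local/kafka/config/server.properties log.dirs
```

Note that `log.dirs` may hold a comma-separated list of directories, in which case each entry needs to be cleaned.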

Run:

nohup /usr/local/kafka/bin/zookeeper-server-start.sh /usr/local/kafka/config/zookeeper.properties &
nohup /usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties &

Check status:

$JAVA_HOME/bin/jps
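
`jps` lists running JVMs by main class. With the two start scripts above, ZooKeeper typically shows up as `QuorumPeerMain` and the broker as `Kafka`. As a sketch, a hypothetical `check_jps` function can turn that visual check into an exit status:

```shell
#!/bin/sh
# check_jps  (hypothetical helper)
# Reads jps output ("<pid> <main class>" per line) on stdin and succeeds
# only if both the ZooKeeper and the Kafka broker main classes are listed.
check_jps() {
    awk '
        $2 == "QuorumPeerMain" { zk = 1 }
        $2 == "Kafka"          { broker = 1 }
        END { exit !(zk && broker) }
    '
}

# Example:
# "$JAVA_HOME/bin/jps" | check_jps && echo "ZooKeeper and Kafka are running"
```

A listed process only means the JVM is alive; it does not confirm that the broker has finished starting or is reachable on its port.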

Topic management:

/usr/local/kafka/bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
/usr/local/kafka/bin/kafka-topics.sh --list --zookeeper localhost:2181

Producer:

/usr/local/kafka/bin/kafka-console-producer.sh --broker-list <private/public network ip>:9092 --topic test

Consumer:

/usr/local/kafka/bin/kafka-console-consumer.sh --zookeeper <usually private network ip>:2181 --topic test --from-beginning

Architecture and Design

  • Official documentation
  • Putting Apache Kafka To Use: A Practical Guide to Building a Stream Data Platform (Part 1)
  • Putting Apache Kafka To Use: A Practical Guide to Building a Stream Data Platform (Part 2)
  • The Log: What every software engineer should know about real-time data’s unifying abstraction

Clients

Producer API

  • new Java producer API
  • legacy Scala producer API

Consumer API

  • High Level Consumer API
  • Simple Consumer API

Kafka Hadoop Consumer API

LinkedIn Camus

Performance Test and Tuning

References:

  • https://cwiki.apache.org/confluence/display/KAFKA/Performance+testing
  • http://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
  • http://liveramp.com/blog/kafka-0-8-producer-performance-2/

Cloudera Kafka Integration

  • Install CSD
    Installation of the CSD will add a new parcel repository to your Cloudera Manager configuration.
    sudo mkdir -p /opt/cloudera/csd
    sudo cp KAFKA-1.2.0.jar /opt/cloudera/csd/
    sudo /opt/cloudera-manager/cm-5.3.0/etc/init.d/cloudera-scm-server restart
  • Log into the Cloudera Manager Admin Console and restart the Cloudera Management Service
  • Download, distribute, and activate the parcel (You do not need to restart the cluster after installing Kafka.)
  • Add a service

Kafka settings:

log.dirs = /var/local/kafka/data

Operations:

sudo kafka-topics --create --zookeeper zookeeper-server:2181 --topic activity --partitions 1 --replication-factor 1

References:

  • http://www.cloudera.com/content/cloudera/en/documentation/cloudera-kafka/latest/topics/kafka.html
  • Install Add-on Services with Custom Service Descriptor Files

Operations

Yahoo Kafka-Manager

References:

  • https://github.com/yahoo/kafka-manager