Kafka Setup

Source: Internet    Editor: 程序博客网    Time: 2024/04/30 07:14

Official site: http://kafka.apache.org

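I. Introduction: Kafka is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.
Features: distributed; streaming; near-instant delivery; safe (data is saved as local files, partitions allow multi-threaded concurrency, and there is a replication mechanism); high throughput; multiple subscribers.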

II. Installing and deploying Kafka (see the official site)
Requirements:
Java
ZooKeeper
Scala
Kafka

1. Install ZooKeeper (it stores the metadata for all topics)

$ tar -zxvf zookeeper-3.4.5-cdh5.3.6.tar.gz

$ mkdir zkData    (inside the zookeeper-3.4.5-cdh5.3.6 directory)

$ touch myid    (inside the zkData directory)

Edit zoo.cfg:
dataDir=/opt/modules/cdh-5.3.6/zookeeper-3.4.5-cdh5.3.6/zkData
server.1=rainbow.com.cn:2888:3888
zkData/myid (just write 1 in it, matching the 1 in server.1)
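
The data directory and myid file described above can be prepared in one go. This is only a minimal sketch: the variable ZK_HOME is an assumption (it does not appear in the original) and should point at the unpacked zookeeper-3.4.5-cdh5.3.6 directory.

```shell
# ZK_HOME is an assumed variable; it defaults to the directory name used in this post.
ZK_HOME="${ZK_HOME:-$PWD/zookeeper-3.4.5-cdh5.3.6}"

# Create the data directory that dataDir in zoo.cfg refers to.
mkdir -p "$ZK_HOME/zkData"

# The number in myid must match the N of this host's server.N line in zoo.cfg.
echo 1 > "$ZK_HOME/zkData/myid"

# Show what was written.
cat "$ZK_HOME/zkData/myid"
```

On a multi-node ensemble each host would write its own number here, which is why the file is kept outside zoo.cfg.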

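For the details of installing ZooKeeper, see:

http://blog.csdn.net/CandySleep/article/details/52966251 (the link covers a fully distributed ZooKeeper install; a pseudo-distributed install is similar)
Start ZooKeeper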
$ bin/zkServer.sh start
$ bin/zkCli.sh

2. Install Scala

1. Extract: $ tar -zxvf scala-2.10.4.tgz -C /opt/modules/cdh5.3.6/
2. Set the environment variables: # vi /etc/profile

#SCALA_HOME
export SCALA_HOME=/opt/modules/cdh5.3.6/scala-2.10.4
export PATH=$PATH:$SCALA_HOME/bin

# source /etc/profile

3. Install Kafka

(1) Extract

$ tar -zxvf kafka_2.10-0.8.2.1.tgz -C /opt/modules/cdh5.3.6/

(2) Edit the Kafka configuration file (server.properties)

broker.id=0

# Hostname the broker will bind to. If not set, the server will bind to all interfaces
host.name=rainbow.com.cn

$ mkdir kafkaData    (data directory)

# A comma separated list of directories under which to store log files
log.dirs=/opt/modules/cdh-5.3.6/kafka_2.10-0.8.2.1/kafkaData

num.partitions (number of partitions; the default is 1)

# root directory for all kafka znodes.
zookeeper.connect=rainbow.com.cn:2181
(this points at your own ZooKeeper instead of the one bundled with Kafka)

(3) Start ZooKeeper first

$ ./zkServer.sh start

Then start the Kafka server

$ bin/kafka-server-start.sh config/server.properties    (to run several brokers, start each one with its own server.properties that has a different broker.id)

Start in the background (-daemon): $ bin/kafka-server-start.sh -daemon config/server.properties
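
Running several brokers on one host means each needs its own properties file. The following is a rough sketch of deriving one from an existing file; the function name make_broker_config and its arguments are illustrative (not part of Kafka), and it assumes the source file already contains broker.id=, port= and log.dirs= lines, since lines that are missing are not added.

```shell
# make_broker_config SRC DST ID PORT LOGDIR
# Copies SRC to DST, rewriting the three settings that must differ per broker
# when several brokers share one machine.
# (Function name and arguments are illustrative, not part of Kafka.)
make_broker_config() {
  sed -e "s/^broker\.id=.*/broker.id=$3/" \
      -e "s/^port=.*/port=$4/" \
      -e "s|^log\.dirs=.*|log.dirs=$5|" \
      "$1" > "$2"
}

# Example (run from the Kafka directory; values are placeholders):
#   make_broker_config config/server.properties config/server-1.properties 1 9093 /tmp/kafka-logs-1
#   bin/kafka-server-start.sh -daemon config/server-1.properties
```

Besides broker.id, the port and the log directory are changed so that two brokers on the same host do not collide; on separate hosts only broker.id and host.name would normally differ.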

Check ZooKeeper (from the client, look at the nodes that were created)

 

5. Create a topic

$ bin/kafka-topics.sh --create --zookeeper rainbow.com.cn:2181 --replication-factor 1 --partitions 1 --topic test    (--replication-factor is the number of replicas)

List topics

$ bin/kafka-topics.sh --list --zookeeper rainbow.com.cn:2181    (topic metadata is stored in ZooKeeper)


6. Start a producer

bin/kafka-console-producer.sh --broker-list rainbow.com.cn:9092 --topic test


7. Start a consumer

$ bin/kafka-console-consumer.sh --zookeeper rainbow.com.cn:2181 --topic test --from-beginning    (connects to ZooKeeper to find the topic, then reads its data)
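
The producer and consumer steps can be combined into a quick round-trip check. This is only a sketch: KAFKA_HOME and the helper smoke_test are assumptions introduced here for illustration, and the default path is guessed from the directories used earlier in this post.

```shell
# KAFKA_HOME is an assumed variable pointing at the Kafka install directory.
KAFKA_HOME="${KAFKA_HOME:-/opt/modules/cdh5.3.6/kafka_2.10-0.8.2.1}"

# smoke_test BROKER ZK TOPIC  (illustrative helper, not part of Kafka)
smoke_test() {
  # Send one line to the topic through the console producer.
  echo "hello kafka" | "$KAFKA_HOME/bin/kafka-console-producer.sh" \
    --broker-list "$1" --topic "$3"
  # Read one message from the start of the topic, then exit.
  "$KAFKA_HOME/bin/kafka-console-consumer.sh" \
    --zookeeper "$2" --topic "$3" --from-beginning --max-messages 1
}

# Example:
#   smoke_test rainbow.com.cn:9092 rainbow.com.cn:2181 test
```

Because the consumer reads from the beginning, the message it prints is the oldest one in the topic, which is the line just sent only when the topic was empty beforehand.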


8. Kafka integration with Flume

Edit the configuration file (kafka.properties)

a1.sources = s1
a1.channels = c1
a1.sinks = k1

# define source
a1.sources.s1.type = exec
a1.sources.s1.command = tail -F /opt/modules/cdh5.3.6/hive-0.13.1-cdh5.3.6/logs/hive.log
a1.sources.s1.shell = /bin/sh -c
#define channel
a1.channels.c1.type = file
a1.channels.c1.checkpointDir = /mnt/flume/checkpoint
a1.channels.c1.dataDirs = /mnt/flume/data
#define  sinks
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.brokerList = rainbow.com.cn:9092
a1.sinks.k1.topic = testTopic
# combination
a1.sources.s1.channels = c1
a1.sinks.k1.channel = c1

Run the Flume agent

bin/flume-ng agent --conf  conf/  --name a1  --conf-file  conf/kafka.properties -Dflume.root.logger=INFO,console

Run Hive (so that hive.log, which the source tails, receives new lines)


Check the Flume output:
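
One possible way to check, sketched under the assumption that the sink writes to testTopic as configured above: read that topic with the console consumer and watch for lines from hive.log. KAFKA_HOME and the helper show_flume_events are hypothetical names introduced only for this example.

```shell
# KAFKA_HOME is an assumed variable pointing at the Kafka install directory.
KAFKA_HOME="${KAFKA_HOME:-/opt/modules/cdh5.3.6/kafka_2.10-0.8.2.1}"

# show_flume_events ZK [TOPIC]  (illustrative helper, not part of Kafka or Flume)
# Prints everything in the topic the Flume sink writes to; TOPIC defaults to testTopic.
show_flume_events() {
  "$KAFKA_HOME/bin/kafka-console-consumer.sh" \
    --zookeeper "$1" --topic "${2:-testTopic}" --from-beginning
}

# Example (stop with Ctrl-C):
#   show_flume_events rainbow.com.cn:2181
```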


