Kafka + Storm integration demo (single machine)
1. Environment: JDK 1.7, kafka_2.9.2-0.8.1.1.tgz, zookeeper-3.3.6.tar.gz, apache-storm-0.9.2-incubating.tar.gz
2. Install Kafka:
I. Edit kafka/config/server.properties. Note that host.name=node assumes the hostname "node" resolves to this machine (e.g., via an /etc/hosts entry):
# The id of the broker. This must be set to a unique integer for each broker.
broker.id=0

############################# Socket Server Settings #############################

# The port the socket server listens on
port=9092

# Hostname the broker will bind to. If not set, the server will bind to all interfaces
host.name=node

# Hostname the broker will advertise to producers and consumers. If not set, it uses the
# value for "host.name" if configured. Otherwise, it will use the value returned from
# java.net.InetAddress.getCanonicalHostName().
advertised.host.name=node

# The port to publish to ZooKeeper for clients to use. If this is not set,
# it will publish the same port that the broker binds to.
#advertised.port=<port accessible by clients>

# The number of threads handling network requests
num.network.threads=2

# The number of threads doing disk I/O
num.io.threads=8

# The send buffer (SO_SNDBUF) used by the socket server
socket.send.buffer.bytes=1048576

# The receive buffer (SO_RCVBUF) used by the socket server
socket.receive.buffer.bytes=1048576

# The maximum size of a request that the socket server will accept (protection against OOM)
socket.request.max.bytes=104857600

############################# Log Basics #############################

# A comma separated list of directories under which to store log files
log.dirs=/Users/eleme/ruson/tmp/kafka-logs

# The default number of log partitions per topic. More partitions allow greater
# parallelism for consumption, but this will also result in more files across
# the brokers.
num.partitions=2

# The maximum size of a log segment file. When this size is reached a new log segment will be created.
log.segment.bytes=536870912

# The interval at which log segments are checked to see if they can be deleted according
# to the retention policies
log.retention.check.interval.ms=60000

# By default the log cleaner is disabled and the log retention policy will default to just delete segments after their retention expires.
# If log.cleaner.enable=true is set the cleaner will be enabled and individual logs can then be marked for log compaction.
log.cleaner.enable=false

############################# Zookeeper #############################

# Zookeeper connection string (see zookeeper docs for details).
# This is a comma separated host:port pairs, each corresponding to a zk
# server. e.g. "127.0.0.1:3000,127.0.0.1:3001,127.0.0.1:3002".
# You can also append an optional chroot string to the urls to specify the
# root directory for all kafka znodes.
zookeeper.connect=node:2181

# Timeout in ms for connecting to zookeeper
zookeeper.connection.timeout.ms=1000000
II. Edit kafka/config/zookeeper.properties (the stock Apache license header is omitted here):
# the directory where the snapshot is stored.
dataDir=/Users/eleme/ruson/tmp/zookeeper
# the port at which the clients will connect
clientPort=2181
# disable the per-ip limit on the number of connections since this is a non-production config
maxClientCnxns=0

III. Verify that Kafka starts:
a. Start ZooKeeper: bin/zookeeper-server-start.sh config/zookeeper.properties &
b. Start Kafka: bin/kafka-server-start.sh config/server.properties &
c. Check whether startup succeeded with jps.
If the following two processes appear, startup succeeded:
1942 Kafka
1845 QuorumPeerMain
d. Run a producer in one window:
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
e. Run a consumer in another window:
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
Messages typed into the producer window should appear in the consumer window.
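The console scripts are enough to verify the setup, but since the demo below is in Java, here is a minimal producer sketch against the same Kafka 0.8 javaapi the demo uses. The class name and message text are illustrative, not from the original; messages it sends should show up in the console consumer above:

package com.test.stormkafka;

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

// Hypothetical helper, not part of the original demo: sends one message to the
// "test" topic so the console consumer can confirm the broker works end to end.
public class ProducerCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("metadata.broker.list", "localhost:9092"); // broker started above
        props.put("serializer.class", "kafka.serializer.StringEncoder");
        Producer<String, String> producer = new Producer<String, String>(new ProducerConfig(props));
        producer.send(new KeyedMessage<String, String>("test", "hello from java"));
        producer.close();
    }
}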
3. Install Storm:
I. Edit conf/storm.yaml:
########### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
    - "node"
#    - "server2"
# nimbus.host: "node"
storm.local.dir: "/Users/eleme/ruson/tmp/storm"
supervisor.slots.ports:
    - 6700
    - 6701
    - 6702
    - 6703
###
##### These may optionally be filled in:
### List of custom serializations
# topology.kryo.register:
#     - org.mycompany.MyType
#     - org.mycompany.MyType2: org.mycompany.MyType2Serializer
II. Before running Storm, install a standalone ZooKeeper and edit
/Users/eleme/ruson/zookeeper-3.3.6/conf/zoo.cfg:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/Users/eleme/ruson/tmp/zookeeper
logDir=/Users/eleme/ruson/tmp/zookeeper/log.log
# the port at which the clients will connect
clientPort=2181
server.1=127.0.0.1:2888:3888
(Note: dataLogDir, not logDir, is the documented ZooKeeper property for a separate transaction-log directory; ZooKeeper ignores the logDir line above.)
III. Verify that Storm works:
1. Start ZooKeeper:
./zookeeper-3.3.6/bin/zkServer.sh start
2. Start Nimbus, the UI, and a Supervisor:
./storm nimbus
./storm ui
./storm supervisor
The Storm UI should now be reachable at http://localhost:8080 (the default ui.port). Optionally, confirm ZooKeeper is reachable from Java first, as in the sketch below.
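A minimal connectivity check using the plain ZooKeeper client API. This is not part of the original article; the class name is illustrative:

package com.test.stormkafka;

import java.util.List;
import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

// Hypothetical check: connects to the local ZooKeeper and lists the root
// znodes (Storm and Kafka will create their own under /).
public class ZkCheck {
    public static void main(String[] args) throws Exception {
        final CountDownLatch connected = new CountDownLatch(1);
        ZooKeeper zk = new ZooKeeper("127.0.0.1:2181", 3000, new Watcher() {
            public void process(WatchedEvent event) {
                if (event.getState() == Event.KeeperState.SyncConnected) {
                    connected.countDown();
                }
            }
        });
        connected.await(); // block until the session is established
        List<String> children = zk.getChildren("/", false);
        System.out.println("znodes under /: " + children);
        zk.close();
    }
}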
4. Before running the Kafka + Storm integration example, start each of the following:
./zookeeper-3.3.6/bin/zkServer.sh start
./kafka_2.9.2-0.8.1.1/bin/kafka-server-start.sh kafka_2.9.2-0.8.1.1/config/server.properties
./apache-storm-0.9.2-incubating/bin/storm nimbus
./apache-storm-0.9.2-incubating/bin/storm ui
./apache-storm-0.9.2-incubating/bin/storm supervisor
Java code:
KafkaSpouttest.java:
package com.test.stormkafka;

import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;
import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.IRichSpout;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;

/**
 * demo1
 * @author eleme
 */
public class KafkaSpouttest implements IRichSpout {

    private SpoutOutputCollector collector;
    private ConsumerConnector consumer;
    private String topic;

    public KafkaSpouttest() {
    }

    public KafkaSpouttest(String topic) {
        this.topic = topic;
    }

    public void nextTuple() {
    }

    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        this.collector = collector;
    }

    public void ack(Object msgId) {
    }

    public void activate() {
        // Consume directly in activate(); this blocks the executor thread
        // (fine for a demo, see the note after this class).
        consumer = kafka.consumer.Consumer.createJavaConsumerConnector(createConsumerConfig());
        Map<String, Integer> topickMap = new HashMap<String, Integer>();
        topickMap.put(topic, 1);
        System.out.println("*********Results********topic:" + topic);
        Map<String, List<KafkaStream<byte[], byte[]>>> streamMap = consumer.createMessageStreams(topickMap);
        KafkaStream<byte[], byte[]> stream = streamMap.get(topic).get(0);
        ConsumerIterator<byte[], byte[]> it = stream.iterator();
        while (it.hasNext()) {
            String value = new String(it.next().message());
            SimpleDateFormat formatter = new SimpleDateFormat("yyyy年MM月dd日 HH:mm:ss SSS");
            Date curDate = new Date(System.currentTimeMillis()); // current time
            String str = formatter.format(curDate);
            System.out.println("storm接收到来自kafka的消息------->" + value);
            collector.emit(new Values(value, 1, str), value);
        }
    }

    private static ConsumerConfig createConsumerConfig() {
        Properties props = new Properties();
        // ZooKeeper connection string
        // props.put("zookeeper.connect", "m1:2181,m2:2181,s1:2181,s2:2181");
        // props.put("zookeeper.connect", "192.168.101.23:2181");
        props.put("zookeeper.connect", "node:2181");
        // consumer group id
        props.put("group.id", "1");
        // Consumer offsets are stored in ZooKeeper, but not in real time;
        // they are committed at this interval.
        props.put("auto.commit.interval.ms", "1000");
        props.put("zookeeper.session.timeout.ms", "10000");
        return new ConsumerConfig(props);
    }

    public void close() {
    }

    public void deactivate() {
    }

    public void fail(Object msgId) {
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("word", "id", "time"));
    }

    public Map<String, Object> getComponentConfiguration() {
        System.out.println("getComponentConfiguration was called");
        topic = "idoall_testTopic";
        return null;
    }
}
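This hand-rolled spout consumes inside activate() with a blocking loop, which works for a demo but is not how a production spout is written. The pom below already pulls in storm-kafka 0.9.2-incubating, whose stock KafkaSpout emits from nextTuple() and tracks offsets in ZooKeeper. A minimal sketch of wiring it in instead; the zkRoot "/kafka-storm-demo" and id "demo-consumer" are illustrative names, not from the original:

// Imports needed in KafkaTopologytest:
import backtype.storm.spout.SchemeAsMultiScheme;
import storm.kafka.BrokerHosts;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.StringScheme;
import storm.kafka.ZkHosts;

// In main(), instead of builder.setSpout("spout", new KafkaSpouttest(""), 1):
BrokerHosts hosts = new ZkHosts("node:2181");                     // ZooKeeper the broker registers in
SpoutConfig spoutConfig = new SpoutConfig(hosts,
        "idoall_testTopic",                                       // topic, as in the demo
        "/kafka-storm-demo",                                      // zkRoot for offsets (illustrative)
        "demo-consumer");                                         // spout id (illustrative)
spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme()); // decode messages as strings
builder.setSpout("spout", new KafkaSpout(spoutConfig), 1);

Note that the stock spout declares a single output field ("str"), so Bolt1, which reads three fields, would need a matching adjustment.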
KafkaTopologytest.java:
package com.test.stormkafka;

import java.util.HashMap;
import java.util.Map;

import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.topology.BasicOutputCollector;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.topology.base.BaseBasicBolt;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;
import backtype.storm.utils.Utils;

/**
 * demo1
 * @author eleme
 */
public class KafkaTopologytest {

    public static void main(String[] args) {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("spout", new KafkaSpouttest(""), 1);
        builder.setBolt("bolt1", new Bolt1(), 2).shuffleGrouping("spout");
        builder.setBolt("bolt2", new Bolt2(), 2).fieldsGrouping("bolt1", new Fields("word"));

        Map conf = new HashMap();
        conf.put(Config.TOPOLOGY_WORKERS, 1);
        conf.put(Config.TOPOLOGY_DEBUG, true);

        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("my-flume-kafka-storm-topology-integration", conf, builder.createTopology());
        Utils.sleep(1000 * 60 * 5); // local cluster test ...
        cluster.shutdown();
    }

    public static class Bolt1 extends BaseBasicBolt {

        public void execute(Tuple input, BasicOutputCollector collector) {
            try {
                String msg = input.getString(0);
                int id = input.getInteger(1);
                String time = input.getString(2);
                msg = msg + "bolt1";
                System.out.println("对消息加工第1次-------[arg0]:" + msg
                        + "---[arg1]:" + id + "---[arg2]:" + time + "------->" + msg);
                if (msg != null) {
                    collector.emit(new Values(msg));
                }
            } catch (Exception e) {
                e.printStackTrace();
            }
        }

        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("word"));
        }
    }

    public static class Bolt2 extends BaseBasicBolt {

        Map<String, Integer> counts = new HashMap<String, Integer>();

        public void execute(Tuple tuple, BasicOutputCollector collector) {
            String msg = tuple.getString(0);
            msg = msg + "bolt2";
            System.out.println("对消息加工第2次---------->" + msg);
            collector.emit(new Values(msg, 1));
        }

        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            declarer.declare(new Fields("word", "count"));
        }
    }
}
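The demo runs in a LocalCluster inside the JVM, which is why nothing is submitted to the Nimbus started earlier. A minimal sketch of submitting to the single-node cluster instead; the topology name is illustrative:

// Additional import:
import backtype.storm.StormSubmitter;

// In place of the LocalCluster block. main() must then declare `throws Exception`,
// since submitTopology throws checked exceptions (AlreadyAliveException etc.).
Config conf = new Config();
conf.setNumWorkers(1);
conf.setDebug(true);
StormSubmitter.submitTopology("kafka-storm-demo", conf, builder.createTopology());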
pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.test</groupId>
  <artifactId>stormkafka</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <packaging>jar</packaging>
  <name>stormkafka</name>
  <url>http://maven.apache.org</url>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.storm</groupId>
      <artifactId>storm-core</artifactId>
      <version>0.9.2-incubating</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.kafka</groupId>
      <artifactId>kafka_2.9.2</artifactId>
      <version>0.8.1.1</version>
      <exclusions>
        <exclusion>
          <groupId>org.apache.zookeeper</groupId>
          <artifactId>zookeeper</artifactId>
        </exclusion>
        <exclusion>
          <groupId>log4j</groupId>
          <artifactId>log4j</artifactId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.apache.storm</groupId>
      <artifactId>storm-kafka</artifactId>
      <version>0.9.2-incubating</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <version>2.4</version>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
        <executions>
          <execution>
            <id>make-assembly</id>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
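A note on this pom: storm-core is provided scope, so it stays out of the fat jar that the assembly plugin builds, which is what bin/storm jar expects; the LocalCluster demo still runs from Eclipse because the IDE keeps provided dependencies on the classpath. Assuming the default assembly naming, and paired with the StormSubmitter variant sketched above, a cluster run would look like:
mvn clean package
./apache-storm-0.9.2-incubating/bin/storm jar target/stormkafka-0.0.1-SNAPSHOT-jar-with-dependencies.jar com.test.stormkafka.KafkaTopologytest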
1. Run the main method of KafkaTopologytest.
2. Run the Kafka producer:
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic idoall_testTopic
Type messages at the shell prompt:
=======
elemedeMacBook-Pro:ruson eleme$ ./kafka_2.9.2-0.8.1.1/bin/kafka-console-producer.sh --broker-list node:9092 --topic testTopic
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
[2014-12-14 09:58:13,005] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor)
========
[2014-12-14 09:58:23,665] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor)
[2014-12-14 09:59:25,795] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor)
=======
[2014-12-14 10:00:49,132] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor)
[2014-12-14 10:01:04,765] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor)
[2014-12-14 10:01:22,591] INFO Closing socket connection to /127.0.0.1. (kafka.network.Processor)
The Eclipse console shows the following. Each message typed into the producer is emitted by the spout as [word, id, time]; bolt1 appends "bolt1" and bolt2 appends "bolt2":
对消息加工第1次-------[arg0]:bolt1---[arg1]:1---[arg2]:2014年12月14日 09:58:13 144------->bolt1
33664 [Thread-8-bolt1] INFO backtype.storm.daemon.task - Emitting: bolt1 default [bolt1]
33664 [Thread-20-__acker] INFO backtype.storm.daemon.executor - Processing received message source: spout:6, stream: __ack_init, id: {}, [-5044490574571788562 -6834576169241042166 6]
33665 [Thread-8-bolt1] INFO backtype.storm.daemon.task - Emitting: bolt1 __ack_ack [-5044490574571788562 -8030044259304607564]
33665 [Thread-14-bolt2] INFO backtype.storm.daemon.executor - Processing received message source: bolt1:2, stream: default, id: {-5044490574571788562=3578453389181342654}, [bolt1]
对消息加工第2次---------->bolt1bolt2
33665 [Thread-20-__acker] INFO backtype.storm.daemon.executor - Processing received message source: bolt1:2, stream: __ack_ack, id: {}, [-5044490574571788562 -8030044259304607564]
33665 [Thread-14-bolt2] INFO backtype.storm.daemon.task - Emitting: bolt2 default [bolt1bolt2, 1]
33666 [Thread-14-bolt2] INFO backtype.storm.daemon.task - Emitting: bolt2 __ack_ack [-5044490574571788562 3578453389181342654]
33666 [Thread-20-__acker] INFO backtype.storm.daemon.executor - Processing received message source: bolt2:5, stream: __ack_ack, id: {}, [-5044490574571788562 3578453389181342654]
33666 [Thread-20-__acker] INFO backtype.storm.daemon.task - Emitting direct: 6; __acker __ack_ack [-5044490574571788562]
storm接收到来自kafka的消息------->========
44249 [Thread-16-spout] INFO backtype.storm.daemon.task - Emitting: spout default [========, 1, 2014年12月14日 09:58:23 729]
44249 [Thread-16-spout] INFO backtype.storm.daemon.task - Emitting: spout __ack_init [7355713949317680184 8328867585479832335 6]
44249 [Thread-8-bolt1] INFO backtype.storm.daemon.executor - Processing received message source: spout:6, stream: default, id: {7355713949317680184=8328867585479832335}, [========, 1, 2014年12月14日 09:58:23 729]
对消息加工第1次-------[arg0]:========bolt1---[arg1]:1---[arg2]:2014年12月14日 09:58:23 729------->========bolt1
44249 [Thread-20-__acker] INFO backtype.storm.daemon.executor - Processing received message source: spout:6, stream: __ack_init, id: {}, [7355713949317680184 8328867585479832335 6]
44250 [Thread-8-bolt1] INFO backtype.storm.daemon.task - Emitting: bolt1 default [========bolt1]
44250 [Thread-8-bolt1] INFO backtype.storm.daemon.task - Emitting: bolt1 __ack_ack [7355713949317680184 -6542584687991049767]
44250 [Thread-14-bolt2] INFO backtype.storm.daemon.executor - Processing received message source: bolt1:2, stream: default, id: {7355713949317680184=-2980814423219223850}, [========bolt1]
对消息加工第2次---------->========bolt1bolt2
44250 [Thread-14-bolt2] INFO backtype.storm.daemon.task - Emitting: bolt2 default [========bolt1bolt2, 1]
Summary: the Java example was adapted from other people's work. At this point the integration runs successfully, though only on a single machine.