Storm Environment Setup and Demo

Source: Internet; Editor: 程序博客网; Time: 2024/06/05 04:21

Overview
File Downloads
System Environment Setup and Configuration
Storm Demo
Q&A
References
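- Overview

Storm is an open-source distributed real-time computation system that processes large data streams simply and reliably; it is often called "the Hadoop of real-time processing". It has many use cases, such as real-time analytics, online machine learning, continuous computation, distributed RPC, and ETL. Storm scales horizontally, is highly fault-tolerant, guarantees that every message gets processed, and is fast (on a small cluster, each node can reportedly process on the order of a million or more messages per second).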

* File Downloads
1. ZooKeeper download

Download URL: http://apache.fayea.com/zookeeper/zookeeper-3.3.6/zookeeper-3.3.6.tar.gz

2. Storm download

Download URL: http://mirrors.hust.edu.cn/apache/storm/apache-storm-1.0.3/apache-storm-1.0.3.tar.gz

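* System Environment Setup and Configuration
1. Configure ZooKeeper (single-node configuration here)
    * Upload zookeeper-3.3.6.tar.gz to the /home/temp directory on the CentOS server
    * Extract it: tar -zxvf zookeeper-3.3.6.tar.gz
    * Move it to /usr/lib/zookeeper: mv zookeeper-3.3.6 /usr/lib/zookeeper
    * Configure zoo.cfg: cd /usr/lib/zookeeper/conf, then cp zoo_sample.cfg zoo.cfg. Sample configuration: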

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

maxClientCnxns=50
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/var/lib/zookeeper/data
# the port at which the clients will connect
clientPort=2181
# the directory where the transaction logs are stored.
dataLogDir=/var/lib/zookeeper
server.1=hadooplearn:2888:3888

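    * Create myid: echo 1 > myid (keep the spaces; echo 1>myid is parsed as a redirection of file descriptor 1 and writes only an empty line). Then move the file into /var/lib/zookeeper/data, creating the directory first if it does not exist: mv myid /var/lib/zookeeper/data/
    * Start the ZooKeeper service: cd /usr/lib/zookeeper, then bin/zkServer.sh start
    * Verify the ZooKeeper service: bin/zkServer.sh status
    * ZooKeeper is now deployed
2. Configure Storm (single node here)
    * Upload apache-storm-1.0.3.tar.gz to the /home/temp directory on the CentOS server
    * Extract it: tar -zxvf apache-storm-1.0.3.tar.gz
    * Move apache-storm-1.0.3 to /usr/lib/apache-storm
    * Configure storm.yaml: cd /usr/lib/apache-storm/conf. Sample configuration: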

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

########### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
    - "hadooplearn"
#    - "server2"
nimbus.seeds: ["hadooplearn"]
#
#
# ##### These may optionally be filled in:
#
## List of custom serializations
# topology.kryo.register:
#     - org.mycompany.MyType
#     - org.mycompany.MyType2: org.mycompany.MyType2Serializer
#
## List of custom kryo decorators
# topology.kryo.decorators:
#     - org.mycompany.MyDecorator
#
## Locations of the drpc servers
# drpc.servers:
#     - "server1"
#     - "server2"

## Metrics Consumers
# topology.metrics.consumer.register:
#   - class: "org.apache.storm.metric.LoggingMetricsConsumer"
#     parallelism.hint: 1
#   - class: "org.mycompany.MyMetricsConsumer"
#     parallelism.hint: 1
#     argument:
#       - endpoint: "metrics-collector.mycompany.org"

    * Start Nimbus: cd /usr/lib/apache-storm/bin, then ./storm nimbus
    * Start the Storm web UI: cd /usr/lib/apache-storm/bin, then ./storm ui
    * Start the supervisor: cd /usr/lib/apache-storm/bin, then ./storm supervisor
    * To run them as background services instead, start them like this:

nohup ./storm nimbus 1>/dev/null 2>&1 &
nohup ./storm ui 1>/dev/null 2>&1 &
nohup ./storm supervisor 1>/dev/null 2>&1 &
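
Because the background commands discard all output, it can help to confirm that the daemons are actually accepting connections. The following is a minimal sketch, not part of the original setup: the class name PortCheck and the helper isListening are hypothetical, the default host localhost is an assumption, and ports 6627 and 8080 are assumed to be the Storm defaults for nimbus.thrift.port and ui.port, since the storm.yaml above does not override them. Port 2181 is the clientPort from zoo.cfg.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, or unresolvable host: treat as "not up".
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: the daemons run on this machine unless a host is passed in.
        String host = args.length > 0 ? args[0] : "localhost";
        // 2181 comes from zoo.cfg; 6627 and 8080 are assumed Storm defaults.
        int[] ports = {2181, 6627, 8080};
        String[] names = {"ZooKeeper", "Nimbus", "Storm UI"};
        for (int i = 0; i < ports.length; i++) {
            boolean up = isListening(host, ports[i], 2000);
            System.out.println(names[i] + " (" + ports[i] + "): "
                    + (up ? "listening" : "not reachable"));
        }
    }
}
```

An open port only shows that a process is listening; bin/zkServer.sh status and the Storm web UI remain the more reliable health checks.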

* Storm demo

Download URL (a signed, time-limited CSDN link that may no longer resolve): http://dl.download.csdn.net/down11/20170705/66e92331900ddcf5d5b88c37650b23dd.zip?response-content-disposition=attachment%3Bfilename%3D%22weekend-storm.zip%22&OSSAccessKeyId=9q6nvzoJGowBj4q1&Expires=1499244457&Signature=N%2BB5mmaOVn6OCxmCA5M6yLjjmJI%3D

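* Q&A
1. How do I kill a Storm job?

cd /usr/lib/apache-storm/bin, then run ./storm kill '<topology name>'. In the demo the topology name is demotopo, so the command is ./storm kill demotopo.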
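2. How do I develop a spout?
Extend BaseRichSpout and implement three methods: open, declareOutputFields, and nextTuple.
3. How do I develop a bolt?
Extend BaseBasicBolt and implement execute and declareOutputFields. Override prepare only when the bolt needs one-time initialization; BaseBasicBolt already supplies an empty default.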
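4. What does the Storm job submission process look like?
Create a TopologyBuilder instance, builder.
Set the spout with builder.setSpout.
Set the bolts with builder.setBolt. Each bolt must name the upstream component (a spout or another bolt) whose output it consumes, e.g. builder.setBolt("upperbolt", new UpperBolt(), 4).shuffleGrouping("randomspout"); where 4 is the parallelism hint.
Use builder to create the topology: StormTopology demotop = builder.createTopology();
Configure the parameters the topology runs with on the cluster (the conf object used in the next step).
Submit the topology to the Storm cluster: StormSubmitter.submitTopology("demotopo", conf, demotop);
Package the code as a jar and submit it with the storm client. Command format: storm jar <jar path> <package.TopologyClass> [arguments...]. Everything after the class name is passed to that class's main method, for example a topology name. The client normally takes the Nimbus address from storm.yaml, so no IP address or port is given on the command line. e.g. storm jar /home/storm/storm-starter.jar storm.starter.WordCountTopology wordcountTop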
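
To make the data flow behind setSpout, setBolt, and shuffleGrouping more concrete, here is a minimal plain-Java sketch. It deliberately does not use the Storm API, so it runs without a cluster or the Storm jar. Everything in it is a hypothetical stand-in: ShuffleGroupingSketch, nextWord, toUpper, and run are invented names, and the behavior of the demo's randomspout (emit random words) and upperbolt (upper-case them) is assumed from their names, since the demo source is not shown in this post. Real shuffle grouping also balances tuples evenly across tasks; this sketch only picks a task at random.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Random;

public class ShuffleGroupingSketch {

    private static final String[] WORDS = {"storm", "zookeeper", "nimbus", "supervisor"};

    /** Stand-in for a spout's nextTuple(): produces one random word per call. */
    static String nextWord(Random random) {
        return WORDS[random.nextInt(WORDS.length)];
    }

    /** Stand-in for a bolt's execute(): upper-cases the incoming word. */
    static String toUpper(String word) {
        return word.toUpperCase(Locale.ROOT);
    }

    /**
     * Emits tupleCount words and hands each one to a randomly chosen task,
     * roughly what shuffleGrouping does across taskCount bolt tasks.
     * Returns one output list per task.
     */
    static List<List<String>> run(int tupleCount, int taskCount, long seed) {
        Random random = new Random(seed);
        List<List<String>> outputs = new ArrayList<>();
        for (int i = 0; i < taskCount; i++) {
            outputs.add(new ArrayList<>());
        }
        for (int i = 0; i < tupleCount; i++) {
            String word = nextWord(random);        // the "spout" emits a tuple
            int task = random.nextInt(taskCount);  // shuffle: choose any task
            outputs.get(task).add(toUpper(word));  // the "bolt" task processes it
        }
        return outputs;
    }

    public static void main(String[] args) {
        // 4 tasks mirrors the parallelism hint in the setBolt example.
        List<List<String>> outputs = run(20, 4, 42L);
        for (int task = 0; task < outputs.size(); task++) {
            System.out.println("task " + task + " -> " + outputs.get(task));
        }
    }
}
```

In a real topology the same roles are filled by classes extending BaseRichSpout and BaseBasicBolt, and Storm itself routes tuples between tasks, possibly across machines.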
- References

http://zookeeper.apache.org/doc/r3.3.6/zookeeperStarted.html

http://storm.apache.org/releases/1.0.3/Setting-up-a-Storm-cluster.html
