Getting Started with STORM (Trident API, partition)
Source: Internet | Editor: 程序博客网 | Time: 2024/06/06 00:46
partitionAggregate
partitionAggregate runs a given function on each partition of a batch of tuples. Take the following code as an example:
mystream.partitionAggregate(new Fields("b"), new Sum(), new Fields("sum"))
Suppose the input stream contains the fields "a" and "b" and consists of the following partitions of tuples:
Partition 0:
["a", 1]
["b", 2]

Partition 1:
["a", 3]
["c", 8]

Partition 2:
["e", 1]
["d", 9]
["d", 10]
After the code above runs, the output is a stream with a single field named "sum", containing these tuples:
Partition 0:
[3]

Partition 1:
[11]

Partition 2:
[20]
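To make the per-partition semantics concrete, here is a minimal plain-Java sketch that reproduces the sums above without Storm. The class `PartitionSumSketch`, its nested `Pair` type, and the method names are hypothetical and exist only for illustration; they are not part of the Trident API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartitionSumSketch {

    // Hypothetical stand-in for a tuple with the two fields "a" and "b".
    static final class Pair {
        final String a;
        final long b;

        Pair(String a, long b) {
            this.a = a;
            this.b = b;
        }
    }

    // Sums field "b" independently inside each partition: one result per partition.
    static List<Long> sumPerPartition(List<List<Pair>> partitions) {
        List<Long> sums = new ArrayList<>();
        for (List<Pair> partition : partitions) {
            long sum = 0;
            for (Pair tuple : partition) {
                sum += tuple.b;
            }
            sums.add(sum);
        }
        return sums;
    }

    // Builds the three example partitions from the text and sums them.
    static List<Long> exampleSums() {
        List<List<Pair>> partitions = Arrays.asList(
                Arrays.asList(new Pair("a", 1), new Pair("b", 2)),
                Arrays.asList(new Pair("a", 3), new Pair("c", 8)),
                Arrays.asList(new Pair("e", 1), new Pair("d", 9), new Pair("d", 10)));
        return sumPerPartition(partitions);
    }

    public static void main(String[] args) {
        System.out.println(exampleSums()); // prints [3, 11, 20]
    }
}
```

Each partition is reduced on its own, and no value crosses a partition boundary. That is why the two ["d", ...] tuples in Partition 2 contribute to the same sum, while the two ["a", ...] tuples in Partitions 0 and 1 do not.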
Topology
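The stream must be partitioned first. The partitionBy method partitions it by field, so that tuples with the same field value end up in the same partition.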
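The idea behind partitioning by field can be sketched in plain Java: hash the field value and take it modulo the number of target partitions. This is an illustration only. The class `PartitionBySketch` and its methods are hypothetical, and the use of `String.hashCode()` is an assumption; Storm's actual hashing may differ in detail.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PartitionBySketch {

    // Chooses a partition from the field value's hash (assumed scheme, for illustration).
    // floorMod keeps the index non-negative even when hashCode() is negative.
    static int partitionFor(String fieldValue, int numPartitions) {
        return Math.floorMod(fieldValue.hashCode(), numPartitions);
    }

    // Distributes the words over numPartitions buckets.
    static List<List<String>> partition(List<String> words, int numPartitions) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (String word : words) {
            partitions.get(partitionFor(word, numPartitions)).add(word);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("the", "cow", "jumped", "cow", "jumped", "jumped");
        List<List<String>> partitions = partition(words, 3);
        for (int i = 0; i < partitions.size(); i++) {
            System.out.println("Partition " + i + ": " + partitions.get(i));
        }
    }
}
```

Whatever the concrete hash function, the property that matters for a following partitionAggregate is that every occurrence of the same word lands in the same partition.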
package storm.topology;

import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.generated.StormTopology;
import org.apache.storm.trident.TridentTopology;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;
import storm.spout.FixedBatchSpout;
import storm.trident.Split;
import storm.trident.WordAggregat;

public class TridentAggreTopology {

    public static void main(String[] args) throws Exception {
        TridentTopology topology = new TridentTopology();

        // Emits the three sentences below with a maximum batch size of 1.
        FixedBatchSpout spout = new FixedBatchSpout(new Fields("sentence"), 1,
                new Values("the cow jumped "),
                new Values("cow jumped"),
                new Values("jumped"));
        // spout.setCycle(true);

        topology.newStream("batch-spout", spout)
                // Split each sentence into words.
                .each(new Fields("sentence"), new Split(), new Fields("word"))
                // Send equal words to the same partition.
                .partitionBy(new Fields("word"))
                // Aggregate the words within each partition.
                .partitionAggregate(new Fields("word"), new WordAggregat(), new Fields("agg"));

        StormTopology stormTopology = topology.build();

        // Run the topology in an in-process local cluster with debug logging.
        LocalCluster cluster = new LocalCluster();
        Config conf = new Config();
        conf.setDebug(true);
        cluster.submitTopology("soc", conf, stormTopology);
    }
}
WordAggregat
package storm.trident;

import org.apache.storm.trident.operation.BaseAggregator;
import org.apache.storm.trident.operation.TridentCollector;
import org.apache.storm.trident.tuple.TridentTuple;
import org.apache.storm.tuple.Values;

import java.util.HashMap;
import java.util.Map;

public class WordAggregat extends BaseAggregator<Map<String, Integer>> {

    // Called before the tuples of a batch partition are processed; returns a fresh state.
    @Override
    public Map<String, Integer> init(Object batchId, TridentCollector collector) {
        return new HashMap<String, Integer>();
    }

    // Called once per tuple. The counts live in the state object "val" rather than
    // in a static field, so they are safe across threads and workers.
    @Override
    public void aggregate(Map<String, Integer> val, TridentTuple tuple, TridentCollector collector) {
        String word = tuple.getString(0);
        Integer count = val.get(word);
        // The first occurrence of a word counts as 1.
        val.put(word, count == null ? 1 : count + 1);
    }

    // Called after the last tuple of the batch partition; prints and emits the counts.
    @Override
    public void complete(Map<String, Integer> val, TridentCollector collector) {
        for (Map.Entry<String, Integer> entry : val.entrySet()) {
            System.out.println("key= " + entry.getKey() + " and value= " + entry.getValue());
        }
        collector.emit(new Values(val));
    }
}
Result
The expected result is the aggregated number of occurrences of each word.
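As a reference point, the overall word counts for the three sentences fed into the spout can be computed in plain Java, independent of Storm. This sketch assumes that Split breaks a sentence on whitespace (the Split class itself is not shown in this article); `WordCountReference` and `countWords` are hypothetical names used only here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordCountReference {

    // Counts how often each whitespace-separated word occurs across all sentences.
    static Map<String, Integer> countWords(List<String> sentences) {
        // A TreeMap keeps the keys sorted, so the printed order is deterministic.
        Map<String, Integer> counts = new TreeMap<>();
        for (String sentence : sentences) {
            for (String word : sentence.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // a blank sentence yields one empty token
                }
                counts.merge(word, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList("the cow jumped ", "cow jumped", "jumped");
        System.out.println(countWords(sentences)); // prints {cow=2, jumped=3, the=1}
    }
}
```

Trident invokes an aggregator once per batch and per partition, so the maps emitted by the topology may be split up differently from this single overall total.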