Trident WordCount Code Example
Source: Internet. Editor: 程序博客网. Time: 2024/05/22 00:12
Complete code
package com.test;

import backtype.storm.Config;
import backtype.storm.LocalDRPC;
import backtype.storm.StormSubmitter;
import backtype.storm.generated.AlreadyAliveException;
import backtype.storm.generated.DRPCExecutionException;
import backtype.storm.generated.InvalidTopologyException;
import backtype.storm.generated.StormTopology;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;
import backtype.storm.utils.DRPCClient;
import org.apache.thrift7.TException;
import storm.trident.TridentState;
import storm.trident.TridentTopology;
import storm.trident.operation.builtin.Count;
import storm.trident.operation.builtin.FilterNull;
import storm.trident.operation.builtin.MapGet;
import storm.trident.operation.builtin.Sum;
import storm.trident.testing.FixedBatchSpout;
import storm.trident.testing.MemoryMapState;
import storm.trident.testing.Split;

public class WordCount {

    private static StormTopology buildTopology(LocalDRPC drpc) {
        // Create the spout: emits the fixed sentences in batches of at most 3 tuples
        FixedBatchSpout spout = new FixedBatchSpout(new Fields("sentence"), 3,
                new Values("the cow jumped over the moon"),
                new Values("the man went to the store and bought some candy"),
                new Values("four score and seven years ago"),
                new Values("how many apples can you eat"));
        spout.setCycle(true); // keep replaying the sentences in a loop

        // Create the topology
        TridentTopology topology = new TridentTopology();

        // Stream "spout1": split each sentence into words and count every word;
        // the counts are kept in an in-memory map as Trident state
        TridentState wordCounts = topology.newStream("spout1", spout)
                .each(new Fields("sentence"), new Split(), new Fields("word"))
                .groupBy(new Fields("word"))
                .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"))
                .parallelismHint(6);

        // DRPC stream for the function "words": split the argument into words,
        // look up the count of each word, then sum the counts
        topology.newDRPCStream("words", drpc)
                .each(new Fields("args"), new Split(), new Fields("word"))
                .groupBy(new Fields("word"))
                .stateQuery(wordCounts, new Fields("word"), new MapGet(), new Fields("count"))
                .each(new Fields("count"), new FilterNull())
                .aggregate(new Fields("count"), new Sum(), new Fields("sum"));

        return topology.build();
    }

    public static void main(String[] args) {
        Config conf = new Config();
        conf.setMaxSpoutPending(20);
        try {
            // Passing null instead of a LocalDRPC targets a real cluster
            StormSubmitter.submitTopology("WordCount", conf, buildTopology(null));

            // Host and port of the DRPC server, as used by the author
            DRPCClient client = new DRPCClient("wonderwoman", 1234);
            for (int i = 0; i < 100; i++) {
                try {
                    System.out.println("DRPC Result: " + client.execute("words", "cat the dog jumped"));
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    System.out.println(e.getMessage());
                }
            }
        } catch (AlreadyAliveException e) {
            e.printStackTrace();
        } catch (InvalidTopologyException e) {
            e.printStackTrace();
        } catch (TException e) {
            e.printStackTrace();
        } catch (DRPCExecutionException e) {
            e.printStackTrace();
        }
    }
}
POM file
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>wordCount</groupId>
    <artifactId>wordCount</artifactId>
    <version>1.0-SNAPSHOT</version>
    <packaging>jar</packaging>

    <dependencies>
        <dependency>
            <groupId>storm</groupId>
            <artifactId>storm</artifactId>
            <version>0.8.1</version>
            <scope>provided</scope>
        </dependency>
    </dependencies>
</project>
Build and package
mvn clean install
mvn package
Run and check
./bin/storm jar wordCount-1.0-SNAPSHOT.jar com.test.WordCount
./bin/storm list
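Code walkthrough
- Create the spout, which produces the input by cycling through a fixed set of sentences;
- Create the topology;
- Create the stream spout1, which takes the spout as input, splits each sentence into words and counts them; the results are kept in memory as a map and recorded as Trident state;
- Create the DRPC stream words, which takes calls to the DRPC function words as input and splits the argument into words. Using the Trident state, it looks up the count of each input word and then sums the counts;
- The main method calls the DRPC function words and prints the result.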
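To make the data flow in the walkthrough concrete, here is a minimal plain-Java sketch of what the two streams compute, with no Storm dependency. The class WordCountSketch and its methods countWords and sumCounts are hypothetical names invented for this illustration; they are not part of Storm or Trident. The sketch also assumes a single pass over the four sentences, whereas the real spout cycles, so the real counts keep growing.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the topology; illustrative only, not Storm API.
class WordCountSketch {

    // The same four sentences that the FixedBatchSpout emits.
    static final List<String> SENTENCES = Arrays.asList(
            "the cow jumped over the moon",
            "the man went to the store and bought some candy",
            "four score and seven years ago",
            "how many apples can you eat");

    // Stands in for Split + groupBy("word") + persistentAggregate(Count):
    // one pass over the sentences, counting every word in an in-memory map.
    static Map<String, Long> countWords(List<String> sentences) {
        Map<String, Long> counts = new HashMap<>();
        for (String sentence : sentences) {
            for (String word : sentence.split(" ")) {
                counts.merge(word, 1L, Long::sum);
            }
        }
        return counts;
    }

    // Stands in for the DRPC stream: Split + MapGet + FilterNull + Sum.
    static long sumCounts(Map<String, Long> counts, String args) {
        long sum = 0;
        for (String word : args.split(" ")) {
            Long count = counts.get(word); // MapGet: null for an unknown word
            if (count != null) {           // FilterNull: drop unknown words
                sum += count;              // Sum: add up the remaining counts
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Long> counts = countWords(SENTENCES);
        System.out.println(sumCounts(counts, "cat the dog jumped")); // prints 5
    }
}
```

After one pass, "the" has a count of 4 and "jumped" a count of 1, while "cat" and "dog" never appear, so the sketch prints 5. Against a live topology the DRPC result would be larger and would keep increasing, because setCycle(true) makes the spout replay the sentences.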
Afterword
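As a beginner, I spent a long time reading around online before I finally got this running. What is out there is too messy and too scattered; it is a sad paradox.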