Flink Study Notes --- Understanding DataStream WordCount


The contents of pom.xml are as follows:


<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>zetyun</groupId>
  <artifactId>FlinkWordCounts</artifactId>
  <version>1.0-SNAPSHOT</version>
  <inceptionYear>2008</inceptionYear>
  <properties>
    <scala.version>2.11.0</scala.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.version}</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-core -->
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-core</artifactId>
      <version>1.3.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-clients_2.11 -->
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-clients_2.11</artifactId>
      <version>1.3.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-scala_2.11 -->
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-scala_2.11</artifactId>
      <version>1.3.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-streaming-scala_2.11 -->
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-streaming-scala_2.11</artifactId>
      <version>1.3.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-streaming-core -->
    <!-- Note: this is a legacy 0.9.x artifact (later renamed flink-streaming-java). It is probably
         unnecessary alongside flink-streaming-scala_2.11 1.3.0 and may cause classpath conflicts. -->
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-streaming-core</artifactId>
      <version>0.9.1-hadoop1</version>
    </dependency>
  </dependencies>
</project>

The code is as follows:



package zetyun

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.time.Time

/**
  * Created by ryan on 17-7-19.
  */
object DataStreamWordCount {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Read lines of text from a socket; each received line is one stream element
    val text = env.socketTextStream("192.168.1.81", 9999)

    val counts = text
      // lower-case each line, split it on non-word characters, and drop empty tokens
      .flatMap { _.toLowerCase.split("\\W+") filter { _.nonEmpty } }
      .map { (_, 1) }               // turn every word into a (word, 1) pair
      .keyBy(0)                     // partition the stream by the tuple's first field (the word)
      .timeWindow(Time.seconds(5))  // group each key's elements into 5-second tumbling windows
      .sum(1)                       // within each window, sum the second field (the count) per key

    counts.print()

    env.execute("Window Stream WordCount")
  }
}
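To make the transformation chain easier to follow, here is a minimal sketch in plain Scala collections (no Flink involved) of what happens to the lines that land in a single 5-second window. The names WindowWordCountSketch, tokenize, and countWindow are hypothetical and chosen only for illustration. The sketch ignores timing, parallelism, and how Flink actually distributes keys, so treat it as an approximation of the per-window result rather than a description of Flink's internals.

```scala
object WindowWordCountSketch {
  // Same tokenization as the flatMap step above:
  // lower-case the line, split on non-word characters, drop empty tokens.
  def tokenize(line: String): Seq[String] =
    line.toLowerCase.split("\\W+").filter(_.nonEmpty).toSeq

  // Word counts for the lines assumed to fall into ONE window.
  def countWindow(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(tokenize)              // lines -> words
      .map(word => (word, 1))         // word -> (word, 1)
      .groupBy(_._1)                  // rough analogue of keyBy(0)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // analogue of sum(1)

  def main(args: Array[String]): Unit = {
    // Hypothetical input: two lines that arrived within the same window.
    val window = Seq("Hello Flink", "hello, world!")
    countWindow(window).toSeq.sortBy(_._1).foreach(println)
    // prints:
    // (flink,1)
    // (hello,2)
    // (world,1)
  }
}
```

In the streaming job the same counting is repeated independently for every 5-second window, so a word typed in two different windows is reported twice, once per window, rather than accumulated across windows.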



