Flink WordCount Example Walkthrough

Source: Internet  Editor: 程序博客网  Time: 2024/06/16 11:11

    Abstract: This article walks through a simple Flink WordCount program that can be run locally.

Project download: https://github.com/appleappleapple/BigDataLearning

1. Project directory structure


2. The pom file

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.lin</groupId>
  <artifactId>Flink-Demo</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <name>${project.artifactId}</name>

  <properties>
    <maven.compiler.source>1.6</maven.compiler.source>
    <maven.compiler.target>1.6</maven.compiler.target>
    <encoding>UTF-8</encoding>
    <scala.version>2.11.5</scala.version>
    <scala.compat.version>2.11</scala.compat.version>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-scala_2.11</artifactId>
      <version>1.1.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-clients_2.11</artifactId>
      <version>1.1.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-streaming-scala_2.11</artifactId>
      <version>1.1.2</version>
    </dependency>
  </dependencies>

  <build>
    <sourceDirectory>src/main/scala</sourceDirectory>
    <testSourceDirectory>src/test/scala</testSourceDirectory>
    <plugins>
      <plugin>
        <!-- see http://davidb.github.com/scala-maven-plugin -->
        <groupId>net.alchim31.maven</groupId>
        <artifactId>scala-maven-plugin</artifactId>
        <version>3.2.0</version>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
              <goal>testCompile</goal>
            </goals>
            <configuration>
              <args>
                <arg>-make:transitive</arg>
                <arg>-dependencyfile</arg>
                <arg>${project.build.directory}/.scala_dependencies</arg>
              </args>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-surefire-plugin</artifactId>
        <version>2.18.1</version>
        <configuration>
          <useFile>false</useFile>
          <disableXmlReport>true</disableXmlReport>
          <!-- If you have classpath issue like NoDefClassError,... -->
          <!-- useManifestOnlyJar>false</useManifestOnlyJar -->
          <includes>
            <include>**/*Test.*</include>
            <include>**/*Suite.*</include>
          </includes>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

3. Code

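package com.lin.flink.demo

import org.apache.flink.api.java.utils.ParameterTool
import org.apache.flink.api.scala._
import org.apache.flink.core.fs.FileSystem.WriteMode

/**
 * Can be run directly on the local machine.
 */
object WordCount {
  def main(args: Array[String]): Unit = {
    // Local execution environment with parallelism 1
    val env = ExecutionEnvironment.createLocalEnvironment(1)

    // Read the input file from the local file system
    val text = env.readTextFile("D:/Java/flink-1.2.0-bin-hadoop27-scala_2.11/flink-1.2.0-bin-hadoop27-scala_2.11/flink-1.2.0/README.txt")

    // Word count: split each line into lower-case words, emit (word, 1),
    // group by the word (field 0) and sum the counts (field 1)
    val counts = text.flatMap { _.toLowerCase.split("\\W+") filter { _.nonEmpty } }
      .map { (_, 1) }
      .groupBy(0)
      .sum(1)

    // Print the result to the console
    counts.print()

    // Save the result to a text file, overwriting any existing file
    counts.writeAsText("D:/output.txt", WriteMode.OVERWRITE)

    env.execute("Scala WordCount Example")
  }
}

Run the program directly; the word counts are printed to the console.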

The results are also saved to an external file (D:/output.txt in the code above).
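
To see what the flatMap / map / groupBy / sum chain in section 3 computes without starting Flink, the following is a minimal sketch that applies the same steps to an in-memory list using only Scala's standard collections. The names LocalWordCount, tokenize and countWords, as well as the two sample lines, are made up for illustration and are not part of the downloadable project.

```scala
// Plain-Scala sketch of the word-count logic; no Flink dependency.
// LocalWordCount, tokenize and countWords are illustrative names only.
object LocalWordCount {

  // Same tokenization as the Flink job: lower-case the line, split on runs
  // of non-word characters, and drop empty tokens.
  def tokenize(line: String): Seq[String] =
    line.toLowerCase.split("\\W+").filter(_.nonEmpty).toSeq

  // Local counterpart of flatMap -> map((_, 1)) -> groupBy(0) -> sum(1).
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(tokenize)
      .map(word => (word, 1))
      .groupBy(_._1)
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit = {
    // Sample input, invented for this sketch
    val lines = Seq("Apache Flink", "  flink is fast, Flink is simple")
    // Sort by word so the printed order is deterministic
    countWords(lines).toSeq.sortBy(_._1).foreach(println)
  }
}
```

The second sample line starts with spaces on purpose: split("\\W+") returns an empty first token when the line begins with non-word characters, which is why the filter on nonEmpty is needed before counting.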


