[Yunxing Data: Apache Flink in Practice Series (Premium Edition)] Apache Flink Batch API Explained with Hands-On Programming 017: Practical DataSet APIs 017

Source: Internet. Editor: 程序博客网. Time: 2024/06/08 11:55

1. Common Flink DataSetUtils APIs

self

val self: DataSet[T]
Data Set.
Returns the DataSet itself.

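Program:

// Outside the Flink Scala shell, the DataSetUtils methods require:
// import org.apache.flink.api.scala.utils._
//1. Create a DataSet whose elements are of type String
val input: DataSet[String] = benv.fromElements("A", "B", "C", "D", "E", "F")
//2. Get input itself
val s = input.self
//3. Compare the two references
s == input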

Result:

res133: Boolean = true
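The reason `self` hands back the very same object can be illustrated with the implicit-class pattern in plain Scala. This is a minimal sketch, not Flink code: `SeqUtils` and `elementCount` are hypothetical names, and it only assumes that a wrapper class exposes its constructor argument as `val self`.

```scala
object SelfSketch {
  // Hypothetical wrapper: declaring the constructor parameter as
  // `val self` makes the wrapped value itself reachable as `.self`.
  implicit class SeqUtils[T](val self: Seq[T]) {
    // Example enrichment method, defined in terms of the wrapped value.
    def elementCount: Long = self.size.toLong
  }

  // Wraps `input` explicitly and checks that `.self` is the same object.
  def selfIsSameObject[T](input: Seq[T]): Boolean =
    new SeqUtils(input).self eq input

  // Calls the enrichment method through the implicit conversion.
  def countViaWrapper[T](input: Seq[T]): Long = input.elementCount
}
```

Here `eq` compares references explicitly, so the check does not depend on how `equals` is defined for the wrapped type.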

countElementsPerPartition

def countElementsPerPartition: DataSet[(Int, Long)]
Method that goes over all the elements in each partition in order to retrieve the total number of elements.
Returns the number of elements in each partition of the DataSet, as (partition index, element count) tuples.

Program:

//1. Create a DataSet whose elements are of type String
val input: DataSet[String] = benv.fromElements("A", "B", "C", "D", "E", "F")

//2. Before setting the parallelism
val p0 = input.getParallelism
val c0 = input.countElementsPerPartition
c0.collect

//3. After setting the parallelism
// Setting the parallelism to 3 in effect splits the data into 3 partitions
input.setParallelism(3)
val p1 = input.getParallelism
val c1 = input.countElementsPerPartition
c1.collect

Result:

//Before setting the parallelism
p0: Int = 1
c0: Seq[(Int, Long)] = Buffer((0,6))
//After setting the parallelism
p1: Int = 3
c1: Seq[(Int, Long)] = Buffer((0,2), (1,2), (2,2))
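The shape of this result can be reproduced without a cluster. The following is a plain-Scala sketch, not Flink's implementation: `partition` is a hypothetical helper that assumes a round-robin split, whereas Flink may distribute elements differently.

```scala
object PartitionCountSketch {
  // Hypothetical helper: split elements round-robin into `parallelism`
  // partitions. This distribution is an assumption for illustration.
  def partition[T](elements: Seq[T], parallelism: Int): Seq[Seq[T]] =
    (0 until parallelism).map { p =>
      elements.zipWithIndex.collect { case (e, i) if i % parallelism == p => e }
    }

  // Walk over every partition and emit (partition index, element count).
  def countElementsPerPartition[T](partitions: Seq[Seq[T]]): Seq[(Int, Long)] =
    partitions.zipWithIndex.map { case (part, idx) => (idx, part.size.toLong) }
}
```

With the six letters above, one partition yields a single tuple holding the full count, and three partitions yield one tuple per partition with two elements each, matching the output shown.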

checksumHashCode

def checksumHashCode(): ChecksumHashCode
Convenience method to get the count (number of elements) of a DataSet as well as the checksum (sum over element hashes).
Returns the checksum (the sum of the element hash codes) and the number of elements of the DataSet.

Program:

//1. Create a DataSet whose elements are of type String
val input: DataSet[String] = benv.fromElements("A", "B", "C", "D", "E", "F")
//2. Get the checksum and the element count of the DataSet
input.checksumHashCode

Result:

res140: org.apache.flink.api.java.Utils.ChecksumHashCode = ChecksumHashCode 0x0000000000000195, count 6
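The value 0x195 can be checked by hand: the hash codes of "A" through "F" are 65 through 70, and their sum is 405, which is 0x195 in hexadecimal. The sketch below repeats that arithmetic in plain Scala. It is not Flink's code; `ChecksumSketch` is a hypothetical name, and treating each hash as a non-negative 32-bit value is an assumption that makes no difference for these inputs.

```scala
object ChecksumSketch {
  // Returns (checksum, count): the sum of the element hash codes,
  // each widened to a non-negative Long (an assumption), and the
  // number of elements.
  def checksumHashCode[T](elements: Seq[T]): (Long, Long) = {
    val checksum = elements.map(e => e.hashCode & 0xffffffffL).sum
    (checksum, elements.size.toLong)
  }

  // Formats the checksum the way the shell output above displays it.
  def render(checksum: Long, count: Long): String =
    f"ChecksumHashCode 0x$checksum%016x, count $count"
}
```

Because the checksum is a plain sum, it does not depend on element order, which makes it convenient for comparing the results of two jobs whose output order may differ.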

zipWithIndex

def zipWithIndex: DataSet[(Long, T)]
Method that takes a set of subtask index, total number of elements mappings and assigns ids to all the elements from the input data set.
Pairs each element with its index.

Program:

//1. Create a DataSet whose elements are of type String
val input: DataSet[String] = benv.fromElements("A", "B", "C", "D", "E", "F")
//2. Zip each element with its index
val result: DataSet[(Long, String)] = input.zipWithIndex
//3. Show the result
result.collect

Result:

res134: Seq[(Long, String)] = Buffer((0,A), (1,B), (2,C), (3,D), (4,E), (5,F))

[Figure: execution of the job in the Flink web UI]
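The method description for zipWithIndex mentions a mapping from subtask index to element count, which suggests a two-pass scheme: count the elements per partition first, then let each partition number its elements starting from the total of all earlier partitions. A plain-Scala sketch of that idea follows. It is an illustration under that assumption, not Flink's implementation, and `ZipWithIndexSketch` is a hypothetical name.

```scala
object ZipWithIndexSketch {
  def zipWithIndex[T](partitions: Seq[Seq[T]]): Seq[Seq[(Long, T)]] = {
    // Pass 1: count the elements of every partition.
    val counts = partitions.map(_.size.toLong)
    // Prefix sums: offsets(i) is the number of elements in partitions 0 until i.
    val offsets = counts.scanLeft(0L)(_ + _)
    // Pass 2: each partition numbers its elements from its own offset.
    partitions.zip(offsets).map { case (part, start) =>
      part.zipWithIndex.map { case (e, i) => (start + i, e) }
    }
  }
}
```

Since every partition knows its starting offset before the second pass, the partitions can assign consecutive, non-overlapping indices independently of each other.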
