Flink Runtime 1.0 Notes: Plan 2 Task
Source: Internet  Editor: 程序博客网  Time: 2024/06/12 21:44
About
I will try to give the main line of how Flink builds the logical plan into the physical plan, and the physical plan into tasks.
Main classes and methods are mentioned.
Format explanation
this is Class
this is method()
this is constant
Logical Plan 2 Physical Plan 2 Task
JobGraph -> ExecutionGraph, on JobManager
- JobManager receives the JobGraph submitted from the Client, using ZooKeeper to persist it
- attachJobGraph() adds each JobVertex to the ExecutionGraph, doing the transformation
- scheduleForExecution() gives the Scheduler to the ExecutionGraph, doing the kick-off for tasks
Logical Plan
JobGraph is composed of a map of JobVertex, which is called taskVertices.
JobVertex is composed of a list of IntermediateDataSet, a list of JobEdge as inputs, an operatorName, an invokableClassName, and the parallelism.
JobEdge has a DistributionPattern and one IntermediateDataSet as its source.
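To make the shape of the logical plan concrete, here is a minimal sketch of the three structures as plain Java classes. These are simplified stand-ins, not the real Flink classes: keying the map by operator name, the `results` field name, and the `connectTo()` and `example()` helpers are all assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-ins for the logical plan; the real Flink classes carry far more state.
class LogicalPlanSketch {
    enum DistributionPattern { ALL_TO_ALL, POINTWISE }

    // A data set produced by one vertex and consumed through JobEdges.
    static class IntermediateDataSet {
        final JobVertex producer;
        IntermediateDataSet(JobVertex producer) { this.producer = producer; }
    }

    // An edge knows its source data set and how that data set is distributed.
    static class JobEdge {
        final IntermediateDataSet source;
        final DistributionPattern pattern;
        JobEdge(IntermediateDataSet source, DistributionPattern pattern) {
            this.source = source;
            this.pattern = pattern;
        }
    }

    static class JobVertex {
        final String operatorName;
        final String invokableClassName;
        final int parallelism;
        final List<IntermediateDataSet> results = new ArrayList<>(); // assumed field name
        final List<JobEdge> inputs = new ArrayList<>();

        JobVertex(String operatorName, String invokableClassName, int parallelism) {
            this.operatorName = operatorName;
            this.invokableClassName = invokableClassName;
            this.parallelism = parallelism;
        }

        // Hypothetical helper: consume a fresh data set produced by the upstream vertex.
        void connectTo(JobVertex upstream, DistributionPattern pattern) {
            IntermediateDataSet dataSet = new IntermediateDataSet(upstream);
            upstream.results.add(dataSet);
            inputs.add(new JobEdge(dataSet, pattern));
        }
    }

    static class JobGraph {
        // Keyed by operator name here for brevity (an assumption of this sketch).
        final Map<String, JobVertex> taskVertices = new LinkedHashMap<>();
        void addVertex(JobVertex vertex) { taskVertices.put(vertex.operatorName, vertex); }
    }

    // Builds a two-vertex graph: source -> map, connected pointwise.
    static JobGraph example() {
        JobVertex source = new JobVertex("source", "example.SourceInvokable", 2);
        JobVertex map = new JobVertex("map", "example.MapInvokable", 4);
        map.connectTo(source, DistributionPattern.POINTWISE);
        JobGraph graph = new JobGraph();
        graph.addVertex(source);
        graph.addVertex(map);
        return graph;
    }
}
```

In the example graph, the edge held by the "map" vertex points back to the data set that the "source" vertex produces, which is the link the physical plan later expands per subtask.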
Physical Plan
ExecutionGraph is composed of a map of ExecutionJobVertex as its tasks.
ExecutionJobVertex holds the whole ExecutionGraph, one JobVertex, an array of ExecutionVertex as the taskVertices, a list of IntermediateResult as inputs, and an array of IntermediateResult as the producedDataSets. These are initialized during construction from the JobVertex.
ExecutionVertex is created during the construction of ExecutionJobVertex, one per unit of the vertex parallelism. It is a single subtask of the execution. An Execution manages the real execution in it.
Besides, ExecutionEdge is created during ExecutionJobVertex.connectToPredecessors(): a JobEdge maps to ExecutionEdge instances during ExecutionVertex.connectSource(), where DistributionPattern takes effect.
If DistributionPattern is ALL_TO_ALL, each ExecutionVertex gets as many ExecutionEdge instances as the source has partitions; else if DistributionPattern is POINTWISE, compare the parallelism with the source partition number: if equal, make it one-to-one; else if the parallelism is bigger, zero-or-one-to-one; else, one-to-many.
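The two patterns can be sketched as a function that returns, for one consumer subtask, the indices of the source partitions it reads. This is a simplified model and not Flink's actual code: the class name, the method name, and the integer arithmetic used to spread partitions when the counts differ are my own assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of how a distribution pattern picks source partitions for one subtask.
class DistributionSketch {
    enum Pattern { ALL_TO_ALL, POINTWISE }

    static List<Integer> sourcesFor(Pattern pattern, int subtaskIndex, int parallelism, int numSources) {
        List<Integer> sources = new ArrayList<>();
        if (pattern == Pattern.ALL_TO_ALL) {
            // Every subtask reads every source partition.
            for (int i = 0; i < numSources; i++) {
                sources.add(i);
            }
        } else if (numSources == parallelism) {
            // Equal counts: one-to-one.
            sources.add(subtaskIndex);
        } else if (numSources < parallelism) {
            // More subtasks than partitions: each subtask reads a single partition,
            // so several subtasks share one partition (assumed even spreading).
            sources.add(subtaskIndex * numSources / parallelism);
        } else {
            // More partitions than subtasks: each subtask reads a contiguous range.
            int start = subtaskIndex * numSources / parallelism;
            int end = (subtaskIndex + 1) * numSources / parallelism;
            for (int i = start; i < end; i++) {
                sources.add(i);
            }
        }
        return sources;
    }
}
```

For instance, `sourcesFor(Pattern.POINTWISE, 1, 2, 4)` returns `[2, 3]`: with 4 source partitions and parallelism 2, the second subtask reads the second half of the partitions.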
Schedule 4 Execution
Once all preparing phases are done, JobManager invokes ExecutionGraph.scheduleForExecution(), given a Scheduler coming from the construction.
The default schedule mode is FROM_SOURCES: traverse the vertices and call ExecutionJobVertex.scheduleAll() only on the input vertices; otherwise, in ALL mode, all vertices are scheduled.
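The difference between the two modes can be sketched as a filter over the vertices. This is only an illustration under my own assumptions: `Vertex`, `isInputVertex`, and `verticesToSchedule()` are hypothetical names, and "input vertex" is taken to mean a vertex with no incoming edges.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of which vertices get scheduled up front in each schedule mode.
class ScheduleModeSketch {
    enum ScheduleMode { FROM_SOURCES, ALL }

    // Hypothetical stand-in for ExecutionJobVertex.
    static class Vertex {
        final String name;
        final boolean isInputVertex; // assumed: true when the vertex has no inputs

        Vertex(String name, boolean isInputVertex) {
            this.name = name;
            this.isInputVertex = isInputVertex;
        }
    }

    static List<String> verticesToSchedule(List<Vertex> vertices, ScheduleMode mode) {
        List<String> scheduled = new ArrayList<>();
        for (Vertex vertex : vertices) {
            if (mode == ScheduleMode.ALL || vertex.isInputVertex) {
                // Adding the name stands in for calling scheduleAll() on the vertex.
                scheduled.add(vertex.name);
            }
        }
        return scheduled;
    }
}
```

In FROM_SOURCES mode only the sources are scheduled by this loop; the downstream vertices are left for later, which this sketch does not model.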
The real tasks in it are the ExecutionVertex instances; location hosts may be set according to instance info from the scheduler, then ExecutionVertex.scheduleForExecution() is invoked on each of them.
Inside ExecutionVertex, Execution.scheduleForExecution() is invoked, which gets a slot from the scheduler and deploys onto the assigned slot, whether the scheduling is QUEUED or immediate (the default is not queued).
According to the vertex's SlotSharingGroup and CoLocationConstraint, a ScheduledUnit is created, which is handled by the Scheduler to do the task scheduling. Considering the preferred locations and the sharing group, a Slot is finally returned to the Execution to do deployToSlot().
deployToSlot() aims to distribute tasks. First, a TaskDeploymentDescriptor is created by the ExecutionVertex, which contains job info, task info, configuration, className, jar files, etc. Second, a SubmitTask message is sent to the target Instance of the Slot, and from there it goes to the TaskManager.
Task Launched
Once TaskManager receives the SubmitTask message, a Task is created and started, and an ack message is sent back to the sender.
Once started, a handful of things will be done
- checking and updating ExecutionState
- loading classes and the invokable
- network being controlled by NetworkEnvironment
- creating TaskInputSplitProvider to ask for the NextInputSplit (it should be source)
- RuntimeEnvironment created and set to invokable, managers like MemoryManager, IOManager, BroadcastVariableManager all sit there
- StateHandle created and set to invokable
- invokable invoked
Task is a wrapper around a thread; the more specific exception handling, life-cycle management, and observing things are omitted here.
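The "wrapper as a thread" idea can be sketched roughly as below. This is a toy model under my own assumptions, not the real Task class: `TaskSketch`, its method names, and the reduced set of states are hypothetical, and all the setup steps listed above are collapsed into a comment.

```java
import java.util.concurrent.atomic.AtomicReference;

// Toy model of a task that wraps a thread and tracks its own execution state.
class TaskSketch implements Runnable {
    enum State { CREATED, DEPLOYING, RUNNING, FINISHED, FAILED }

    private final AtomicReference<State> state = new AtomicReference<>(State.CREATED);
    private final Runnable invokable;
    private final Thread executingThread;

    TaskSketch(String taskName, Runnable invokable) {
        this.invokable = invokable;
        this.executingThread = new Thread(this, taskName);
    }

    void startTaskThread() {
        executingThread.start();
    }

    // Blocks until the task thread has ended.
    void awaitCompletion() {
        try {
            executingThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    State getState() {
        return state.get();
    }

    @Override
    public void run() {
        // Check and update the state first; bail out if the task was already started.
        if (!state.compareAndSet(State.CREATED, State.DEPLOYING)) {
            return;
        }
        try {
            // Class loading, network setup, and environment creation would happen here.
            state.set(State.RUNNING);
            invokable.run();
            state.set(State.FINISHED);
        } catch (Throwable t) {
            // Any failure during setup or in the invokable marks the task as failed.
            state.set(State.FAILED);
        }
    }
}
```

Calling `startTaskThread()` and then `awaitCompletion()` on a task whose invokable returns normally leaves it in FINISHED; an invokable that throws leaves it in FAILED.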
:)