Spark source code: Stage and Task
Source: Internet  Editor: 程序博客网  Time: 2024/06/05 09:56
1. DAGScheduler
private def submitMissingTasks(stage: Stage) {
  logDebug("submitMissingTasks(" + stage + ")")
  // Get our pending tasks and remember them in our pendingTasks entry
  val myPending = pendingTasks.getOrElseUpdate(stage, new HashSet)
  myPending.clear()
  var tasks = ArrayBuffer[Task[_]]()
  if (stage.isShuffleMap) {
    for (p <- 0 until stage.numPartitions if stage.outputLocs(p) == Nil) {
      val locs = getPreferredLocs(stage.rdd, p)
      tasks += new ShuffleMapTask(stage.id, stage.rdd, stage.shuffleDep.get, p, locs)
    }
  } else {
    // This is a final stage; figure out its job's missing partitions
    val job = resultStageToJob(stage)
    // (The excerpt of submitMissingTasks stops here. `job` is an ActiveJob, whose definition follows.)
private[spark] class ActiveJob(
    val runId: Int,
    val finalStage: Stage,
    val func: (TaskContext, Iterator[_]) => _,
    val partitions: Array[Int],
    val callSite: String,
    val listener: JobListener) {
  // The job's partition count is determined here by the number of RDD partitions passed in
  val numPartitions = partitions.length
  val finished = Array.fill[Boolean](numPartitions)(false)
  var numFinished = 0
}
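To make the role of `finished` and `numFinished` concrete, here is a minimal, self-contained sketch of how a job's bookkeeping could be used to work out which partitions still need a task. It is not Spark's code: `SimpleJob`, `markFinished` and `missingPartitions` are hypothetical names, and the sketch assumes that the truncated final-stage branch above simply walks the indices whose `finished` flag is still false.

```scala
// Simplified stand-in for ActiveJob. SimpleJob, markFinished and
// missingPartitions are hypothetical names used only for illustration.
final class SimpleJob(val partitions: Array[Int]) {
  // Same bookkeeping fields as in the ActiveJob excerpt.
  val numPartitions: Int = partitions.length
  val finished: Array[Boolean] = Array.fill[Boolean](numPartitions)(false)
  var numFinished: Int = 0

  // Record that the output at index `id` has arrived, counting each index once.
  def markFinished(id: Int): Unit = {
    if (!finished(id)) {
      finished(id) = true
      numFinished += 1
    }
  }

  // RDD partition numbers whose results are still outstanding
  // (assumption: one task would be created for each of these).
  def missingPartitions: Seq[Int] =
    (0 until numPartitions).filter(id => !finished(id)).map(id => partitions(id))
}

object SimpleJobDemo {
  def main(args: Array[String]): Unit = {
    val job = new SimpleJob(Array(0, 2, 5))
    job.markFinished(1) // the result for RDD partition 2 has arrived
    println(job.missingPartitions.mkString(","))
  }
}
```

Note that `finished` is indexed by position in `partitions`, not by RDD partition number, which is why the sketch maps each unfinished index back through `partitions`.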
def runApproximateJob[T, U, R](
    rdd: RDD[T],
    func: (TaskContext, Iterator[T]) => U,
    evaluator: ApproximateEvaluator[U, R],
    callSite: String,
    timeout: Long)
  : PartialResult[R] =
{
  val listener = new ApproximateActionListener(rdd, func, evaluator, timeout)
  val func2 = func.asInstanceOf[(TaskContext, Iterator[_]) => _]
  val partitions = (0 until rdd.partitions.size).toArray
  eventQueue.put(JobSubmitted(rdd, func2, partitions, false, callSite, listener))
  return listener.awaitResult() // Will throw an exception if the job fails
}
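The method above follows a submit-then-wait pattern: the caller builds a listener, puts a `JobSubmitted` event on `eventQueue`, and blocks in `listener.awaitResult()` until the scheduler side reports back. The sketch below imitates that pattern with plain JDK concurrency classes. It is only an illustration under stated assumptions: `DemoListener`, `DemoJobSubmitted` and `MiniScheduler` are hypothetical names, the "tasks" run inline on the scheduler thread, and failure handling and timeouts are left out.

```scala
import java.util.concurrent.{CountDownLatch, LinkedBlockingQueue}

// Hypothetical listener: stores one result per partition index and
// releases the waiting caller once every partition has reported.
final class DemoListener(numPartitions: Int) {
  private val results = new Array[Int](numPartitions)
  private val remaining = new CountDownLatch(numPartitions)

  def taskSucceeded(index: Int, result: Int): Unit = {
    results(index) = result
    remaining.countDown()
  }

  // Blocks until all partitions are done, then returns the results in index order.
  def awaitResult(): Seq[Int] = {
    remaining.await()
    results.toSeq
  }
}

// Hypothetical event, loosely modeled on JobSubmitted.
final case class DemoJobSubmitted(
    partitions: Array[Int],
    func: Int => Int,
    listener: DemoListener)

// Hypothetical scheduler: a single daemon thread drains the event queue.
final class MiniScheduler {
  private val eventQueue = new LinkedBlockingQueue[DemoJobSubmitted]()

  private val loop = new Thread(new Runnable {
    def run(): Unit = {
      while (true) {
        val event = eventQueue.take() // blocks until a job is submitted
        for ((p, index) <- event.partitions.zipWithIndex) {
          // Assumption: each "task" is run inline instead of on an executor.
          event.listener.taskSucceeded(index, event.func(p))
        }
      }
    }
  })
  loop.setDaemon(true)
  loop.start()

  // Same shape as the excerpt: build a listener, enqueue the event, wait.
  def runJob(numPartitions: Int, func: Int => Int): Seq[Int] = {
    val listener = new DemoListener(numPartitions)
    val partitions = (0 until numPartitions).toArray
    eventQueue.put(DemoJobSubmitted(partitions, func, listener))
    listener.awaitResult()
  }
}
```

For example, `new MiniScheduler().runJob(3, p => p * 10)` blocks the caller until the scheduler thread has processed all three partitions. Routing every submission through one queue keeps the scheduling state confined to a single thread, so the caller only needs the listener to synchronize on.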