Spark Streaming KafkaDirectDStream: how checkpointing works
JobGenerator.generateJobs is responsible for producing Streaming jobs. After the jobs for a batch have been generated and submitted for execution, it posts a DoCheckpoint event. Source:
```scala
private def generateJobs(time: Time) {
  // Set the SparkEnv in this thread, so that job generation code can
  // access the environment when spawning threads.
  SparkEnv.set(ssc.env)
  Try {
    jobScheduler.receiverTracker.allocateBlocksToBatch(time) // allocate received blocks to this batch
    graph.generateJobs(time)                                 // generate jobs using the allocated blocks
  } match {
    case Success(jobs) =>
      val streamIdToInputInfos = jobScheduler.inputInfoTracker.getInfo(time)
      val streamIdToNumRecords = streamIdToInputInfos.mapValues(_.numRecords)
      jobScheduler.submitJobSet(JobSet(time, jobs, streamIdToNumRecords))
    case Failure(e) =>
      jobScheduler.reportError("Error generating jobs for time " + time, e)
  }
  eventLoop.post(DoCheckpoint(time, clearCheckpointDataLater = false))
}
```
The code above shows that every batch of generated Streaming jobs triggers a checkpoint attempt.
When JobGenerator.processEvent receives the DoCheckpoint event, it calls JobGenerator.doCheckpoint to handle it.
JobGenerator.doCheckpoint first calls DStreamGraph.updateCheckpointData to checkpoint the output DStreams,
then uses CheckpointWriter to write the checkpoint information to the checkpoint directory. Source:
```scala
private def doCheckpoint(time: Time, clearCheckpointDataLater: Boolean) {
  if (shouldCheckpoint && (time - graph.zeroTime).isMultipleOf(ssc.checkpointDuration)) {
    logInfo("Checkpointing graph for time " + time)
    ssc.graph.updateCheckpointData(time)
    checkpointWriter.write(new Checkpoint(ssc, time), clearCheckpointDataLater)
  }
}
```
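Note the guard: a checkpoint is only written for batches whose offset from the graph's zero time is an exact multiple of the checkpoint interval. A minimal pure-Scala sketch of that arithmetic (times in milliseconds; the object name and values are illustrative, not Spark APIs):

```scala
// Models the time test in JobGenerator.doCheckpoint:
// checkpoint only when (time - zeroTime) is a multiple of the checkpoint interval.
object CheckpointTrigger {
  def shouldWrite(timeMs: Long, zeroTimeMs: Long, checkpointIntervalMs: Long): Boolean =
    (timeMs - zeroTimeMs) % checkpointIntervalMs == 0

  def main(args: Array[String]): Unit = {
    val zero     = 0L
    val batch    = 2000L   // 2 s batch duration
    val interval = 10000L  // 10 s checkpoint interval
    // Of the first 10 batches, only every 5th one triggers a checkpoint write.
    val triggering = (1 to 10).map(_ * batch).filter(shouldWrite(_, zero, interval))
    println(triggering.mkString(","))  // 10000,20000
  }
}
```

So with a 2 s batch duration and a 10 s checkpoint interval, only batches 10000 and 20000 ms actually write a checkpoint; the DoCheckpoint events of the other batches are no-ops.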
Next, let's look at how the DStreams themselves are checkpointed:

```scala
def updateCheckpointData(time: Time) {
  logInfo("Updating checkpoint data for time " + time)
  this.synchronized {
    outputStreams.foreach(_.updateCheckpointData(time))
  }
  logInfo("Updated checkpoint data for time " + time)
}
```
So DStreamGraph.updateCheckpointData simply converts each output DStream's state into its checkpoint representation. The actual conversion happens in DStream.updateCheckpointData, which updates the DStream's own checkpoint data and then recursively does the same for every DStream it depends on. Source:
```scala
private[streaming] def updateCheckpointData(currentTime: Time) {
  logDebug("Updating checkpoint data for time " + currentTime)
  checkpointData.update(currentTime)
  dependencies.foreach(_.updateCheckpointData(currentTime))
  logDebug("Updated checkpoint data for time " + currentTime + ": " + checkpointData)
}
```
DirectKafkaInputDStreamCheckpointData updates its checkpoint information as follows:
```scala
def batchForTime: mutable.HashMap[Time, Array[(String, Int, Long, Long)]] = {
  data.asInstanceOf[mutable.HashMap[Time, Array[OffsetRange.OffsetRangeTuple]]]
}

override def update(time: Time) {
  batchForTime.clear()
  generatedRDDs.foreach { kv =>
    val a = kv._2.asInstanceOf[KafkaRDD[K, V, U, T, R]].offsetRanges.map(_.toTuple).toArray
    batchForTime += kv._1 -> a
  }
}
```
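Conceptually, update(time) discards the previous contents and refills the map with one entry per generated batch, each holding the (topic, partition, fromOffset, untilOffset) tuples of that batch's KafkaRDD. A simplified model in plain Scala (Spark's Time, KafkaRDD, and OffsetRange are replaced by Long keys and tuples for illustration):

```scala
import scala.collection.mutable

// Simplified model of DirectKafkaInputDStreamCheckpointData.update: the
// checkpoint payload is a map from batch time (ms) to the offset-range tuples
// (topic, partition, fromOffset, untilOffset) of that batch's KafkaRDD.
object OffsetCheckpointModel {
  type OffsetRangeTuple = (String, Int, Long, Long)

  val generatedRDDs = mutable.HashMap[Long, Array[OffsetRangeTuple]]()
  val batchForTime  = mutable.HashMap[Long, Array[OffsetRangeTuple]]()

  def update(): Unit = {
    batchForTime.clear()                  // drop stale entries
    generatedRDDs.foreach { case (time, ranges) =>
      batchForTime += time -> ranges      // snapshot the offsets of each live batch
    }
  }

  def main(args: Array[String]): Unit = {
    generatedRDDs += 1000L -> Array(("logs", 0, 0L, 50L), ("logs", 1, 0L, 42L))
    update()
    println(batchForTime(1000L).map(_._4).sum)  // 92
  }
}
```

Because only offset tuples are serialized (not the consumed records), recovery can rebuild each batch's KafkaRDD by re-reading exactly those offset ranges from Kafka.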
In Spark, every job set that has not yet finished successfully is kept in JobScheduler.jobSets; a job set is removed only after all of its jobs have completed. Source:

```scala
def submitJobSet(jobSet: JobSet) {
  if (jobSet.jobs.isEmpty) {
    logInfo("No jobs added for time " + jobSet.time)
  } else {
    listenerBus.post(StreamingListenerBatchSubmitted(jobSet.toBatchInfo))
    jobSets.put(jobSet.time, jobSet)
    jobSet.jobs.foreach(job => jobExecutor.execute(new JobHandler(job)))
    logInfo("Added jobs for time " + jobSet.time)
  }
}
```
```scala
private def handleJobCompletion(job: Job) {
  job.result match {
    case Success(_) =>
      val jobSet = jobSets.get(job.time)
      jobSet.handleJobCompletion(job)
      logInfo("Finished job " + job.id + " from job set of time " + jobSet.time)
      if (jobSet.hasCompleted) {
        jobSets.remove(jobSet.time)
        jobGenerator.onBatchCompletion(jobSet.time)
        logInfo("Total delay: %.3f s for time %s (execution: %.3f s)".format(
          jobSet.totalDelay / 1000.0, jobSet.time.toString,
          jobSet.processingDelay / 1000.0
        ))
        listenerBus.post(StreamingListenerBatchCompleted(jobSet.toBatchInfo))
      }
    case Failure(e) =>
      reportError("Error running job " + job, e)
  }
}
```
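The effect of this bookkeeping is that a batch's JobSet lingers in jobSets until every job in it has completed, so a crash leaves the unfinished batch times behind to be written out as Checkpoint.pendingTimes. A minimal model of that lifecycle (plain Scala; the JobSet here is a stand-in for Spark's, not its real class):

```scala
import scala.collection.mutable

// Minimal model of JobScheduler's jobSets bookkeeping: a batch's job set is
// registered on submit and removed only once all of its jobs have completed.
object JobSetTracker {
  case class JobSet(time: Long, total: Int, var completed: Int = 0) {
    def hasCompleted: Boolean = completed == total
  }

  val jobSets = mutable.HashMap[Long, JobSet]()

  def submit(time: Long, jobs: Int): Unit = jobSets.put(time, JobSet(time, jobs))

  def handleJobCompletion(time: Long): Unit = {
    val js = jobSets(time)
    js.completed += 1
    if (js.hasCompleted) jobSets.remove(time)  // batch fully done: forget it
  }

  def main(args: Array[String]): Unit = {
    submit(1000L, 2); submit(2000L, 1)
    handleJobCompletion(1000L)           // only one of batch 1000's two jobs done
    handleJobCompletion(2000L)           // batch 2000 fully done and removed
    println(jobSets.keys.mkString(","))  // 1000  (still pending, would be checkpointed)
  }
}
```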
Checkpoint.graph is the application's DStreamGraph, and DStreamGraph.outputStreams holds the DStreams whose state must be checkpointed. Checkpoint.pendingTimes records the batches whose jobs have not yet completed successfully; all of this is included when the Checkpoint is saved to HDFS. For the checkpoint produced by a previous run of a Spark Streaming application to be of any use, it must be passed in when the StreamingContext is created; the previous run's checkpoint can be read back with CheckpointReader.read.
If the StreamingContext is created from the previous run's Checkpoint, the DStreamGraph stored in that Checkpoint, which contains the DStreams to be checkpointed, becomes this application's DStreamGraph, and the previous run's DStream state is restored from it. Source:
```scala
private[streaming] val graph: DStreamGraph = {
  if (isCheckpointPresent) {
    cp_.graph.setContext(this)
    cp_.graph.restoreCheckpointData()
    cp_.graph
  } else {
    require(batchDur_ != null, "Batch duration for StreamingContext cannot be null")
    val newGraph = new DStreamGraph()
    newGraph.setBatchDuration(batchDur_)
    newGraph
  }
}
```
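From the application's side, the usual way to feed the old checkpoint into StreamingContext construction is StreamingContext.getOrCreate, which reads the checkpoint (via CheckpointReader) if one exists and otherwise runs the supplied creation function. A hedged sketch of that pattern; the checkpoint directory path and the stream wiring are placeholders, not part of the source walked through above:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Sketch of the standard recovery pattern. "hdfs:///checkpoints/app" and the
// body of createContext are illustrative placeholders.
object RecoverableApp {
  val checkpointDir = "hdfs:///checkpoints/app"

  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("RecoverableApp")
    val ssc = new StreamingContext(conf, Seconds(2))
    ssc.checkpoint(checkpointDir)  // enables the DoCheckpoint path described above
    // ... set up the Kafka direct stream and output operations here ...
    ssc
  }

  def main(args: Array[String]): Unit = {
    // First run: createContext() is invoked and a fresh DStreamGraph is built.
    // After a crash: the context, including the DStreamGraph and pendingTimes,
    // is rebuilt from the checkpoint files instead.
    val ssc = StreamingContext.getOrCreate(checkpointDir, () => createContext())
    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note that all stream setup must live inside the creation function: on a restart it is skipped entirely, because the graph comes from the checkpoint.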
JobGenerator.start kicks off Streaming job generation. If checkpoint information is present, it calls JobGenerator.restart, which first recovers the jobs that had been generated but had not finished when the previous run crashed, then fills in the batches that fell between the crash and the restart, and has Spark execute all of those missed jobs before resuming normal generation. Source:

```scala
def start(): Unit = synchronized {
  if (eventLoop != null) return // generator has already been started

  eventLoop = new EventLoop[JobGeneratorEvent]("JobGenerator") {
    override protected def onReceive(event: JobGeneratorEvent): Unit = processEvent(event)

    override protected def onError(e: Throwable): Unit = {
      jobScheduler.reportError("Error in job generator", e)
    }
  }
  eventLoop.start()

  if (ssc.isCheckpointPresent) {
    restart()
  } else {
    startFirstTime()
  }
}
```
```scala
private def restart() {
  // If a manual clock is used for testing, set it to the last checkpointed
  // time, plus an optional configured jump.
  if (clock.isInstanceOf[ManualClock]) {
    val lastTime = ssc.initialCheckpoint.checkpointTime.milliseconds
    val jumpTime = ssc.sc.conf.getLong("spark.streaming.manualClock.jump", 0)
    clock.asInstanceOf[ManualClock].setTime(lastTime + jumpTime)
  }

  val batchDuration = ssc.graph.batchDuration

  // Batches when the driver was down, that is, between the last checkpoint
  // and the restart time
  val checkpointTime = ssc.initialCheckpoint.checkpointTime
  val restartTime = new Time(timer.getRestartTime(graph.zeroTime.milliseconds))
  val downTimes = checkpointTime.until(restartTime, batchDuration)
  logInfo("Batches during down time (" + downTimes.size + " batches): "
    + downTimes.mkString(", "))

  // Batches that were unprocessed before the failure
  val pendingTimes = ssc.initialCheckpoint.pendingTimes.sorted(Time.ordering)
  logInfo("Batches pending processing (" + pendingTimes.size + " batches): " +
    pendingTimes.mkString(", "))

  // Reschedule the jobs for all of these batch times
  val timesToReschedule = (pendingTimes ++ downTimes).distinct.sorted(Time.ordering)
  logInfo("Batches to reschedule (" + timesToReschedule.size + " batches): " +
    timesToReschedule.mkString(", "))
  timesToReschedule.foreach { time =>
    // Allocate any received-but-unallocated blocks to the batch before
    // regenerating its jobs
    jobScheduler.receiverTracker.allocateBlocksToBatch(time)
    jobScheduler.submitJobSet(JobSet(time, graph.generateJobs(time)))
  }

  // Restart the timer to resume normal job generation
  timer.start(restartTime.milliseconds)
  logInfo("Restarted JobGenerator at " + restartTime)
}
```
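The down-time enumeration is just range arithmetic: assuming Time.until behaves like a numeric range (start inclusive, end exclusive, stepping by the batch duration), the missed batch times can be sketched in plain Scala like this (object and values are illustrative, not Spark APIs):

```scala
// Models checkpointTime.until(restartTime, batchDuration) in restart():
// every batch time from the last checkpoint up to (but excluding) the
// restart time, stepping by the batch duration.
object DownTimeModel {
  def downTimes(checkpointMs: Long, restartMs: Long, batchMs: Long): Seq[Long] =
    checkpointMs until restartMs by batchMs

  def main(args: Array[String]): Unit = {
    // Checkpoint at t = 10 s, driver back at t = 18 s, 2 s batches:
    // the batches at 10, 12, 14 and 16 s were missed and get rescheduled.
    println(downTimes(10000L, 18000L, 2000L).mkString(","))
    // 10000,12000,14000,16000
  }
}
```

These down-time batches are then merged with pendingTimes (deduplicated and sorted) before being resubmitted, so a batch that was both pending at the crash and inside the down window is only rescheduled once.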