Lesson 13: Spark Streaming Source Code Walkthrough: Driver Fault Tolerance
Source: Internet. Editor: 程序博客网. Time: 2024/06/06 23:51
In this lesson:
1. Fault tolerance of ReceivedBlockTracker
2. Fault tolerance of DStream and JobGenerator
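The Driver directs the whole Spark program, so its safety is critical. Here we look at driver safety mainly from the Spark Streaming angle: the metadata of the data being processed is saved through a WAL (write ahead log), while fault tolerance at the driver and scheduling-logic level uses checkpoints.
ReceivedBlockTracker manages the metadata of the data a Spark Streaming program processes; this is the data level. DStream and JobGenerator are the core of framework scheduling; they are the business-logic level and the job-generation level respectively. All three keep state, and all three need fault tolerance.
Fault tolerance means saving historical state and, after a failure, recovering from that saved state.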
I. ReceivedBlockTracker fault tolerance
The class comment of ReceivedBlockTracker states its purpose clearly:
/**
 * Class that keep track of all the received blocks, and allocate them to batches
 * when required. All actions taken by this class can be saved to a write ahead log
 * (if a checkpoint directory has been provided), so that the state of the tracker
 * (received blocks and block-to-batch allocations) can be recovered after driver failure.
 *
 * Note that when any instance of this class is created with a checkpoint directory,
 * it will try reading events from logs in the directory.
 */
private[streaming] class ReceivedBlockTracker(
    conf: SparkConf,
    hadoopConf: Configuration,
    streamIds: Seq[Int],
    clock: Clock,
    recoverFromWriteAheadLog: Boolean,
    checkpointDirOption: Option[String])
  extends Logging {
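How does ReceivedBlockTracker handle what it receives? What it receives is metadata, a ReceivedBlockInfo, which is a simple case class holding streamId, numRecords, metadata, and a ReceivedBlockStoreResult.
After ReceivedBlockTracker receives the metadata that ReceiverSupervisorImpl reports for the data a receiver has taken in, it first saves that metadata through writeToLog (the so-called cold backup). Only then does it write to the in-memory structure streamIdToUnallocatedBlockQueues, a HashMap[Int, ReceivedBlockQueue] where ReceivedBlockQueue is a Queue[ReceivedBlockInfo], for JobGenerator to use.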
/** Add received block. This event will get written to the write ahead log (if enabled). */
def addBlock(receivedBlockInfo: ReceivedBlockInfo): Boolean = {
  try {
    val writeResult = writeToLog(BlockAdditionEvent(receivedBlockInfo))
    if (writeResult) {
      synchronized {
        getReceivedBlockQueue(receivedBlockInfo.streamId) += receivedBlockInfo
      }
      logDebug(s"Stream ${receivedBlockInfo.streamId} received " +
        s"block ${receivedBlockInfo.blockStoreResult.blockId}")
    } else {
      logDebug(s"Failed to acknowledge stream ${receivedBlockInfo.streamId} receiving " +
        s"block ${receivedBlockInfo.blockStoreResult.blockId} in the Write Ahead Log.")
    }
    writeResult
  } catch {
    case NonFatal(e) =>
      logError(s"Error adding block $receivedBlockInfo", e)
      false
  }
}
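The log-first ordering of addBlock can be illustrated with a small self-contained sketch. Everything here (BlockInfo, ToyLog, ToyTracker) is a hypothetical stand-in invented for illustration, not Spark's real classes; the only point is that in-memory state changes only after the log write has succeeded.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for Spark's internal types; not the real classes.
final case class BlockInfo(streamId: Int, blockId: String)

sealed trait TrackerEvent
final case class BlockAdded(info: BlockInfo) extends TrackerEvent

// A toy "write ahead log": an in-memory buffer that can be told to fail.
class ToyLog(var failWrites: Boolean = false) {
  val records = mutable.ArrayBuffer.empty[TrackerEvent]
  def write(e: TrackerEvent): Boolean =
    if (failWrites) false else { records += e; true }
}

class ToyTracker(log: ToyLog) {
  private val unallocated = mutable.HashMap.empty[Int, mutable.Queue[BlockInfo]]

  // Log first; only touch in-memory state if the log write succeeded.
  def addBlock(info: BlockInfo): Boolean = synchronized {
    val ok = log.write(BlockAdded(info))
    if (ok) {
      unallocated.getOrElseUpdate(info.streamId, mutable.Queue.empty[BlockInfo]) += info
    }
    ok
  }

  // Blocks received for a stream that have not been allocated to a batch yet.
  def pending(streamId: Int): List[BlockInfo] = synchronized {
    unallocated.get(streamId).map(_.toList).getOrElse(Nil)
  }
}
```

With this ordering, a block whose log write failed never becomes visible in memory, so the in-memory queues never contain an entry that the log does not also contain.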
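What has been saved so far is the metadata of received data that has not yet been allocated. Next, let's see how allocation works.
Before blocks are allocated to a job, the metadata is again saved through writeToLog first, so that after a later failure it can be recovered from the log.
The allocated data is keyed by batch time and kept in the in-memory structure timeToAllocatedBlocks, a HashMap[Time, AllocatedBlocks]. AllocatedBlocks is a case class that holds streamIdToAllocatedBlocks, a Map[Int, Seq[ReceivedBlockInfo]].
Because timeToAllocatedBlocks can hold data for many time windows, state and window operations become possible.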
/**
 * Allocate all unallocated blocks to the given batch.
 * This event will get written to the write ahead log (if enabled).
 */
def allocateBlocksToBatch(batchTime: Time): Unit = synchronized {
  if (lastAllocatedBatchTime == null || batchTime > lastAllocatedBatchTime) {
    // Data may arrive from several input streams at the same time;
    // drain the blocks received by every stream.
    val streamIdToBlocks = streamIds.map { streamId =>
      (streamId, getReceivedBlockQueue(streamId).dequeueAll(x => true))
    }.toMap
    // Wrap the drained blocks as this batch's allocation.
    val allocatedBlocks = AllocatedBlocks(streamIdToBlocks)
    // Write to the log first.
    if (writeToLog(BatchAllocationEvent(batchTime, allocatedBlocks))) {
      timeToAllocatedBlocks.put(batchTime, allocatedBlocks)
      lastAllocatedBatchTime = batchTime
    } else {
      logInfo(s"Possibly processed batch $batchTime need to be processed again in WAL recovery")
    }
  } else {
    // This situation occurs when:
    // 1. WAL is ended with BatchAllocationEvent, but without BatchCleanupEvent,
    // possibly processed batch job or half-processed batch job need to be processed again,
    // so the batchTime will be equal to lastAllocatedBatchTime.
    // 2. Slow checkpointing makes recovered batch time older than WAL recovered
    // lastAllocatedBatchTime.
    // This situation will only occurs in recovery time.
    logInfo(s"Possibly processed batch $batchTime need to be processed again in WAL recovery")
  }
}
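The allocation step can be sketched without Spark as follows. ToyAllocator and its members are hypothetical names, block metadata is reduced to a String id, and the write-ahead-log call is deliberately left out so that only the drain-and-key-by-batch-time logic and the lastAllocatedBatchTime guard remain.

```scala
import scala.collection.mutable

// Hypothetical, simplified model of block-to-batch allocation (no WAL here).
class ToyAllocator(streamIds: Seq[Int]) {
  private val unallocated = mutable.HashMap.empty[Int, mutable.Queue[String]]
  private val timeToAllocated = mutable.HashMap.empty[Long, Map[Int, List[String]]]
  private var lastAllocatedBatchTime: Option[Long] = None

  def addBlock(streamId: Int, blockId: String): Unit = synchronized {
    unallocated.getOrElseUpdate(streamId, mutable.Queue.empty[String]) += blockId
  }

  // Drain every stream's queue and record the result under the batch time.
  // A batch time that is not newer than the last allocated one is ignored.
  def allocateBlocksToBatch(batchTime: Long): Unit = synchronized {
    if (lastAllocatedBatchTime.forall(batchTime > _)) {
      val allocated = streamIds.map { id =>
        val queue = unallocated.getOrElseUpdate(id, mutable.Queue.empty[String])
        val drained = queue.toList
        queue.clear()
        id -> drained
      }.toMap
      timeToAllocated.put(batchTime, allocated)
      lastAllocatedBatchTime = Some(batchTime)
    }
  }

  def blocksOfBatch(batchTime: Long, streamId: Int): List[String] = synchronized {
    timeToAllocated.get(batchTime).flatMap(_.get(streamId)).getOrElse(Nil)
  }
}
```

Calling allocateBlocksToBatch twice with the same batch time leaves the first allocation untouched, which mirrors the else branch in the source above: blocks that arrived in between stay queued for the next batch.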
Note that the batchTime parameter of allocateBlocksToBatch(batchTime: Time) is passed in by JobGenerator, in its generateJobs(time: Time) method, through the statement jobScheduler.receiverTracker.allocateBlocksToBatch(time). That time originally comes from the timer RecurringTimer(clock, ssc.graph.batchDuration.milliseconds, longTime => eventLoop.post(GenerateJobs(new Time(longTime))), "JobGenerator").
The timer is very important in Spark Streaming.
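To make the role of the timer concrete, here is a minimal sketch of the sequence of batch times such a timer would hand to generateJobs. It assumes that ticks are aligned to multiples of the batch duration and start at the first multiple strictly after the current time; that alignment is an assumption of this sketch rather than something shown in the excerpt above, and firstTick and ticks are hypothetical helper names.

```scala
// Assumption: ticks fall on multiples of periodMs, starting strictly after nowMs.
def firstTick(nowMs: Long, periodMs: Long): Long =
  (nowMs / periodMs + 1) * periodMs

// The first `count` batch times a periodic timer would emit under that assumption.
def ticks(nowMs: Long, periodMs: Long, count: Int): List[Long] =
  List.tabulate(count)(i => firstTick(nowMs, periodMs) + i * periodMs)
```

For example, with a 1000 ms batch duration and a start at 2500 ms, ticks(2500L, 1000L, 3) yields List(3000, 4000, 5000); each of these values plays the role of the longTime wrapped into GenerateJobs(new Time(longTime)).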
As shown above, ReceivedBlockTracker writes the metadata to the WAL when data is received. It does the same when data is discarded:
/**
 * Clean up block information of old batches. If waitForCompletion is true, this method
 * returns only after the files are cleaned up.
 */
def cleanupOldBatches(cleanupThreshTime: Time, waitForCompletion: Boolean): Unit = synchronized {
  require(cleanupThreshTime.milliseconds < clock.getTimeMillis())
  val timesToCleanup = timeToAllocatedBlocks.keys.filter { _ < cleanupThreshTime }.toSeq
  logInfo("Deleting batches " + timesToCleanup)
  if (writeToLog(BatchCleanupEvent(timesToCleanup))) {
    timeToAllocatedBlocks --= timesToCleanup
    writeAheadLogOption.foreach(_.clean(cleanupThreshTime.milliseconds, waitForCompletion))
  } else {
    logWarning("Failed to acknowledge batch clean up in the Write Ahead Log.")
  }
}
The writeToLog method used to write the log looks like this:
/** Write an update to the tracker to the write ahead log */
private def writeToLog(record: ReceivedBlockTrackerLogEvent): Boolean = {
  if (isWriteAheadLogEnabled) {
    logTrace(s"Writing record: $record")
    try {
      writeAheadLogOption.get.write(ByteBuffer.wrap(Utils.serialize(record)),
        clock.getTimeMillis())
      true
    } catch {
      case NonFatal(e) =>
        logWarning(s"Exception thrown while writing record: $record to the WriteAheadLog.", e)
        false
    }
  } else {
    true
  }
}
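Since every addition, allocation, and cleanup is logged as an event, the tracker's state can in principle be rebuilt by replaying those events in order. The sketch below illustrates that idea only; LogEvent, Added, Allocated, CleanedUp, Recovered, and replay are hypothetical names, block metadata is reduced to a String id, and this is not Spark's actual recovery code.

```scala
import scala.collection.mutable

// Hypothetical event types mirroring the three kinds of logged actions.
sealed trait LogEvent
final case class Added(streamId: Int, blockId: String) extends LogEvent
final case class Allocated(batchTime: Long, blocks: Map[Int, List[String]]) extends LogEvent
final case class CleanedUp(batchTimes: Seq[Long]) extends LogEvent

final case class Recovered(
    unallocated: Map[Int, List[String]],
    allocated: Map[Long, Map[Int, List[String]]])

// Rebuild in-memory state by applying the logged events in order.
def replay(events: Seq[LogEvent]): Recovered = {
  val unallocated = mutable.HashMap.empty[Int, List[String]]
  val allocated = mutable.HashMap.empty[Long, Map[Int, List[String]]]
  events.foreach {
    case Added(id, block) =>
      unallocated.update(id, unallocated.getOrElse(id, Nil) :+ block)
    case Allocated(time, blocks) =>
      // An allocation consumed every block that was queued at that moment.
      unallocated.clear()
      allocated.update(time, blocks)
    case CleanedUp(times) =>
      times.foreach(t => allocated.remove(t))
  }
  Recovered(unallocated.toMap, allocated.toMap)
}
```

Replaying a log that ends with an Added event leaves that block in the unallocated map, which is the state a restarted driver would need in order to assign it to the next batch.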
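II. DStream and JobGenerator fault tolerance
The above is the data level, where the tracker's actions are written to the WAL whenever a checkpoint directory has been provided (as the ReceivedBlockTracker class comment says). Next we look at the business-logic level and the job-generation level, which use checkpoints. In essence checkpointing and the WAL serve the same purpose; only the mechanism and the way data is read and written differ.
Checkpoints are usually written to HDFS, at an interval of the batch duration (at the data level, by contrast, a WAL record is written whenever data arrives or is discarded). The current state is checkpointed both when a job is generated and when it completes.
After JobGenerator produces jobs in its generateJobs method, it posts the message DoCheckpoint(time, clearCheckpointDataLater = false):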
/** Generate jobs and perform checkpoint for the given `time`. */
private def generateJobs(time: Time) {
  // Set the SparkEnv in this thread, so that job generation code can access the environment
  // Example: BlockRDDs are created in this thread, and it needs to access BlockManager
  // Update: This is probably redundant after threadlocal stuff in SparkEnv has been removed.
  SparkEnv.set(ssc.env)
  Try {
    jobScheduler.receiverTracker.allocateBlocksToBatch(time) // allocate received blocks to batch
    graph.generateJobs(time) // generate jobs using allocated block
  } match {
    case Success(jobs) =>
      val streamIdToInputInfos = jobScheduler.inputInfoTracker.getInfo(time)
      jobScheduler.submitJobSet(JobSet(time, jobs, streamIdToInputInfos))
    case Failure(e) =>
      jobScheduler.reportError("Error generating jobs for time " + time, e)
  }
  eventLoop.post(DoCheckpoint(time, clearCheckpointDataLater = false))
}
When JobGenerator clears DStream metadata, it also posts the message DoCheckpoint(time, clearCheckpointDataLater = true):
/** Clear DStream metadata for the given `time`. */
private def clearMetadata(time: Time) {
  ssc.graph.clearMetadata(time)

  // If checkpointing is enabled, then checkpoint,
  // else mark batch to be fully processed
  if (shouldCheckpoint) {
    eventLoop.post(DoCheckpoint(time, clearCheckpointDataLater = true))
  } else {
    // If checkpointing is not enabled, then delete metadata information about
    // received blocks (block data not saved in any case). Otherwise, wait for
    // checkpointing of this batch to complete.
    val maxRememberDuration = graph.getMaxInputStreamRememberDuration()
    jobScheduler.receiverTracker.cleanupOldBlocksAndBatches(time - maxRememberDuration)
    jobScheduler.inputInfoTracker.cleanup(time - maxRememberDuration)
    markBatchFullyProcessed(time)
  }
}
When JobGenerator receives the DoCheckpoint message, it calls its own doCheckpoint method:
/** Processes all events */
private def processEvent(event: JobGeneratorEvent) {
  logDebug("Got event " + event)
  event match {
    case GenerateJobs(time) => generateJobs(time)
    case ClearMetadata(time) => clearMetadata(time)
    case DoCheckpoint(time, clearCheckpointDataLater) =>
      doCheckpoint(time, clearCheckpointDataLater)
    case ClearCheckpointData(time) => clearCheckpointData(time)
  }
}
The doCheckpoint method in JobGenerator(jobScheduler: JobScheduler):
/** Perform checkpoint for the give `time`. */
private def doCheckpoint(time: Time, clearCheckpointDataLater: Boolean) {
  if (shouldCheckpoint && (time - graph.zeroTime).isMultipleOf(ssc.checkpointDuration)) {
    logInfo("Checkpointing graph for time " + time)
    ssc.graph.updateCheckpointData(time)
    checkpointWriter.write(new Checkpoint(ssc, time), clearCheckpointDataLater)
  }
}
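The condition in doCheckpoint means that a checkpoint is written only for batch times whose distance from the graph's zero time is a whole multiple of the checkpoint interval. The following sketch works through that arithmetic with plain millisecond values; isMultipleOf here is a hypothetical helper on Long, assumed to behave like a simple remainder check, and shouldWriteCheckpoint is an illustrative name.

```scala
// Hypothetical helper: true when elapsedMs is a whole multiple of intervalMs.
def isMultipleOf(elapsedMs: Long, intervalMs: Long): Boolean =
  elapsedMs % intervalMs == 0

// Mirrors the shape of the condition in doCheckpoint, using plain Longs.
def shouldWriteCheckpoint(
    enabled: Boolean,
    timeMs: Long,
    zeroTimeMs: Long,
    checkpointIntervalMs: Long): Boolean =
  enabled && isMultipleOf(timeMs - zeroTimeMs, checkpointIntervalMs)
```

As a worked example, take a zero time of 1000 ms, a batch every 2000 ms, and a checkpoint interval of 10000 ms. Of the batch times 3000, 5000, ..., 21000, only 11000 and 21000 satisfy the condition, so under these assumed values one batch in five would trigger a checkpoint write.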
As you can see, the actual update of the checkpoint data is carried out in DStreamGraph.
DStreamGraph:
def updateCheckpointData(time: Time) {
  logInfo("Updating checkpoint data for time " + time)
  this.synchronized {
    outputStreams.foreach(_.updateCheckpointData(time))
  }
  logInfo("Updated checkpoint data for time " + time)
}

def clearCheckpointData(time: Time) {
  logInfo("Clearing checkpoint data for time " + time)
  this.synchronized {
    outputStreams.foreach(_.clearCheckpointData(time))
  }
  logInfo("Cleared checkpoint data for time " + time)
}
DStreamGraph routes the call to DStream:
/**
 * Refresh the list of checkpointed RDDs that will be saved along with checkpoint of
 * this stream. This is an internal method that should not be called directly. This is
 * a default implementation that saves only the file names of the checkpointed RDDs to
 * checkpointData. Subclasses of DStream (especially those of InputDStream) may override
 * this method to save custom checkpoint data.
 */
private[streaming] def updateCheckpointData(currentTime: Time) {
  logDebug("Updating checkpoint data for time " + currentTime)
  checkpointData.update(currentTime)
  dependencies.foreach(_.updateCheckpointData(currentTime))
  logDebug("Updated checkpoint data for time " + currentTime + ": " + checkpointData)
}

private[streaming] def clearCheckpointData(time: Time) {
  logDebug("Clearing checkpoint data")
  checkpointData.cleanup(time)
  dependencies.foreach(_.clearCheckpointData(time))
  logDebug("Cleared checkpoint data")
}
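The call chain above starts at the graph's output streams and walks back through each stream's dependencies. A minimal sketch of that traversal, using hypothetical ToyStream and ToyGraph classes that only record the visiting order, might look like this:

```scala
import scala.collection.mutable

// Hypothetical stand-in for a DStream: it only knows its name and its parents.
class ToyStream(val name: String, val dependencies: List[ToyStream] = Nil) {
  // Update this stream first, then recurse into every parent stream.
  def updateCheckpointData(time: Long, visited: mutable.ArrayBuffer[String]): Unit = {
    visited += name
    dependencies.foreach(_.updateCheckpointData(time, visited))
  }
}

// Hypothetical stand-in for DStreamGraph: the traversal starts at the outputs.
class ToyGraph(outputStreams: List[ToyStream]) {
  def updateCheckpointData(time: Long): List[String] = synchronized {
    val visited = mutable.ArrayBuffer.empty[String]
    outputStreams.foreach(_.updateCheckpointData(time, visited))
    visited.toList
  }
}
```

For a chain input -> mapped -> output, the traversal visits output, then mapped, then input. A stream reachable through several paths would be visited once per path in this sketch.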
How is shouldCheckpoint in JobGenerator determined? It is true only when both the checkpoint duration and the checkpoint directory have been set on the StreamingContext:
// This is marked lazy so that this is initialized after checkpoint duration has been set
// in the context and the generator has been started.
private lazy val shouldCheckpoint = ssc.checkpointDuration != null && ssc.checkpointDir != null
The Checkpoint class:
private[streaming]
class Checkpoint(ssc: StreamingContext, val checkpointTime: Time)
  extends Logging with Serializable {
  val master = ssc.sc.master
  val framework = ssc.sc.appName
  val jars = ssc.sc.jars
  val graph = ssc.graph
  val checkpointDir = ssc.checkpointDir
  val checkpointDuration = ssc.checkpointDuration
  val pendingTimes = ssc.scheduler.getPendingTimes().toArray
  val delaySeconds = MetadataCleaner.getDelaySeconds(ssc.conf)
  val sparkConfPairs = ssc.conf.getAll
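The StreamingContext constructors show that when a checkpoint path is passed in, the context is rebuilt from the Checkpoint object that CheckpointReader.read loads from that path, which is how the driver's state is restored: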
/**
 * Recreate a StreamingContext from a checkpoint file.
 * @param path Path to the directory that was specified as the checkpoint directory
 * @param hadoopConf Optional, configuration object if necessary for reading from
 *                   HDFS compatible filesystems
 */
def this(path: String, hadoopConf: Configuration) =
  this(null, CheckpointReader.read(path, new SparkConf(), hadoopConf).get, null)

/**
 * Recreate a StreamingContext from a checkpoint file using an existing SparkContext.
 * @param path Path to the directory that was specified as the checkpoint directory
 * @param sparkContext Existing SparkContext
 */
def this(path: String, sparkContext: SparkContext) = {
  this(
    sparkContext,
    CheckpointReader.read(path, sparkContext.conf, sparkContext.hadoopConfiguration).get,
    null)
}
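These notes are based on Wang Jialin's (王家林) course "源码版本定制发行班" (the source-code customization and release class); many thanks to him.
Technical discussion is welcome. Let's learn and improve together!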