Spark 2.1 RDD persist process

This post walks through the call chain of `RDD.persist` in Spark 2.1, from the public API down to `SparkContext.persistRDD`.
Source: Internet · Editor: 程序博客网 · Time: 2024/05/20 18:46
The no-argument `persist()` is just a convenience overload that uses the default storage level:

```scala
/** Persist this RDD with the default storage level (`MEMORY_ONLY`). */
def persist(): this.type = persist(StorageLevel.MEMORY_ONLY)
```
```scala
/**
 * Set this RDD's storage level to persist its values across operations after the first time
 * it is computed. This can only be used to assign a new storage level if the RDD does not
 * have a storage level set yet. Local checkpointing is an exception.
 */
def persist(newLevel: StorageLevel): this.type = {
  if (isLocallyCheckpointed) {
    // This means the user previously called localCheckpoint(), which should have already
    // marked this RDD for persisting. Here we should override the old storage level with
    // one that is explicitly requested by the user (after adapting it to use disk).
    persist(LocalRDDCheckpointData.transformStorageLevel(newLevel), allowOverride = true)
  } else {
    persist(newLevel, allowOverride = false)
  }
}
```
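The comment above says the requested level is adapted "to use disk" on the locally-checkpointed path, because local checkpoint blocks must survive memory eviction. As a rough illustration of that adaptation (a sketch only, not Spark's actual `LocalRDDCheckpointData.transformStorageLevel`; `Level` here is a made-up stand-in for Spark's `StorageLevel`):

```scala
// Hypothetical model of a storage level: just the two flags we care about here.
case class Level(useDisk: Boolean, useMemory: Boolean)

object TransformSketch {
  // Sketch of the adaptation: keep the user's memory preference,
  // but force the disk flag on so blocks are not lost to eviction.
  def transformStorageLevel(level: Level): Level =
    level.copy(useDisk = true)
}
```

So a user asking for a memory-only level on a locally-checkpointed RDD would effectively get a memory-and-disk level.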
Both branches delegate to the private overload `persist(newLevel: StorageLevel, allowOverride: Boolean)`:
```scala
/**
 * Mark this RDD for persisting using the specified level.
 *
 * @param newLevel the target storage level
 * @param allowOverride whether to override any existing level with the new one
 */
private def persist(newLevel: StorageLevel, allowOverride: Boolean): this.type = {
  // TODO: Handle changes of StorageLevel
  if (storageLevel != StorageLevel.NONE && newLevel != storageLevel && !allowOverride) {
    throw new UnsupportedOperationException(
      "Cannot change storage level of an RDD after it was already assigned a level")
  }
  // If this is the first time this RDD is marked for persisting, register it
  // with the SparkContext for cleanups and accounting. Do this only once.
  if (storageLevel == StorageLevel.NONE) {
    sc.cleaner.foreach(_.registerRDDForCleanup(this))
    sc.persistRDD(this)
  }
  storageLevel = newLevel
  this
}
```
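The behavior of this guard can be seen in a minimal self-contained sketch (not Spark code; `FakeRDD` and its stubbed `StorageLevel` values are invented for illustration, and registration is modeled with a boolean instead of `sc.persistRDD`):

```scala
object PersistGuardSketch {
  // Stub enumeration standing in for Spark's StorageLevel.
  sealed trait StorageLevel
  case object NONE extends StorageLevel
  case object MEMORY_ONLY extends StorageLevel
  case object MEMORY_AND_DISK extends StorageLevel

  final class FakeRDD {
    private var storageLevel: StorageLevel = NONE
    private var registered = false

    def persist(newLevel: StorageLevel, allowOverride: Boolean = false): this.type = {
      // Changing an already-assigned level is rejected unless allowOverride is set
      // (the local-checkpoint path is the one caller that sets it).
      if (storageLevel != NONE && newLevel != storageLevel && !allowOverride)
        throw new UnsupportedOperationException(
          "Cannot change storage level of an RDD after it was already assigned a level")
      // Register-once bookkeeping, like sc.persistRDD in the real code.
      if (storageLevel == NONE) registered = true
      storageLevel = newLevel
      this
    }

    def level: StorageLevel = storageLevel
    def isRegistered: Boolean = registered
  }
}
```

Calling `persist` twice with the same level is harmless; calling it with a different level throws unless the override flag is set.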
`sc.persistRDD` is defined in `SparkContext`:

```scala
/**
 * Register an RDD to be persisted in memory and/or disk storage
 */
private[spark] def persistRDD(rdd: RDD[_]) {
  persistentRdds(rdd.id) = rdd
}
```
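The effect is that `SparkContext` keeps a map from RDD id to RDD, so the same RDD is tracked at most once and can be looked up later (e.g. for unpersisting or cleanup). A toy model of that bookkeeping (a sketch only; `Registry` is invented, and a plain mutable map stands in for Spark's actual `persistentRdds` collection):

```scala
import scala.collection.mutable

// Hypothetical stand-in for the SparkContext-side registry.
// Values are strings here purely as placeholders for RDD[_] instances.
class Registry {
  val persistentRdds = mutable.Map[Int, String]()

  def persistRDD(id: Int, rdd: String): Unit =
    persistentRdds(id) = rdd // keyed by rdd.id, so re-registration is idempotent
}
```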