Spark Core 2.0: Partition and HadoopPartition
Source: Internet · Edited by: 程序博客网 · Time: 2024/04/29 21:16
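In Spark, Partition is a trait that identifies a single partition of an RDD. Its only abstract member is index, the partition's position within its parent RDD.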
/**
 * An identifier for a partition in an RDD.
 */
trait Partition extends Serializable {
  /**
   * Get the partition's index within its parent RDD
   */
  def index: Int

  // A better default implementation of HashCode
  override def hashCode(): Int = index

  override def equals(other: Any): Boolean = super.equals(other)
}
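To illustrate what the trait asks of an implementer, here is a minimal sketch. It redefines the trait locally (copied from the listing above) so it runs without Spark on the classpath; `RangeSlicePartition` and `PartitionDemo` are hypothetical names invented for this example and are not part of Spark.

```scala
// Local copy of the trait from the listing above, so the sketch is self-contained.
trait Partition extends Serializable {
  def index: Int
  override def hashCode(): Int = index
  override def equals(other: Any): Boolean = super.equals(other)
}

// Hypothetical implementation: a partition that covers the range [start, end).
class RangeSlicePartition(override val index: Int, val start: Long, val end: Long)
  extends Partition

object PartitionDemo {
  def main(args: Array[String]): Unit = {
    val a = new RangeSlicePartition(0, 0L, 100L)
    val b = new RangeSlicePartition(0, 0L, 100L)

    // hashCode is simply the index, so the two instances hash alike.
    println(a.hashCode() == b.hashCode())
    // equals delegates to AnyRef, so distinct instances are never equal.
    println(a == b)
    println(a == a)
  }
}
```

Because `equals` falls through to reference equality, two partition objects with the same index are still distinct; only the hash code is derived from the index.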
/**
 * A Spark split class that wraps around a Hadoop InputSplit.
 */
private[spark] class HadoopPartition(rddId: Int, override val index: Int, s: InputSplit)
  extends Partition {

  val inputSplit = new SerializableWritable[InputSplit](s)

  override def hashCode(): Int = 31 * (31 + rddId) + index

  override def equals(other: Any): Boolean = super.equals(other)

  /**
   * Get any environment variables that should be added to the users environment when running pipes
   * @return a Map with the environment variables and corresponding values, it could be empty
   */
  def getPipeEnvVars(): Map[String, String] = {
    val envVars: Map[String, String] = if (inputSplit.value.isInstanceOf[FileSplit]) {
      val is: FileSplit = inputSplit.value.asInstanceOf[FileSplit]
      // map_input_file is deprecated in favor of mapreduce_map_input_file but set both
      // since it's not removed yet
      Map("map_input_file" -> is.getPath().toString(),
        "mapreduce_map_input_file" -> is.getPath().toString())
    } else {
      Map()
    }
    envVars
  }
}
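The two pieces of logic in HadoopPartition can be tried out without Hadoop or Spark on the classpath. The sketch below is a stand-in, not Spark code: `HadoopPartitionSketch`, `partitionHash`, and `pipeEnvVars` are hypothetical names, and the `Option[String]` parameter is an assumption that replaces the `FileSplit` type check (`Some(path)` for a file-based split, `None` for any other split).

```scala
object HadoopPartitionSketch {
  // Same arithmetic as HadoopPartition.hashCode: mixes the RDD id with the index,
  // so partitions with the same index in different RDDs get different hashes.
  def partitionHash(rddId: Int, index: Int): Int = 31 * (31 + rddId) + index

  // Hypothetical stand-in for getPipeEnvVars: both the deprecated and the
  // current variable name are set to the same path for a file-based split.
  def pipeEnvVars(filePath: Option[String]): Map[String, String] = filePath match {
    case Some(p) => Map("map_input_file" -> p, "mapreduce_map_input_file" -> p)
    case None    => Map.empty
  }

  def main(args: Array[String]): Unit = {
    println(partitionHash(2, 3)) // 31 * 33 + 3 = 1026
    println(partitionHash(3, 3)) // 31 * 34 + 3 = 1057
    println(pipeEnvVars(Some("hdfs://example/input/part-00000"))) // hypothetical path
    println(pipeEnvVars(None))
  }
}
```

Unlike the base trait, where the hash is just the index, including `rddId` keeps partitions of different RDDs from colliding when they are used as hash keys.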