The difference between InputDStream and ReceiverInputDStream in Spark 2.0 Streaming
Source: Internet  Editor: 程序博客网  Date: 2024/05/21 06:30
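InputDStream is the abstract base class for all input streams. It provides the start() and stop() methods, which the Spark Streaming system calls to start and stop receiving data, respectively.
If an input stream can generate RDDs from new data by running a service or thread only on the driver node (that is, without running a receiver on worker nodes), it can extend InputDStream directly. For example, FileInputDStream, a subclass of InputDStream, monitors an HDFS directory from the driver and generates RDDs from any new files that appear.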
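A driver-only input stream of this kind can be sketched as follows. This is a minimal illustration, not how FileInputDStream itself is written: the class name `ConstantDriverInputDStream` and its `data` parameter are hypothetical, and the stream simply turns the same driver-side collection into an RDD for every batch.

```scala
import scala.reflect.ClassTag

import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{StreamingContext, Time}
import org.apache.spark.streaming.dstream.InputDStream

// Hypothetical example class: no receiver runs on the workers.
// Every batch, the driver builds an RDD from the same in-memory collection.
class ConstantDriverInputDStream[T: ClassTag](_ssc: StreamingContext, data: Seq[T])
  extends InputDStream[T](_ssc) {

  // Called by Spark Streaming when the streaming computation starts.
  // A real source would start its driver-side monitoring thread here.
  override def start(): Unit = {}

  // Called by Spark Streaming when the streaming computation stops.
  override def stop(): Unit = {}

  // Called on the driver once per batch interval to produce that batch's RDD.
  override def compute(validTime: Time): Option[RDD[T]] =
    Some(context.sparkContext.parallelize(data))
}
```

Because everything happens in `compute` on the driver, such a stream does not occupy an executor core for a long-running receiver task.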
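If the input stream needs to run a receiver on the worker nodes, use [[org.apache.spark.streaming.dstream.ReceiverInputDStream]] as the parent class.
ReceiverInputDStream is an abstract class for input streams that have to start a receiver on worker nodes to receive external data. A concrete implementation of ReceiverInputDStream must define the getReceiver method, which returns an object of type [[org.apache.spark.streaming.receiver.Receiver]]. That Receiver object is sent to the worker nodes to receive data.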
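The receiver-based variant can be sketched in the same way. The names `CounterReceiver` and `CounterInputDStream` are hypothetical and exist only for illustration; the receiver just stores an increasing counter instead of reading from a real external system.

```scala
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.dstream.ReceiverInputDStream
import org.apache.spark.streaming.receiver.Receiver

// Hypothetical receiver: runs on a worker node and stores an increasing counter.
class CounterReceiver(level: StorageLevel) extends Receiver[Long](level) {

  // onStart() must not block, so the receiving loop runs in its own thread.
  override def onStart(): Unit = {
    new Thread("counter-receiver") {
      setDaemon(true)
      override def run(): Unit = {
        var n = 0L
        while (!isStopped()) {
          store(n)          // hand one record to Spark for the current block
          n += 1
          Thread.sleep(100) // throttle the illustrative source
        }
      }
    }.start()
  }

  // Nothing to clean up: the loop above exits once isStopped() returns true.
  override def onStop(): Unit = {}
}

// Hypothetical input stream: the only method it has to define is getReceiver().
class CounterInputDStream(_ssc: StreamingContext, level: StorageLevel)
  extends ReceiverInputDStream[Long](_ssc) {

  // Called on the driver; the returned receiver is serialized and sent to a worker.
  override def getReceiver(): Receiver[Long] = new CounterReceiver(level)
}
```

For a one-off receiver it is usually enough to call `ssc.receiverStream(new CounterReceiver(StorageLevel.MEMORY_ONLY))`, which returns a ReceiverInputDStream without a dedicated subclass.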
/**
 * This is the abstract base class for all input streams. This class provides methods
 * start() and stop() which are called by Spark Streaming system to start and stop
 * receiving data, respectively.
 * Input streams that can generate RDDs from new data by running a service/thread only on
 * the driver node (that is, without running a receiver on worker nodes), can be
 * implemented by directly inheriting this InputDStream. For example,
 * FileInputDStream, a subclass of InputDStream, monitors a HDFS directory from the driver for
 * new files and generates RDDs with the new files. For implementing input streams
 * that requires running a receiver on the worker nodes, use
 * [[org.apache.spark.streaming.dstream.ReceiverInputDStream]] as the parent class.
 *
 * @param _ssc Streaming context that will execute this input stream
 */
abstract class InputDStream[T: ClassTag](_ssc: StreamingContext)
  extends DStream[T](_ssc) {
  // ... (class body not shown in this excerpt)
}
/**
 * Abstract class for defining any [[org.apache.spark.streaming.dstream.InputDStream]]
 * that has to start a receiver on worker nodes to receive external data.
 * Specific implementations of ReceiverInputDStream must
 * define [[getReceiver]] function that gets the receiver object of type
 * [[org.apache.spark.streaming.receiver.Receiver]] that will be sent
 * to the workers to receive data.
 * @param _ssc Streaming context that will execute this input stream
 * @tparam T Class type of the object of this stream
 */
abstract class ReceiverInputDStream[T: ClassTag](_ssc: StreamingContext)
  extends InputDStream[T](_ssc) {
  // ... (class body not shown in this excerpt)
}
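The same split shows up in the built-in sources on StreamingContext. The sketch below assumes a hypothetical directory `/tmp/incoming` and a hypothetical socket server on `localhost:9999`; the object name `BuiltInSources` is also made up for illustration.

```scala
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.dstream.{DStream, ReceiverInputDStream}

// Hypothetical helper that builds one stream of each kind.
object BuiltInSources {
  def build(ssc: StreamingContext): (DStream[String], ReceiverInputDStream[String]) = {
    // File source: backed by FileInputDStream, which watches the directory
    // from the driver. No receiver is started on the workers.
    val files: DStream[String] = ssc.textFileStream("/tmp/incoming")

    // Socket source: a ReceiverInputDStream whose receiver runs on a worker
    // node and reads lines from the given host and port.
    val lines: ReceiverInputDStream[String] = ssc.socketTextStream("localhost", 9999)

    (files, lines)
  }
}
```

One practical consequence is that each receiver-based stream permanently occupies one core, so when running locally with a receiver the master should be something like `local[2]` rather than `local` or `local[1]`, otherwise no core is left to process the received data.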