Spark Source Code Study Notes 5: RpcEnv (the RPC Abstraction Layer)


Following note 4 (SparkEnv), let's take a closer look at RpcEnv, a core component that appears in SparkEnv.
First, look at the RpcEnv.scala source file.
It mainly contains the RpcEnv companion object and companion class. The companion class is abstract and declares the interface of the RPC framework; the companion object provides static create methods that delegate to the concrete RpcEnv factory (NettyRpcEnvFactory) to build an RpcEnv. The RPC framework in current Spark (I am using version 2.1.0) is implemented with Netty; the Akka implementation has been removed.

package org.apache.spark.rpc

......

/**
 * A RpcEnv implementation must have a [[RpcEnvFactory]] implementation with an empty constructor
 * so that it can be created via Reflection.
 */
private[spark] object RpcEnv {

  def create(
      name: String,
      host: String,
      port: Int,
      conf: SparkConf,
      securityManager: SecurityManager,
      clientMode: Boolean = false): RpcEnv = {
    create(name, host, host, port, conf, securityManager, clientMode)
  }

  def create(
      name: String,
      bindAddress: String,
      advertiseAddress: String,
      port: Int,
      conf: SparkConf,
      securityManager: SecurityManager,
      clientMode: Boolean): RpcEnv = {
    val config = RpcEnvConfig(conf, name, bindAddress, advertiseAddress, port, securityManager,
      clientMode)
    new NettyRpcEnvFactory().create(config)
  }
}

/**
 * An RPC environment. [[RpcEndpoint]]s need to register itself with a name to [[RpcEnv]] to
 * receives messages. Then [[RpcEnv]] will process messages sent from [[RpcEndpointRef]] or remote
 * nodes, and deliver them to corresponding [[RpcEndpoint]]s. For uncaught exceptions caught by
 * [[RpcEnv]], [[RpcEnv]] will use [[RpcCallContext.sendFailure]] to send exceptions back to the
 * sender, or logging them if no such sender or `NotSerializableException`.
 *
 * [[RpcEnv]] also provides some methods to retrieve [[RpcEndpointRef]]s given name or uri.
 */
private[spark] abstract class RpcEnv(conf: SparkConf) {

  private[spark] val defaultLookupTimeout = RpcUtils.lookupRpcTimeout(conf)

  /**
   * Return RpcEndpointRef of the registered [[RpcEndpoint]]. Will be used to implement
   * [[RpcEndpoint.self]]. Return `null` if the corresponding [[RpcEndpointRef]] does not exist.
   */
  private[rpc] def endpointRef(endpoint: RpcEndpoint): RpcEndpointRef

  /**
   * Return the address that [[RpcEnv]] is listening to.
   */
  def address: RpcAddress

  /**
   * Register a [[RpcEndpoint]] with a name and return its [[RpcEndpointRef]]. [[RpcEnv]] does not
   * guarantee thread-safety.
   */
  def setupEndpoint(name: String, endpoint: RpcEndpoint): RpcEndpointRef

  /**
   * Retrieve the [[RpcEndpointRef]] represented by `uri` asynchronously.
   */
  def asyncSetupEndpointRefByURI(uri: String): Future[RpcEndpointRef]

  /**
   * Retrieve the [[RpcEndpointRef]] represented by `uri`. This is a blocking action.
   */
  def setupEndpointRefByURI(uri: String): RpcEndpointRef = {
    defaultLookupTimeout.awaitResult(asyncSetupEndpointRefByURI(uri))
  }

  /**
   * Retrieve the [[RpcEndpointRef]] represented by `address` and `endpointName`.
   * This is a blocking action.
   */
  def setupEndpointRef(address: RpcAddress, endpointName: String): RpcEndpointRef = {
    setupEndpointRefByURI(RpcEndpointAddress(address, endpointName).toString)
  }

  /**
   * Stop [[RpcEndpoint]] specified by `endpoint`.
   */
  def stop(endpoint: RpcEndpointRef): Unit

  /**
   * Shutdown this [[RpcEnv]] asynchronously. If need to make sure [[RpcEnv]] exits successfully,
   * call [[awaitTermination()]] straight after [[shutdown()]].
   */
  def shutdown(): Unit

  /**
   * Wait until [[RpcEnv]] exits.
   *
   * TODO do we need a timeout parameter?
   */
  def awaitTermination(): Unit

  /**
   * [[RpcEndpointRef]] cannot be deserialized without [[RpcEnv]]. So when deserializing any object
   * that contains [[RpcEndpointRef]]s, the deserialization codes should be wrapped by this method.
   */
  def deserialize[T](deserializationAction: () => T): T

  /**
   * Return the instance of the file server used to serve files. This may be `null` if the
   * RpcEnv is not operating in server mode.
   */
  def fileServer: RpcEnvFileServer

  /**
   * Open a channel to download a file from the given URI. If the URIs returned by the
   * RpcEnvFileServer use the "spark" scheme, this method will be called by the Utils class to
   * retrieve the files.
   *
   * @param uri URI with location of the file.
   */
  def openChannel(uri: String): ReadableByteChannel
}
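To make the create entry point concrete, here is a minimal bootstrap sketch, modeled loosely on what SparkEnv does. Everything in it is illustrative: the object name, the endpoint-environment name "sparkDriver", the host, and the port are my own choices, and because these classes are private[spark], such code must live inside the org.apache.spark namespace.

// Minimal bootstrap sketch, assuming Spark 2.1.0. RpcEnv and SecurityManager are
// private[spark], so this file must sit in the org.apache.spark package, just as
// SparkEnv's own bootstrap does. Name, host and port are illustrative.
package org.apache.spark

import org.apache.spark.rpc.RpcEnv

object RpcEnvBootstrapSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    val securityManager = new SecurityManager(conf)
    // First create overload: `host` doubles as bind and advertise address.
    val rpcEnv = RpcEnv.create("sparkDriver", "localhost", 7077, conf, securityManager)
    // ... setupEndpoint(...) registrations would go here ...
    rpcEnv.shutdown()
    rpcEnv.awaitTermination()
  }
}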

Besides the RpcEnv companion object and companion class, RpcEnv.scala also contains the case class RpcEnvConfig and the trait RpcEnvFileServer. An RpcEnvConfig instance is passed to the concrete RpcEnvFactory's create function to build the RpcEnv instance:

private[spark] case class RpcEnvConfig(
    conf: SparkConf,
    name: String,
    bindAddress: String,
    advertiseAddress: String,
    port: Int,
    securityManager: SecurityManager,
    clientMode: Boolean)

/**
 * A server used by the RpcEnv to server files to other processes owned by the application.
 *
 * The file server can return URIs handled by common libraries (such as "http" or "hdfs"), or
 * it can return "spark" URIs which will be handled by `RpcEnv#fetchFile`.
 */
private[spark] trait RpcEnvFileServer {

  /**
   * Adds a file to be served by this RpcEnv. This is used to serve files from the driver
   * to executors when they're stored on the driver's local file system.
   * ......
   */
  def addFile(file: File): String

  /**
   * Adds a jar to be served by this RpcEnv.
   * ......
   */
  def addJar(file: File): String

  /**
   * Adds a local directory to be served via this file server.
   * ......
   */
  def addDirectory(baseUri: String, path: File): String

  /** Validates and normalizes the base URI for directories. */
  protected def validateDirectoryUri(baseUri: String): String = {
    val fixedBaseUri = "/" + baseUri.stripPrefix("/").stripSuffix("/")
    require(fixedBaseUri != "/files" && fixedBaseUri != "/jars",
      "Directory URI cannot be /files nor /jars.")
    fixedBaseUri
  }
}
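As a hedged illustration of the file-server side, continuing the bootstrap sketch above: the driver serves a local file through fileServer and the returned URI is what executors later fetch, which is roughly what SparkContext.addFile does internally. The file path below is made up for the example.

// Continuing the bootstrap sketch: serving a driver-local file. The path is
// illustrative; fileServer may be null when the RpcEnv is not in server mode.
import java.io.File
import org.apache.spark.rpc.RpcEnvFileServer

val server: RpcEnvFileServer = rpcEnv.fileServer
if (server != null) {
  val uri: String = server.addFile(new File("/tmp/app.conf"))
  // Executors can later fetch the file; "spark"-scheme URIs are resolved
  // through RpcEnv.openChannel on the receiving side.
  println(s"serving file at $uri")
}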

RpcEnv.scala already references the classes NettyRpcEnvFactory, RpcEndpoint, and RpcEndpointRef. NettyRpcEnvFactory belongs to the implementation layer of the RPC framework, while RpcEndpoint, RpcEndpointRef, and RpcEnv are all abstraction-layer classes; in this note we focus on the abstraction layer.

RpcEndpoint is defined in RpcEndpoint.scala, which also contains the definition of the trait RpcEnvFactory:

/**
 * A factory class to create the [[RpcEnv]]. It must have an empty constructor so that it can be
 * created using Reflection.
 */
private[spark] trait RpcEnvFactory {

  def create(config: RpcEnvConfig): RpcEnv
}
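Since the contract is a single method, a factory is tiny; NettyRpcEnvFactory is the only real implementation in Spark 2.1.0. A hypothetical sketch, with the body deliberately left unimplemented:

// Hypothetical sketch of an RpcEnvFactory (MyRpcEnvFactory is not a real Spark
// class; the real one is NettyRpcEnvFactory). It must compile inside the
// org.apache.spark namespace, and the no-arg constructor matters because the
// factory is instantiated via Reflection.
import org.apache.spark.rpc.{RpcEnv, RpcEnvConfig, RpcEnvFactory}

private[spark] class MyRpcEnvFactory extends RpcEnvFactory {
  override def create(config: RpcEnvConfig): RpcEnv = {
    // A real implementation would bind a server to config.bindAddress:config.port
    // (unless config.clientMode) and advertise config.advertiseAddress.
    throw new UnsupportedOperationException("sketch only")
  }
}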

The main content of the file is the trait RpcEndpoint. An RpcEndpoint is an endpoint of RPC communication, and the trait supplies default implementations of its lifecycle and message-handling hooks: starting up, receiving messages, stopping, and reacting to connection state changes (a usage sketch follows the source below):

/**
 * An end point for the RPC that defines what functions to trigger given a message.
 *
 * It is guaranteed that `onStart`, `receive` and `onStop` will be called in sequence.
 *
 * The life-cycle of an endpoint is:
 *
 * constructor -> onStart -> receive* -> onStop
 *
 * Note: `receive` can be called concurrently. If you want `receive` to be thread-safe, please use
 * [[ThreadSafeRpcEndpoint]]
 *
 * If any error is thrown from one of [[RpcEndpoint]] methods except `onError`, `onError` will be
 * invoked with the cause. If `onError` throws an error, [[RpcEnv]] will ignore it.
 */
private[spark] trait RpcEndpoint {

  /**
   * The [[RpcEnv]] that this [[RpcEndpoint]] is registered to.
   */
  val rpcEnv: RpcEnv

  /**
   * The [[RpcEndpointRef]] of this [[RpcEndpoint]]. `self` will become valid when `onStart` is
   * called. And `self` will become `null` when `onStop` is called.
   *
   * Note: Because before `onStart`, [[RpcEndpoint]] has not yet been registered and there is not
   * valid [[RpcEndpointRef]] for it. So don't call `self` before `onStart` is called.
   */
  final def self: RpcEndpointRef = {
    require(rpcEnv != null, "rpcEnv has not been initialized")
    rpcEnv.endpointRef(this)
  }

  /**
   * Process messages from [[RpcEndpointRef.send]] or [[RpcCallContext.reply)]]. If receiving a
   * unmatched message, [[SparkException]] will be thrown and sent to `onError`.
   */
  def receive: PartialFunction[Any, Unit] = {
    case _ => throw new SparkException(self + " does not implement 'receive'")
  }

  /**
   * Process messages from [[RpcEndpointRef.ask]]. If receiving a unmatched message,
   * [[SparkException]] will be thrown and sent to `onError`.
   */
  def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
    case _ => context.sendFailure(new SparkException(self + " won't reply anything"))
  }

  /**
   * Invoked when any exception is thrown during handling messages.
   */
  def onError(cause: Throwable): Unit = {
    // By default, throw e and let RpcEnv handle it
    throw cause
  }

  /**
   * Invoked when `remoteAddress` is connected to the current node.
   */
  def onConnected(remoteAddress: RpcAddress): Unit = {
    // By default, do nothing.
  }

  /**
   * Invoked when `remoteAddress` is lost.
   */
  def onDisconnected(remoteAddress: RpcAddress): Unit = {
    // By default, do nothing.
  }

  /**
   * Invoked when some network error happens in the connection between the current node and
   * `remoteAddress`.
   */
  def onNetworkError(cause: Throwable, remoteAddress: RpcAddress): Unit = {
    // By default, do nothing.
  }

  /**
   * Invoked before [[RpcEndpoint]] starts to handle any message.
   */
  def onStart(): Unit = {
    // By default, do nothing.
  }

  /**
   * Invoked when [[RpcEndpoint]] is stopping. `self` will be `null` in this method and you cannot
   * use it to send or ask messages.
   */
  def onStop(): Unit = {
    // By default, do nothing.
  }

  /**
   * A convenient method to stop [[RpcEndpoint]].
   */
  final def stop(): Unit = {
    val _self = self
    if (_self != null) {
      rpcEnv.stop(_self)
    }
  }
}

/**
 * A trait that requires RpcEnv thread-safely sending messages to it.
 * ......
 */
private[spark] trait ThreadSafeRpcEndpoint extends RpcEndpoint
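To see these hooks in action, here is a minimal hypothetical endpoint (the class name and messages are mine, not Spark's). receive handles one-way messages arriving from RpcEndpointRef.send, while receiveAndReply answers ask requests through the RpcCallContext:

// Hypothetical endpoint sketch (EchoEndpoint is an illustrative name); like the
// other sketches it must sit inside the org.apache.spark namespace.
import org.apache.spark.rpc.{RpcCallContext, RpcEndpoint, RpcEnv}

private[spark] class EchoEndpoint(override val rpcEnv: RpcEnv) extends RpcEndpoint {

  override def onStart(): Unit = {
    // The endpoint is registered by now, so `self` is valid from here on.
  }

  // One-way messages arriving via RpcEndpointRef.send.
  override def receive: PartialFunction[Any, Unit] = {
    case msg: String => println(s"received: $msg")
  }

  // Request/reply messages arriving via RpcEndpointRef.ask.
  override def receiveAndReply(context: RpcCallContext): PartialFunction[Any, Unit] = {
    case msg: String => context.reply(s"echo: $msg")
  }

  override def onStop(): Unit = {
    // `self` is already null here; only cleanup belongs in this hook.
  }
}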

RpcEndpointRef is defined in RpcEndpointRef.scala and is the only class in that file. It is an abstract class representing a reference to a remote RpcEndpoint and exposes the methods used to communicate with that endpoint. Like RpcEndpoint, it is an abstraction-layer class of the RPC framework:

package org.apache.spark.rpc

......

/**
 * A reference for a remote [[RpcEndpoint]]. [[RpcEndpointRef]] is thread-safe.
 */
private[spark] abstract class RpcEndpointRef(conf: SparkConf)
  extends Serializable with Logging {

  private[this] val maxRetries = RpcUtils.numRetries(conf)
  private[this] val retryWaitMs = RpcUtils.retryWaitMs(conf)
  private[this] val defaultAskTimeout = RpcUtils.askRpcTimeout(conf)

  /**
   * return the address for the [[RpcEndpointRef]]
   */
  def address: RpcAddress

  def name: String

  /**
   * Sends a one-way asynchronous message. Fire-and-forget semantics.
   */
  def send(message: Any): Unit

  /**
   * Send a message to the corresponding [[RpcEndpoint.receiveAndReply)]] and return a [[Future]] to
   * receive the reply within the specified timeout.
   *
   * This method only sends the message once and never retries.
   */
  def ask[T: ClassTag](message: Any, timeout: RpcTimeout): Future[T]

  /**
   * Send a message to the corresponding [[RpcEndpoint.receiveAndReply)]] and return a [[Future]] to
   * receive the reply within a default timeout.
   *
   * This method only sends the message once and never retries.
   */
  def ask[T: ClassTag](message: Any): Future[T] = ask(message, defaultAskTimeout)

  /**
   * Send a message to the corresponding [[RpcEndpoint]] and get its result within a default
   * timeout, or throw a SparkException if this fails even after the default number of retries.
   * The default `timeout` will be used in every trial of calling `sendWithReply`. Because this
   * method retries, the message handling in the receiver side should be idempotent.
   *
   * Note: this is a blocking action which may cost a lot of time, so don't call it in a message
   * loop of [[RpcEndpoint]].
   *
   * @param message the message to send
   * @tparam T type of the reply message
   * @return the reply message from the corresponding [[RpcEndpoint]]
   */
  def askWithRetry[T: ClassTag](message: Any): T = askWithRetry(message, defaultAskTimeout)

  /**
   * Send a message to the corresponding [[RpcEndpoint.receive]] and get its result within a
   * specified timeout, throw a SparkException if this fails even after the specified number of
   * retries. `timeout` will be used in every trial of calling `sendWithReply`. Because this method
   * retries, the message handling in the receiver side should be idempotent.
   *
   * Note: this is a blocking action which may cost a lot of time, so don't call it in a message
   * loop of [[RpcEndpoint]].
   *
   * @param message the message to send
   * @param timeout the timeout duration
   * @tparam T type of the reply message
   * @return the reply message from the corresponding [[RpcEndpoint]]
   */
  def askWithRetry[T: ClassTag](message: Any, timeout: RpcTimeout): T = {
    // TODO: Consider removing multiple attempts
    var attempts = 0
    var lastException: Exception = null
    while (attempts < maxRetries) {
      attempts += 1
      try {
        val future = ask[T](message, timeout)
        val result = timeout.awaitResult(future)
        if (result == null) {
          throw new SparkException("RpcEndpoint returned null")
        }
        return result
      } catch {
        case ie: InterruptedException => throw ie
        case e: Exception =>
          lastException = e
          logWarning(s"Error sending message [message = $message] in $attempts attempts", e)
      }
      if (attempts < maxRetries) {
        Thread.sleep(retryWaitMs)
      }
    }

    throw new SparkException(
      s"Error sending message [message = $message]", lastException)
  }
}
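Tying the two abstractions together, a hedged usage sketch that reuses the hypothetical EchoEndpoint and the rpcEnv from the earlier sketches:

// Continuing the earlier sketches (REPL-style; names are illustrative).
import scala.concurrent.Future
import org.apache.spark.rpc.RpcEndpointRef

val echoRef: RpcEndpointRef = rpcEnv.setupEndpoint("echo", new EchoEndpoint(rpcEnv))

echoRef.send("fire and forget")                        // one-way, no reply expected
val replyF: Future[String] = echoRef.ask[String]("hi") // async, sent once, no retries
val reply: String = echoRef.askWithRetry[String]("hi") // blocking with retries, so the
                                                       // handler must be idempotent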