Spark RPC Reading Notes (Draft)


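These notes are based on version 2.0.0-SNAPSHOT of the code. In the latest Spark source, RPC uses Netty as its single transport framework. It is built mainly from RpcEnv, RpcEndpoint, and RpcEndpointRef, whose relationship is shown in the following diagram: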
[Figure: relationship between RpcEnv, RpcEndpoint, and RpcEndpointRef]

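An RpcEnv is created through an RpcEnvFactory. RpcEnv can be thought of as a container: every RpcEndpoint must register with an RpcEnv and obtains a corresponding RpcEndpointRef in return. Messages are then sent through the RpcEndpointRef to the RpcEnv, which finds the matching RpcEndpoint so that it can process the message and respond. The three classes are introduced briefly below: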
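To make the register-then-route flow concrete, here is a minimal sketch of my own. It is a toy model, not Spark's implementation: ToyEnv, ToyEndpoint, and ToyEndpointRef are hypothetical names, everything runs synchronously in one JVM, and no Netty transport is involved.

```scala
import scala.collection.mutable

// Toy model only: the Toy* names are hypothetical and are not Spark classes.
object ToyRpcRegistry {
  // An endpoint reacts to a message and returns a reply.
  trait ToyEndpoint {
    def receive(message: Any): Any
  }

  // A ref knows only the endpoint's name and the env that owns it.
  final class ToyEndpointRef(val name: String, env: ToyEnv) {
    def ask(message: Any): Any = env.dispatch(name, message)
  }

  // The env is the container: it registers endpoints and routes messages.
  final class ToyEnv {
    private val endpoints = mutable.Map.empty[String, ToyEndpoint]

    // Registers an endpoint under a name and hands back its ref.
    def setupEndpoint(name: String, endpoint: ToyEndpoint): ToyEndpointRef = {
      require(!endpoints.contains(name), s"endpoint $name is already registered")
      endpoints(name) = endpoint
      new ToyEndpointRef(name, this)
    }

    // Looks up the endpoint by name and lets it handle the message.
    def dispatch(name: String, message: Any): Any =
      endpoints.get(name) match {
        case Some(endpoint) => endpoint.receive(message)
        case None => throw new NoSuchElementException(s"no endpoint named $name")
      }
  }

  // Registers an echo endpoint and sends one message through its ref.
  def demo(): Any = {
    val env = new ToyEnv
    val ref = env.setupEndpoint("echo", new ToyEndpoint {
      def receive(message: Any): Any = s"echo: $message"
    })
    ref.ask("hello")
  }
}
```

The point of the sketch is that the caller only ever holds the ref. The endpoint object itself stays inside the env, which is what would let a real implementation place the endpoint on another node.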

RpcEnv

The class comment for RpcEnv reads:
An RPC environment. RpcEndpoints need to register itself with a name to RpcEnv to receives messages. Then RpcEnv will process messages sent from RpcEndpointRef or remote nodes, and deliver them to corresponding RpcEndpoints. For uncaught exceptions caught by RpcEnv, RpcEnv will use RpcCallContext.sendFailure to send exceptions back to the sender, or logging them if no such sender or NotSerializableException. RpcEnv also provides some methods to retrieve RpcEndpointRefs given name or uri.

This says roughly what I summarized above. When an error occurs, RpcEnv either sends it back to the sender or logs it.

The registration method:

/**
 * Register a RpcEndpoint with a name and return its RpcEndpointRef. RpcEnv does not
 * guarantee thread-safety.
 */
def setupEndpoint(name: String, endpoint: RpcEndpoint): RpcEndpointRef

It registers an RpcEndpoint with the RpcEnv and returns the corresponding RpcEndpointRef.

The class also has several methods for obtaining the RpcEndpointRef of an existing endpoint.

RpcEndpoint

The class comment for RpcEndpoint reads:
An end point for the RPC that defines what functions to trigger given a message. It is guaranteed that onStart, receive and onStop will be called in sequence. The life-cycle of an endpoint is: constructor -> onStart -> receive -> onStop

Note: receive can be called concurrently. If you want receive to be thread-safe, please use ThreadSafeRpcEndpoint. If any error is thrown from one of RpcEndpoint methods except onError, onError will be invoked with the cause. If onError throws an error, RpcEnv will ignore it.

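RpcEndpoint defines a set of methods that react to received messages. Its internal life cycle is constructor -> onStart -> receive -> onStop. The main processing logic lives in receive, which inspects each incoming message and calls the corresponding handler.
Because receive can be called concurrently, Spark provides ThreadSafeRpcEndpoint as a thread-safe variant.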
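The life cycle and the one-message-at-a-time idea can be sketched as follows. This is a toy model under my own assumptions: ToyEndpoint, RecordingEndpoint, and runLifecycle are hypothetical names, and the single-threaded executor is just one simple way to serialize callbacks, not a description of how Spark's dispatcher works. The only thing borrowed from Spark is the shape of receive as a PartialFunction[Any, Unit].

```scala
import java.util.concurrent.{Executors, TimeUnit}
import scala.collection.mutable.ArrayBuffer

// Toy model only: these names are hypothetical and are not Spark classes.
object ToyLifecycle {
  trait ToyEndpoint {
    def onStart(): Unit = ()
    def receive: PartialFunction[Any, Unit]
    def onStop(): Unit = ()
  }

  // Records each callback so the order can be inspected afterwards.
  final class RecordingEndpoint extends ToyEndpoint {
    val events = ArrayBuffer.empty[String]
    override def onStart(): Unit = events += "onStart"
    override def receive: PartialFunction[Any, Unit] = {
      case text: String => events += s"receive($text)"
      case n: Int => events += s"receive($n)"
    }
    override def onStop(): Unit = events += "onStop"
  }

  // Drives onStart, every message, then onStop through a single-threaded
  // inbox, so no two callbacks of the endpoint ever run at the same time.
  def runLifecycle(endpoint: ToyEndpoint, messages: Seq[Any]): Unit = {
    val inbox = Executors.newSingleThreadExecutor()
    def submit(body: => Unit): Unit =
      inbox.execute(new Runnable { def run(): Unit = body })

    submit(endpoint.onStart())
    // Messages the endpoint does not handle are silently dropped here.
    messages.foreach(m => submit(endpoint.receive.applyOrElse(m, (_: Any) => ())))
    submit(endpoint.onStop())

    inbox.shutdown()
    inbox.awaitTermination(5, TimeUnit.SECONDS)
  }

  def demo(): List[String] = {
    val endpoint = new RecordingEndpoint
    runLifecycle(endpoint, Seq("a", 1))
    endpoint.events.toList
  }
}
```

With the inbox serialized like this, RecordingEndpoint can use a plain ArrayBuffer without locking. An endpoint whose receive may run concurrently would have to protect such state itself.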

RpcEndpointRef

The class comment for RpcEndpointRef reads:
A reference for a remote RpcEndpoint. RpcEndpointRef is thread-safe.

It can be thought of as a remote reference to an RpcEndpoint. Its methods are mainly for sending messages.
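As a rough illustration of what "sending messages" through a ref can look like, the sketch below contrasts a fire-and-forget send with a request that expects a reply, and also shows a failure travelling back to the sender. It is a toy model, not Spark's API: ToyEndpointRef, ToyCallContext, and Counter are hypothetical names, the signatures are simplified, and the call is made in-process instead of over the network.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._

// Toy model only: these names are hypothetical and are not Spark classes.
object ToyRefMessaging {
  // Lets a handler answer the caller with either a value or a failure.
  final class ToyCallContext(promise: Promise[Any]) {
    def reply(response: Any): Unit = promise.trySuccess(response)
    def sendFailure(e: Throwable): Unit = promise.tryFailure(e)
  }

  trait ToyEndpoint {
    // Handles one-way messages.
    def receive(message: Any): Unit = ()
    // Handles messages whose sender is waiting for an answer.
    def receiveAndReply(message: Any, context: ToyCallContext): Unit
  }

  final class ToyEndpointRef(endpoint: ToyEndpoint) {
    // Fire-and-forget: the caller gets nothing back.
    def send(message: Any): Unit = endpoint.receive(message)

    // Request/reply: the caller gets a Future completed by the handler.
    def ask(message: Any): Future[Any] = {
      val promise = Promise[Any]()
      val context = new ToyCallContext(promise)
      // An exception thrown by the handler is sent back to the sender.
      try endpoint.receiveAndReply(message, context)
      catch { case e: Exception => context.sendFailure(e) }
      promise.future
    }
  }

  final class Counter extends ToyEndpoint {
    private var count = 0
    override def receive(message: Any): Unit = message match {
      case "increment" => count += 1
      case _ => ()
    }
    def receiveAndReply(message: Any, context: ToyCallContext): Unit = message match {
      case "get" => context.reply(count)
      case other => throw new IllegalArgumentException(s"unexpected message: $other")
    }
  }

  // Returns the counter value and whether the bogus request failed.
  def demo(): (Any, Boolean) = {
    val ref = new ToyEndpointRef(new Counter)
    ref.send("increment")
    ref.send("increment")
    val count = Await.result(ref.ask("get"), 1.second)
    val failed = Await.ready(ref.ask("bogus"), 1.second).value.exists(_.isFailure)
    (count, failed)
  }
}
```

Here the caller of ask never sees the exception thrown directly. It only sees a failed Future, which is the same idea as the RpcEnv comment's description of sending exceptions back to the sender.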
