Hadoop Source Code Analysis: Hadoop RPC (RPC and Client)


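Let's start from an example of using Hadoop RPC.
First, define the interface that the server will publish for clients to call remotely. This interface must extend VersionedProtocol, just as an RMI remote interface must extend Remote. VersionedProtocol, however, declares a getProtocolVersion() method with two parameters: protocol, the name of the protocol interface, and clientVersion, the service version the client expects. The method returns the version of the interface implemented on the server, and it is used to check that both ends of the connection use the same version of the interface.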
public interface RpcCall extends VersionedProtocol {
     public String execute(String name) ;
}
A simple implementation of this interface looks like this:
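import java.io.IOException;

public class RpcCallImpl implements RpcCall {
  @Override
  public String execute(String name) {
    System.out.println("Received: " + name);
    return "success";
  }

  // Returns the server-side protocol version; the client in this example also passes 0.
  @Override
  public long getProtocolVersion(String protocol, long clientVersion)
      throws IOException {
    return 0;
  }
}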
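The version check described above can be sketched without Hadoop on the classpath. This is a minimal illustration only: `Versioned`, `VersionCheckSketch`, and `checkVersion` are names invented for this example and are not Hadoop APIs. It assumes the client simply compares the version it expects with the one the server reports and refuses to continue on a mismatch.

```java
import java.io.IOException;

// Hypothetical stand-in for VersionedProtocol; not Hadoop's real interface.
interface Versioned {
    long getProtocolVersion(String protocol, long clientVersion) throws IOException;
}

class VersionCheckSketch {
    // Returns normally when both sides agree on the version, otherwise throws.
    static void checkVersion(Versioned server, String protocol, long clientVersion)
            throws IOException {
        long serverVersion = server.getProtocolVersion(protocol, clientVersion);
        if (serverVersion != clientVersion) {
            throw new IOException("Protocol " + protocol + " version mismatch: client="
                    + clientVersion + ", server=" + serverVersion);
        }
    }

    public static void main(String[] args) throws IOException {
        // A server that always reports version 0, like RpcCallImpl in this post.
        Versioned server = (protocol, clientVersion) -> 0L;
        checkVersion(server, "RpcCall", 0L);          // versions match, no exception
        try {
            checkVersion(server, "RpcCall", 1L);      // client expects 1, server reports 0
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In the example from this post both sides use version 0, so a check of this kind would pass.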
Next, use Hadoop's IPC to set up a server that publishes this service (interface):
public static void main(String args[]) throws InterruptedException {
  try {
   Server server = RPC.getServer(new RpcCallImpl(), "localhost", 9999,
     new Configuration());
   server.start();

   Thread.sleep(60000);
  } catch (IOException e) {
   e.printStackTrace();
  }
}
Now the client side:
public static void main(String[] args) {
  try {
    // Create a proxy object; the proxy carries out the whole call
   RpcCall call = (RpcCall) RPC.getProxy(RpcCall.class, 0, new InetSocketAddress("localhost",9999), new Configuration());
   String abc = call.execute("Hello,World!"); // the call looks simple, but it hides all the details of talking to the server
   System.out.println("Client: " + abc); // the remote interface has been invoked and its return value received
   RPC.stopProxy(call);
  } catch (IOException e) {
   e.printStackTrace();
  }
}
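The code above completes a simple RPC call, and it looks much like Java RMI.
As shown above, the server side mainly calls RPC.getServer(new RpcCallImpl(), "localhost", 9999, new Configuration()). This method creates a Server object and stores the object to be published in the Server's instance field. The Server is then started, which mainly starts the Listener, Handler, and Responder. These three classes were covered earlier, so they are not repeated here. The rest of this post looks at the client-side call path.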

The client first creates a proxy object through RPC.getProxy(), which is implemented as follows:
public static VersionedProtocol getProxy(Class<?> protocol,
      long clientVersion, InetSocketAddress addr, UserGroupInformation ticket,
      Configuration conf, SocketFactory factory) throws IOException {

    VersionedProtocol proxy =
        (VersionedProtocol) Proxy.newProxyInstance( // create a proxy object for this interface
            protocol.getClassLoader(), new Class[] { protocol },
            new Invoker(addr, ticket, conf, factory)); // Invoker implements InvocationHandler
    ...
}
With Java dynamic proxies, every method called on the proxy object is routed to the InvocationHandler's invoke method. Here is that method:
public Object invoke(Object proxy, Method method, Object[] args)
      throws Throwable {
      ...

      ObjectWritable value = (ObjectWritable)
        client.call(new Invocation(method, args), address,
                    method.getDeclaringClass(), ticket);
     ...
      return value.get();
    }
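To see the dynamic-proxy routing in isolation, here is a small sketch that uses only the JDK. `Greeter` and `ProxySketch` are hypothetical names for this illustration; the handler just describes the call it received instead of sending it to a server as Hadoop's Invoker does.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.Arrays;

// Hypothetical interface used only for this illustration.
interface Greeter {
    String execute(String name);
}

class ProxySketch {
    // Builds a proxy whose handler reports the method name and arguments it was given.
    static Greeter newGreeterProxy() {
        InvocationHandler handler = new InvocationHandler() {
            @Override
            public Object invoke(Object proxy, Method method, Object[] args) {
                // A real RPC handler would serialize method + args and send them here.
                return method.getName() + Arrays.toString(args);
            }
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(), new Class<?>[] { Greeter.class }, handler);
    }

    public static void main(String[] args) {
        // prints "execute[Hello,World!]"
        System.out.println(newGreeterProxy().execute("Hello,World!"));
    }
}
```

The caller only sees the `Greeter` interface; everything that happens after `execute` is called is decided by the handler, which is what lets an RPC layer substitute a network round trip.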
So the call ultimately goes to client.call(), which uses the wait/notify mechanism to turn an asynchronous call into a synchronous one:
public Writable call(Writable param, InetSocketAddress addr,
                       Class<?> protocol, UserGroupInformation ticket)
                       throws InterruptedException, IOException {
    Call call = new Call(param);
    Connection connection = getConnection(addr, protocol, ticket, call);
    connection.sendParam(call);                 // send the call to the server by writing its data to the output stream of the Connection's socket
    boolean interrupted = false;
    synchronized (call) {
      while (!call.done) {
        try {
          call.wait();                           // wait for the call result to come back
        } catch (InterruptedException ie) {
          // save the fact that we were interrupted
          interrupted = true;
        }
      }

     ...
       } else {
        return call.value;
      }
    }
}
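The wait/notify handoff can be tried out on its own. The sketch below is a simplified, hypothetical miniature (`PendingCall` and `WaitNotifySketch` are invented names, not Hadoop classes): the caller blocks on the call object until another thread, standing in for the response reader, stores a value and notifies it.

```java
// Hypothetical miniature of a pending call: one thread waits, another completes it.
class PendingCall {
    private Object value;
    private boolean done;

    // Called by the responder thread: store the result and wake up the waiter.
    synchronized void setValue(Object v) {
        value = v;
        done = true;
        notify();
    }

    // Called by the caller thread: block until the result has been stored.
    synchronized Object await() throws InterruptedException {
        while (!done) {      // the loop guards against spurious wakeups
            wait();
        }
        return value;
    }
}

class WaitNotifySketch {
    // Starts a "responder" thread and blocks until it has delivered the result.
    static Object callAndWait(String param) throws InterruptedException {
        PendingCall call = new PendingCall();
        Thread responder = new Thread(() -> call.setValue("echo:" + param));
        responder.start();
        return call.await();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(callAndWait("Hello,World!"));   // prints "echo:Hello,World!"
    }
}
```

Checking the `done` flag inside the loop matters: if the responder finishes before the caller starts waiting, the caller sees `done == true` and never calls `wait()` at all.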
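The code above shows that the details of communicating with the server are encapsulated in Connection. Note that the client-side Connection (a thread whose run method loops reading responses from the server) is different from the server-side Connection (which is not a thread). All Connection objects are stored in a hash table, connections, keyed by ConnectionId (made up of address, ticket, and protocol), so requests with the same ConnectionId reuse one Connection and no unnecessary connections are created.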
private Connection getConnection(InetSocketAddress addr,
                                   Class<?> protocol,
                                   UserGroupInformation ticket,
                                   Call call)
                                   throws IOException {
    ...
    Connection connection;
    ConnectionId remoteId = new ConnectionId(addr, protocol, ticket);
    do {
      synchronized (connections) {
        connection = connections.get(remoteId);
        if (connection == null) {
          connection = new Connection(remoteId);
          connections.put(remoteId, connection);
        }
      }
    } while (!connection.addCall(call));
   ...
    connection.setupIOstreams(); // initialize the socket's input and output streams (the author's open question: why is NIO not used here?)
    return connection;
}
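The reuse of connections by key can be illustrated with a small cache. This is a sketch under stated assumptions: `ConnKey` and `ConnectionCacheSketch` are hypothetical, the three key fields are plain strings rather than Hadoop's real types, and a bare `Object` stands in for a connection. The point is only that a composite key with proper `equals`/`hashCode` makes equal requests share one entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical composite key: address, protocol, and user together identify a connection.
final class ConnKey {
    final String address;
    final String protocol;
    final String user;

    ConnKey(String address, String protocol, String user) {
        this.address = address;
        this.protocol = protocol;
        this.user = user;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ConnKey)) return false;
        ConnKey k = (ConnKey) o;
        return address.equals(k.address) && protocol.equals(k.protocol)
                && Objects.equals(user, k.user);
    }

    @Override
    public int hashCode() {
        return Objects.hash(address, protocol, user);
    }
}

class ConnectionCacheSketch {
    private final Map<ConnKey, Object> connections = new HashMap<>();
    private int created;   // how many connections were actually created

    // Returns the cached connection for this key, creating it on first use.
    Object getConnection(ConnKey key) {
        synchronized (connections) {
            Object connection = connections.get(key);
            if (connection == null) {
                connection = new Object();   // stand-in for opening a real connection
                connections.put(key, connection);
                created++;
            }
            return connection;
        }
    }

    int createdCount() {
        synchronized (connections) {
            return created;
        }
    }

    public static void main(String[] args) {
        ConnectionCacheSketch cache = new ConnectionCacheSketch();
        Object a = cache.getConnection(new ConnKey("localhost:9999", "RpcCall", "alice"));
        Object b = cache.getConnection(new ConnKey("localhost:9999", "RpcCall", "alice"));
        Object c = cache.getConnection(new ConnKey("localhost:9999", "RpcCall", "bob"));
        System.out.println((a == b) + " " + (a == c) + " " + cache.createdCount());
    }
}
```

Two lookups with equal keys return the same object, while a different user yields a separate entry.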

Next comes the run method of the Connection thread. It mainly calls receiveResponse() over and over, so let's look at that method:
private void receiveResponse() {
      ...
      touch();
      try {
        int id = in.readInt();                    // try to read an id

        ...

        Call call = calls.get(id);

        int state = in.readInt();     // read call status
        if (state == Status.SUCCESS.state) {
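          Writable value = ReflectionUtils.newInstance(valueClass, conf); // create a Writable of the configured class; the default is ObjectWritable
          value.readFields(in);                 // read the data sent by the server
          call.setValue(value);                // mark the call as complete; this ends up calling call.notify() so the waiting caller can continue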
          calls.remove(id);
        } else if (state == Status.ERROR.state) {
          call.setException(new RemoteException(WritableUtils.readString(in),
                                                WritableUtils.readString(in)));
        } else if (state == Status.FATAL.state) {
          // Close the connection
          markClosed(new RemoteException(WritableUtils.readString(in),
                                         WritableUtils.readString(in)));
        }
      } catch (IOException e) {
        markClosed(e);
      }
    }
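The id-then-status-then-value reading order can be mimicked with plain JDK streams. The following is a rough sketch with an invented wire format (an int id, an int status, and a UTF string payload), so it is not Hadoop's actual encoding, and `ResponseDemuxSketch` is a hypothetical name. It shows why each response carries the call id: responses may arrive in a different order than the requests were sent.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

class ResponseDemuxSketch {
    static final int SUCCESS = 0;
    static final int ERROR = 1;

    // Encodes one response frame: call id, status, then the payload.
    static void writeResponse(DataOutputStream out, int id, int status, String payload)
            throws IOException {
        out.writeInt(id);
        out.writeInt(status);
        out.writeUTF(payload);
    }

    // Reads one frame and files the outcome under the id of the call it answers.
    static void receiveResponse(DataInputStream in, Map<Integer, String> results)
            throws IOException {
        int id = in.readInt();          // which pending call this answers
        int status = in.readInt();      // success or error
        String payload = in.readUTF();
        results.put(id, status == SUCCESS ? payload : "error: " + payload);
    }

    // Writes two responses out of order into a buffer and reads them back.
    static Map<Integer, String> demo() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        writeResponse(out, 2, SUCCESS, "second");
        writeResponse(out, 1, ERROR, "boom");

        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray()));
        Map<Integer, String> results = new HashMap<>();
        receiveResponse(in, results);
        receiveResponse(in, results);
        return results;
    }

    public static void main(String[] args) throws IOException {
        Map<Integer, String> results = demo();
        System.out.println(results.get(1));   // prints "error: boom"
        System.out.println(results.get(2));   // prints "second"
    }
}
```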
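That completes one Hadoop RPC call. The source code shows that each Hadoop object has a clear, well-separated responsibility, and many of the design approaches and design patterns used here are worth learning from.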

0 0