Hadoop RPC: An Analysis of the NIO-Based Server Implementation

For an overview of how Hadoop's RPC is implemented, see:

http://blog.csdn.net/xhh198781/article/details/7268298  The RPC implementation in Hadoop: the client-side communication component
http://blog.csdn.net/xhh198781/article/details/7280084  The RPC implementation in Hadoop: the server-side communication component

The RPC code extracted from Hadoop is available at: http://download.csdn.net/detail/lzlchangqi/8182999

For the NIO usage itself, we study it alongside this article:

http://book.51cto.com/art/201312/422046.htm

Acceptor: accepts connections from clients, creates a Handler for each client, and registers that Handler with the Reactor.
Handler: the entity that communicates with one client and carries the business logic through a fixed sequence of steps. A Handler is often further decomposed internally to abstract stages such as read, decode, compute, encode, and send. In the Reactor pattern the business logic is broken up by scattered I/O events, so a Handler needs a mechanism to save its context when the available data is incomplete (only half read) and to resume the interrupted processing when the next I/O event arrives (the other half becomes readable).
Reader/Sender: to speed up processing, a Reactor implementation usually maintains a thread pool for data processing, so that data can be handed to the pool for further work as soon as it has been read. To this end, the Reactor pattern generally separates the read and write halves of a Handler, registers them as distinct read and write events, and services them with dedicated Reader and Sender threads.
        ipc.Server implements a textbook Reactor design; its overall architecture matches the description above exactly. Once you understand the canonical Reactor architecture, the design and implementation of ipc.Server are easy to follow. Next, we examine ipc.Server's implementation details.
As mentioned earlier, ipc.Server's main job is to receive RPC requests from clients, invoke the corresponding functions to obtain results, and return those results to the clients. To do so, ipc.Server is divided into three stages: receiving requests, processing requests, and returning results, as shown in the figure (from 51CTO). The stages work as follows:

(1) Receiving requests. This stage receives the RPC requests from the clients, wraps each one in a fixed format (the Call class), and places it on a shared queue (callQueue) for later processing. The stage is itself split into two sub-stages, connection establishment and request reception, handled by two kinds of threads: Listener and Reader. The entire Server has exactly one Listener thread, which listens for connection requests from all clients; when a new connection arrives, it picks a Reader thread from the pool in round-robin fashion to handle it. Multiple Reader threads can exist at once, each receiving RPC requests for a subset of the client connections. Which connections a Reader is responsible for is decided entirely by the Listener, which currently uses a simple round-robin assignment.
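The Listener's round-robin assignment can be captured in a tiny sketch. This is my own illustration, not Hadoop code: the class and method names are invented, but the rotation mirrors ipc.Server's getReader() shown later.

```java
// Hypothetical sketch of the Listener's round-robin Reader selection.
// ReaderRoundRobin and nextReader() are illustrative names; only the
// (currentReader + 1) % readThreads rotation comes from ipc.Server.
class ReaderRoundRobin {
    private final int readThreads;   // size of the Reader pool
    private int currentReader = 0;   // index of the last Reader used

    ReaderRoundRobin(int readThreads) {
        this.readThreads = readThreads;
    }

    // Advance to the next Reader, wrapping around at the pool size.
    int nextReader() {
        currentReader = (currentReader + 1) % readThreads;
        return currentReader;
    }
}
```

With three Readers, successive connections go to Readers 1, 2, 0, 1, ... so the load spreads evenly without any bookkeeping beyond one counter.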

The Listener and each Reader thread own a Selector object, used to wait for SelectionKey.OP_ACCEPT and SelectionKey.OP_READ events respectively. The Listener's main loop checks whether a new connection request has arrived and picks a Reader in round-robin order to handle it; a Reader's main loop checks whether a new RPC request has arrived on any of the connections it owns, wraps the request in a Call object, and puts it on the shared callQueue. We have not met SelectionKey.OP_WRITE yet; it appears later.
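Before diving into the real code, here is a runnable toy that exercises exactly these two event kinds, OP_ACCEPT on the server channel and OP_READ on accepted channels. It is my own minimal sketch, not Hadoop code, and it uses a single selector and a single thread for brevity, whereas ipc.Server splits the two events across the Listener's selector and the Readers' selectors.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Toy accept+read server: one selector handles both OP_ACCEPT (the
// Listener's doAccept() role) and OP_READ (the Reader's doRead() role).
class ToyAcceptReadServer implements Runnable {
    final Selector selector;
    final ServerSocketChannel acceptChannel;
    final StringBuffer received = new StringBuffer(); // StringBuffer: thread-safe

    ToyAcceptReadServer() throws IOException {
        acceptChannel = ServerSocketChannel.open();
        acceptChannel.configureBlocking(false);
        acceptChannel.bind(new InetSocketAddress("127.0.0.1", 0)); // ephemeral port
        selector = Selector.open();
        acceptChannel.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() {
        return acceptChannel.socket().getLocalPort();
    }

    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                selector.select(200);
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (!key.isValid()) continue;
                    if (key.isAcceptable()) {          // Listener: accept, then
                        SocketChannel ch = acceptChannel.accept();
                        ch.configureBlocking(false);   // register for reads
                        ch.register(selector, SelectionKey.OP_READ);
                    } else if (key.isReadable()) {     // Reader: drain the bytes
                        SocketChannel ch = (SocketChannel) key.channel();
                        ByteBuffer buf = ByteBuffer.allocate(64);
                        int n = ch.read(buf);
                        if (n < 0) { key.cancel(); ch.close(); continue; }
                        buf.flip();
                        while (buf.hasRemaining())
                            received.append((char) buf.get());
                    }
                }
            }
        } catch (IOException e) {
            // toy server: ignore errors on shutdown
        }
    }
}
```

A client that connects and writes "hello" will see the bytes land in received; in ipc.Server the analogous read path instead parses the bytes into a Call and queues it.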

Key point 1: see acceptChannel.register(selector, SelectionKey.OP_ACCEPT) in the constructor of Listener.java.

Key point 2:

1. In Listener.java -> run():

if (key.isValid()) {
  if (key.isAcceptable())
    doAccept(key);
}

2. doAccept() -> registerChannel():

public synchronized SelectionKey registerChannel(SocketChannel channel)
    throws IOException {
  return channel.register(readSelector, SelectionKey.OP_READ);
}

Key point 3: in Listener.java's doAccept(),

1. After the connection is established, the accepted channel's parameters are initialized, much as in the earlier NIOServer.java example:

try {
  reader.startAdd();
  SelectionKey readKey = reader.registerChannel(channel);
  c = new Connection(readKey, channel, System.currentTimeMillis());
  readKey.attach(c);
  synchronized (connectionList) {
    connectionList.add(numConnections, c);
    numConnections++;
  }
  if (LOG.isDebugEnabled())
    LOG.debug("Server connection from " + c.toString() +
        "; # active connections: " + numConnections +
        "; # queued calls: " + callQueue.size());
} finally {
  reader.finishAdd();
}

2. What the Listener does:

1) wake up the Reader's selector;

2) register the SelectionKey.OP_READ event;

3) create a new Connection and store it in connectionList.

The Listener therefore manages every client connection.

Key point 4: how connections are closed, in Listener.java's run() method:

synchronized (this) {
  try {
    acceptChannel.close();
    selector.close();
  } catch (IOException e) { }

  selector = null;
  acceptChannel = null;

  // clean up all connections
  while (!connectionList.isEmpty()) {
    closeConnection(connectionList.remove(0));
  }
}

Now for Reader.java.

Key point 5: first look at the Reader thread's run() method:

public void run() {
  LOG.info("Starting SocketReader");
  synchronized (this) {
    while (running) {
      SelectionKey key = null;
      try {
        readSelector.select();
        while (adding) {
          this.wait(1000);
        }

        Iterator<SelectionKey> iter = readSelector.selectedKeys().iterator();
        while (iter.hasNext()) {
          key = iter.next();
          iter.remove();
          if (key.isValid()) {
            if (key.isReadable()) {
              doRead(key);
            }
          }
          key = null;
        }
      } catch (InterruptedException e) {
        if (running) {                      // unexpected -- log it
          LOG.info(getName() + " caught: " +
                   StringUtils.stringifyException(e));
        }
      } catch (IOException ex) {
        LOG.error("Error in Reader", ex);
      }
    }
  }
}


In Reader.java's run():

1. doRead() is triggered by read events precisely because Listener.java registered the SelectionKey.OP_READ event.

2. Note how the adding flag is used:

volatile boolean adding = false;

// in run():
while (adding) {
  this.wait(1000);
}

public void startAdd() {
  adding = true;
  readSelector.wakeup();
}

public synchronized void finishAdd() {
  adding = false;
  this.notify();
}
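The point of this handshake is that channel.register() can block while another thread is inside select() on the same selector, so the registering thread must first flip adding and call wakeup(). The following standalone toy (my own sketch, not Hadoop code; WakeupDemo and demo() are invented names, and a Pipe stands in for an accepted SocketChannel) acts out the startAdd()/register()/finishAdd() sequence:

```java
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

// Toy version of the startAdd()/finishAdd() handshake: a poller thread
// sits in select(); the registering thread sets adding, wakes the
// selector, registers its channel, then clears adding and notifies.
class WakeupDemo {
    static volatile boolean adding = false;
    static final Object lock = new Object();

    static boolean demo() throws Exception {
        Selector sel = Selector.open();
        Thread poller = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    sel.select();                       // blocks until wakeup()
                    synchronized (lock) {
                        while (adding) lock.wait(1000); // mimic while(adding) this.wait(1000)
                    }
                }
            } catch (Exception e) { /* shutting down */ }
        });
        poller.start();

        Pipe pipe = Pipe.open();                        // stand-in for an accepted channel
        pipe.source().configureBlocking(false);

        adding = true;                                  // startAdd()
        sel.wakeup();                                   // kick the poller out of select()
        SelectionKey key = pipe.source().register(sel, SelectionKey.OP_READ);
        synchronized (lock) {                           // finishAdd()
            adding = false;
            lock.notify();
        }

        poller.interrupt();
        return key.isValid();                           // registration completed
    }
}
```

Without the wakeup() the register() call could stall until the next I/O event happens to end the select(); with it, registration completes promptly while the poller parks in the adding loop.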

The Listener and Reader code follows:

In my view, Listener and Reader together play the role of the WorkThread in the earlier BIO example, except that the WorkThread design starts one monitoring thread per worker, whereas here a single Listener thread does all the listening. The Reader also decouples the Listener from the Handlers: after reading, a Reader never calls a Handler directly; it just puts the Call on the callQueue.

/** Listens on the socket. Creates jobs for the handler threads */
private class Listener extends Thread {

  private ServerSocketChannel acceptChannel = null; //the accept channel
  private Selector selector = null; //the selector that we use for the server
  private Reader[] readers = null;
  private int currentReader = 0;
  private InetSocketAddress address; //the address we bind at
  private Random rand = new Random();
  private long lastCleanupRunTime = 0; //the last time when a cleanup connec-
                                       //-tion (for idle connections) ran
  private long cleanupInterval = 10000; //the minimum interval between
                                        //two cleanup runs
  private int backlogLength = conf.getInt("ipc.server.listen.queue.size", 128);
  private ExecutorService readPool;

  public Listener() throws IOException {
    address = new InetSocketAddress(bindAddress, port);
    // Create a new server socket and set to non blocking mode
    acceptChannel = ServerSocketChannel.open();
    acceptChannel.configureBlocking(false);

    // Bind the server socket to the local host and port
    bind(acceptChannel.socket(), address, backlogLength);
    port = acceptChannel.socket().getLocalPort(); //Could be an ephemeral port
    // create a selector;
    selector = Selector.open();
    readers = new Reader[readThreads];
    readPool = Executors.newFixedThreadPool(readThreads);
    for (int i = 0; i < readThreads; i++) {
      // multiple Reader threads may exist; the Listener hands new
      // connections to them in round-robin order
      Selector readSelector = Selector.open();
      Reader reader = new Reader(readSelector);
      readers[i] = reader;
      readPool.execute(reader);
    }

    // Register accepts on the server socket with the selector.
    acceptChannel.register(selector, SelectionKey.OP_ACCEPT);
    this.setName("IPC Server listener on " + port);
    this.setDaemon(true);
  }

  private class Reader implements Runnable {
    private volatile boolean adding = false;
    private Selector readSelector = null;

    Reader(Selector readSelector) {
      this.readSelector = readSelector;
    }

    public void run() {
      LOG.info("Starting SocketReader");
      synchronized (this) {
        while (running) {
          SelectionKey key = null;
          try {
            readSelector.select();
            while (adding) {
              this.wait(1000);
            }

            Iterator<SelectionKey> iter = readSelector.selectedKeys().iterator();
            while (iter.hasNext()) {
              key = iter.next();
              iter.remove();
              if (key.isValid()) {
                if (key.isReadable()) {
                  doRead(key);
                }
              }
              key = null;
            }
          } catch (InterruptedException e) {
            if (running) {                      // unexpected -- log it
              LOG.info(getName() + " caught: " +
                       StringUtils.stringifyException(e));
            }
          } catch (IOException ex) {
            LOG.error("Error in Reader", ex);
          }
        }
      }
    }

    /**
     * This gets reader into the state that waits for the new channel
     * to be registered with readSelector. If it was waiting in select()
     * the thread will be woken up, otherwise whenever select() is called
     * it will return even if there is nothing to read and wait
     * in while(adding) for finishAdd call
     */
    public void startAdd() {
      adding = true;
      readSelector.wakeup();
    }

    public synchronized SelectionKey registerChannel(SocketChannel channel)
                                                        throws IOException {
      return channel.register(readSelector, SelectionKey.OP_READ);
    }

    public synchronized void finishAdd() {
      adding = false;
      this.notify();
    }
  }

  /** cleanup connections from connectionList. Choose a random range
   * to scan and also have a limit on the number of the connections
   * that will be cleanedup per run. The criteria for cleanup is the time
   * for which the connection was idle. If 'force' is true then all
   * connections will be looked at for the cleanup.
   */
  private void cleanupConnections(boolean force) {
    if (force || numConnections > thresholdIdleConnections) {
      long currentTime = System.currentTimeMillis();
      if (!force && (currentTime - lastCleanupRunTime) < cleanupInterval) {
        return;
      }
      int start = 0;
      int end = numConnections - 1;
      if (!force) {
        start = rand.nextInt() % numConnections;
        end = rand.nextInt() % numConnections;
        int temp;
        if (end < start) {
          temp = start;
          start = end;
          end = temp;
        }
      }
      int i = start;
      int numNuked = 0;
      while (i <= end) {
        Connection c;
        synchronized (connectionList) {
          try {
            c = connectionList.get(i);
          } catch (Exception e) { return; }
        }
        if (c.timedOut(currentTime)) {
          if (LOG.isDebugEnabled())
            LOG.debug(getName() + ": disconnecting client " + c.getHostAddress());
          closeConnection(c);
          numNuked++;
          end--;
          c = null;
          if (!force && numNuked == maxConnectionsToNuke) break;
        }
        else i++;
      }
      lastCleanupRunTime = System.currentTimeMillis();
    }
  }

  @Override
  public void run() {
    LOG.info(getName() + ": starting");
    SERVER.set(Server.this);
    while (running) {
      SelectionKey key = null;
      try {
        selector.select();
        Iterator<SelectionKey> iter = selector.selectedKeys().iterator();
        while (iter.hasNext()) {
          key = iter.next();
          iter.remove();
          try {
            if (key.isValid()) {
              if (key.isAcceptable())
                doAccept(key);
            }
          } catch (IOException e) {
          }
          key = null;
        }
      } catch (OutOfMemoryError e) {
        // we can run out of memory if we have too many threads
        // log the event and sleep for a minute and give
        // some thread(s) a chance to finish
        LOG.warn("Out of Memory in server select", e);
        closeCurrentConnection(key, e);
        cleanupConnections(true);
        try { Thread.sleep(60000); } catch (Exception ie) {}
      } catch (Exception e) {
        closeCurrentConnection(key, e);
      }
      cleanupConnections(false);
    }
    LOG.info("Stopping " + this.getName());

    synchronized (this) {
      try {
        acceptChannel.close();
        selector.close();
      } catch (IOException e) { }

      selector = null;
      acceptChannel = null;

      // clean up all connections
      while (!connectionList.isEmpty()) {
        closeConnection(connectionList.remove(0));
      }
    }
  }

  private void closeCurrentConnection(SelectionKey key, Throwable e) {
    if (key != null) {
      Connection c = (Connection)key.attachment();
      if (c != null) {
        if (LOG.isDebugEnabled())
          LOG.debug(getName() + ": disconnecting client " + c.getHostAddress());
        closeConnection(c);
        c = null;
      }
    }
  }

  InetSocketAddress getAddress() {
    return (InetSocketAddress)acceptChannel.socket().getLocalSocketAddress();
  }

  void doAccept(SelectionKey key) throws IOException, OutOfMemoryError {
    Connection c = null;
    ServerSocketChannel server = (ServerSocketChannel) key.channel();
    SocketChannel channel;
    while ((channel = server.accept()) != null) {
      channel.configureBlocking(false);
      channel.socket().setTcpNoDelay(tcpNoDelay);
      Reader reader = getReader();
      try {
        reader.startAdd();
        SelectionKey readKey = reader.registerChannel(channel);
        c = new Connection(readKey, channel, System.currentTimeMillis());
        readKey.attach(c);
        synchronized (connectionList) {
          connectionList.add(numConnections, c);
          numConnections++;
        }
        if (LOG.isDebugEnabled())
          LOG.debug("Server connection from " + c.toString() +
              "; # active connections: " + numConnections +
              "; # queued calls: " + callQueue.size());
      } finally {
        reader.finishAdd();
      }
    }
  }

  void doRead(SelectionKey key) throws InterruptedException {
    int count = 0;
    Connection c = (Connection)key.attachment();
    if (c == null) {
      return;
    }
    c.setLastContact(System.currentTimeMillis());

    try {
      count = c.readAndProcess();
    } catch (InterruptedException ieo) {
      LOG.info(getName() + ": readAndProcess caught InterruptedException", ieo);
      throw ieo;
    } catch (Exception e) {
      LOG.info(getName() + ": readAndProcess threw exception " + e +
          ". Count of bytes read: " + count, e);
      count = -1; //so that the (count < 0) block is executed
    }
    if (count < 0) {
      if (LOG.isDebugEnabled())
        LOG.debug(getName() + ": disconnecting client " +
                  c + ". Number of active connections: " +
                  numConnections);
      closeConnection(c);
      c = null;
    }
    else {
      c.setLastContact(System.currentTimeMillis());
    }
  }

  synchronized void doStop() {
    if (selector != null) {
      selector.wakeup();
      Thread.yield();
    }
    if (acceptChannel != null) {
      try {
        acceptChannel.socket().close();
      } catch (IOException e) {
        LOG.info(getName() + ":Exception in closing listener socket. " + e);
      }
    }
    readPool.shutdown();
  }

  // The method that will return the next reader to work with
  // Simplistic implementation of round robin for now
  Reader getReader() {
    currentReader = (currentReader + 1) % readers.length;
    return readers[currentReader];
  }
}
(2) Processing requests
This stage takes Call objects off the shared callQueue, performs the corresponding function calls, and returns the results to the clients; it is carried out entirely by Handler threads.
The Server can run many Handler threads at once. They pull Call objects from the shared queue in parallel, execute the corresponding function call, and then try to send the result straight back to the client (setupResponse). However, some calls return very large results, or the network may be slow, so the result may not fit in a single send; in that case the Handler hands the remaining send work to the Responder thread (responder.doRespond(call)).

/** Handles queued calls. */
private class Handler extends Thread {
  public Handler(int instanceNumber) {
    this.setDaemon(true);
    this.setName("IPC Server handler " + instanceNumber + " on " + port);
  }

  @Override
  public void run() {
    LOG.info(getName() + ": starting");
    SERVER.set(Server.this);
    ByteArrayOutputStream buf =
      new ByteArrayOutputStream(INITIAL_RESP_BUF_SIZE);
    while (running) {
      try {
        final Call call = callQueue.take(); // pop the queue; maybe blocked here
        if (LOG.isDebugEnabled())
          LOG.debug(getName() + ": has #" + call.id + " from " +
                    call.connection);

        String errorClass = null;
        String error = null;
        Writable value = null;

        CurCall.set(call);
        try {
          // Make the call as the user via Subject.doAs, thus associating
          // the call with the Subject
          value = call(call.connection.protocol, call.param,
                       call.timestamp);
        } catch (Throwable e) {
          LOG.info(getName() + ", call " + call + ": error: " + e, e);
          errorClass = e.getClass().getName();
          error = StringUtils.stringifyException(e);
        }
        CurCall.set(null);
        synchronized (call.connection.responseQueue) {
          // setupResponse() needs to be sync'ed together with
          // responder.doResponse() since setupResponse may use
          // SASL to encrypt response data and SASL enforces
          // its own message ordering.
          setupResponse(buf, call,
                        (error == null) ? Status.SUCCESS : Status.ERROR,
                        value, errorClass, error);
          // Discard the large buf and reset it back to
          // smaller size to freeup heap
          if (buf.size() > maxRespSize) {
            LOG.warn("Large response size " + buf.size() + " for call " +
                     call.toString());
            buf = new ByteArrayOutputStream(INITIAL_RESP_BUF_SIZE);
          }
          responder.doRespond(call);
        }
      } catch (InterruptedException e) {
        if (running) {                          // unexpected -- log it
          LOG.info(getName() + " caught: " +
                   StringUtils.stringifyException(e));
        }
      } catch (Exception e) {
        LOG.info(getName() + " caught: " +
                 StringUtils.stringifyException(e));
      }
    }
    LOG.info(getName() + ": exiting");
  }
}
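Stripped of RPC details, the Handler stage is just a pool of threads draining a BlockingQueue. The miniature below is my own sketch, not Hadoop code: MiniHandlers, Call, dispatch(), and the responses map are invented stand-ins for ipc.Server's Handler, Call, call(...), and responder.doRespond(call).

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

// Miniature of the Handler stage: N threads drain a shared callQueue,
// "execute" each call, and record the result where a responder could
// pick it up.
class MiniHandlers {
    static final class Call {
        final int id;
        final String param;
        Call(int id, String param) { this.id = id; this.param = param; }
    }

    final BlockingQueue<Call> callQueue = new LinkedBlockingQueue<>();
    final ConcurrentMap<Integer, String> responses = new ConcurrentHashMap<>();
    final ExecutorService handlers;

    MiniHandlers(int handlerCount) {
        handlers = Executors.newFixedThreadPool(handlerCount);
        for (int i = 0; i < handlerCount; i++) {
            handlers.execute(() -> {
                try {
                    while (true) {
                        Call call = callQueue.take();           // like callQueue.take()
                        responses.put(call.id, dispatch(call)); // like responder.doRespond(call)
                    }
                } catch (InterruptedException e) {
                    // shutdown: fall through and let the thread exit
                }
            });
        }
    }

    // Stand-in for the real function dispatch, call(protocol, param, timestamp)
    static String dispatch(Call call) {
        return call.param.toUpperCase();
    }

    void shutdown() {
        handlers.shutdownNow();
    }
}
```

Because the queue is the only coupling point, Readers can enqueue Calls without knowing how many Handlers exist, which is exactly the decoupling the article attributes to callQueue.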

(3) Returning results
As noted above, each Handler thread tries to send the result back to the client after the function call completes; in special cases, such as an oversized result or network trouble (a slow link), it hands the send off to the Responder thread.
The Server has exactly one Responder thread, which owns a Selector used to wait for SelectionKey.OP_WRITE events. When a Handler fails to send a complete result in one go, it registers a SelectionKey.OP_WRITE event with that Selector, and the Responder thread then continues sending the unfinished result asynchronously.

Key point: the Responder's processResponse() method uses SelectionKey.OP_WRITE, as follows:

            incPending();
              try {
                // Wakeup the thread blocked on select, only then can the call 
                // to channel.register() complete.
                writeSelector.wakeup();
                channel.register(writeSelector, SelectionKey.OP_WRITE, call);
              } catch (ClosedChannelException e) {
                //Its ok. channel might be closed else where.
                done = true;
              } finally {
                decPending();
              }
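The Handler-to-Responder handoff can be modeled without sockets. The sketch below is my own illustration, not Hadoop code: Sink, tryRespond(), and drain() are invented names, and the Sink interface stands in for SocketChannel.write(). It shows the essential behavior, one inline write attempt, then repeated writes as simulated OP_WRITE events until the buffer drains.

```java
import java.nio.ByteBuffer;

// Model of the Responder's partial-write logic: the Handler tries one
// write; if bytes remain, the pending buffer would be registered for
// OP_WRITE, and the Responder drains it over later select() rounds.
class PartialWriteModel {
    interface Sink {                       // stand-in for SocketChannel.write
        int write(ByteBuffer src);
    }

    // One Handler-style attempt: true means the response went out inline.
    static boolean tryRespond(Sink channel, ByteBuffer response) {
        channel.write(response);
        return !response.hasRemaining();   // else: would register OP_WRITE
    }

    // Responder-style loop: keep writing as OP_WRITE "fires"; returns how
    // many extra rounds were needed.
    static int drain(Sink channel, ByteBuffer response) {
        int rounds = 0;
        while (response.hasRemaining()) {
            channel.write(response);
            rounds++;
        }
        return rounds;
    }
}
```

With a sink that accepts at most 4 bytes per call and a 10-byte response, the inline attempt leaves 6 bytes pending and the drain loop needs two more rounds, which is precisely the situation where ipc.Server registers OP_WRITE instead of spinning in the Handler.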

         The Responder shows that responses are handled per connection, and the connection is closed once processing is done. Multiple calls from the same user, over the same protocol, to the same remote endpoint share a single connection; the previous post, "High-performance NIO communication in RPC (Hadoop RPC implementation, part 1): the Server-side output", confirms this.
