ZooKeeper Server-Side Thread Model Analysis

Source: Internet | Edited by: 程序博客网 | Time: 2024/06/14 08:59

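What does the ZooKeeper server's thread model look like? I spent the past few days reading the source code, and this post summarizes and analyzes what I found.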

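I. Thread Model

Figure 1

Figure 1 shows the thread model used by the server. A single thread (AcceptThread) handles connection requests from clients and, once a connection is established, puts it into a queue (acceptedQueue). A SelectorThread then takes the connection off its queue and monitors it. As soon as the connection is ready for I/O (readable or writable), it is handed to a worker thread for processing.
PS: Each SelectorThread has its own acceptedQueue. When AcceptThread accepts a connection, it uses round-robin to assign the connection to the acceptedQueue of one of the SelectorThreads.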
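
The round-robin assignment described above can be sketched as follows. This is a minimal illustration, not ZooKeeper code: the class name RoundRobinDispatcher and its methods are hypothetical, and plain queues stand in for each SelectorThread's acceptedQueue.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical illustration of round-robin dispatch; not ZooKeeper code. */
final class RoundRobinDispatcher<T> {
    // One queue per (imaginary) selector thread.
    private final List<Queue<T>> queues = new ArrayList<Queue<T>>();
    private Iterator<Queue<T>> iterator;

    RoundRobinDispatcher(int selectorCount) {
        if (selectorCount < 1) {
            throw new IllegalArgumentException("selectorCount must be at least 1");
        }
        for (int i = 0; i < selectorCount; i++) {
            queues.add(new LinkedBlockingQueue<T>());
        }
        iterator = queues.iterator();
    }

    /** Puts the connection on the next queue, wrapping around at the end. */
    boolean dispatch(T connection) {
        if (!iterator.hasNext()) {
            iterator = queues.iterator();
        }
        return iterator.next().offer(connection);
    }

    /** Number of connections currently waiting in the given queue. */
    int queueSize(int index) {
        return queues.get(index).size();
    }
}
```

With three queues and seven dispatched connections, the first queue ends up holding three connections and the other two hold two each. Only one thread calls dispatch in this model, which is why the iterator needs no locking.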

II. Class Diagram


Figure 2

After reading the source, I put together the class diagram shown in Figure 2.

III. Source Code Analysis

1) Startup process

Main entry class: org.apache.zookeeper.server.ZooKeeperServerMain

    public void runFromConfig(ServerConfig config)
            throws IOException, AdminServerException {
        LOG.info("Starting server");
        FileTxnSnapLog txnLog = null;
        try {
            // Note that this thread isn't going to be doing anything else,
            // so rather than spawning another thread, we will just call
            // run() in this thread.
            // create a file logger url from the command line args
            txnLog = new FileTxnSnapLog(config.dataLogDir, config.dataDir);
            final ZooKeeperServer zkServer = new ZooKeeperServer(txnLog,
                    config.tickTime, config.minSessionTimeout, config.maxSessionTimeout, null);

            // Registers shutdown handler which will be used to know the
            // server error or shutdown state changes.
            final CountDownLatch shutdownLatch = new CountDownLatch(1);
            zkServer.registerServerShutdownHandler(
                    new ZooKeeperServerShutdownHandler(shutdownLatch));
            ........
            boolean needStartZKServer = true;
            if (config.getClientPortAddress() != null) {
                cnxnFactory = ServerCnxnFactory.createFactory();
                cnxnFactory.configure(config.getClientPortAddress(), config.getMaxClientCnxns(), false);
                cnxnFactory.startup(zkServer);
                // zkServer has been started. So we don't need to start it again in secureCnxnFactory.
                needStartZKServer = false;
            }
            ........
            // Watch status of ZooKeeper server. It will do a graceful shutdown
            // if the server is not running or hits an internal error.
            shutdownLatch.await();
            shutdown();
            if (cnxnFactory != null) {
                cnxnFactory.join();
            }
            ......
            if (zkServer.canShutdown()) {
                zkServer.shutdown(true);
            }
        } catch (InterruptedException e) {
            // warn, but generally this is ok
            LOG.warn("Server interrupted", e);
        } finally {
            if (txnLog != null) {
                txnLog.close();
            }
        }
    }

Class: org.apache.zookeeper.server.ServerCnxnFactory

    public void startup(ZooKeeperServer zkServer) throws IOException, InterruptedException {
        startup(zkServer, true);
    }

Class: org.apache.zookeeper.server.NIOServerCnxnFactory

    public void configure(InetSocketAddress addr, int maxcc, boolean secure) throws IOException {
        ......
        maxClientCnxns = maxcc;
        sessionlessCnxnTimeout = Integer.getInteger(
            ZOOKEEPER_NIO_SESSIONLESS_CNXN_TIMEOUT, 10000);
        // We also use the sessionlessCnxnTimeout as expiring interval for
        // cnxnExpiryQueue. These don't need to be the same, but the expiring
        // interval passed into the ExpiryQueue() constructor below should be
        // less than or equal to the timeout.
        cnxnExpiryQueue =
            new ExpiryQueue<NIOServerCnxn>(sessionlessCnxnTimeout);
        expirerThread = new ConnectionExpirerThread();

        int numCores = Runtime.getRuntime().availableProcessors();
        // 32 cores sweet spot seems to be 4 selector threads
        numSelectorThreads = Integer.getInteger(
            ZOOKEEPER_NIO_NUM_SELECTOR_THREADS,
            Math.max((int) Math.sqrt((float) numCores/2), 1));
        if (numSelectorThreads < 1) {
            throw new IOException("numSelectorThreads must be at least 1");
        }

        numWorkerThreads = Integer.getInteger(
            ZOOKEEPER_NIO_NUM_WORKER_THREADS, 2 * numCores);
        workerShutdownTimeoutMS = Long.getLong(
            ZOOKEEPER_NIO_SHUTDOWN_TIMEOUT, 5000);

        LOG.info("Configuring NIO connection handler with "
                 + (sessionlessCnxnTimeout/1000) + "s sessionless connection"
                 + " timeout, " + numSelectorThreads + " selector thread(s), "
                 + (numWorkerThreads > 0 ? numWorkerThreads : "no")
                 + " worker threads, and "
                 + (directBufferBytes == 0 ? "gathered writes." :
                    ("" + (directBufferBytes/1024) + " kB direct buffers.")));
        for(int i=0; i<numSelectorThreads; ++i) {
            selectorThreads.add(new SelectorThread(i));
        }

        this.ss = ServerSocketChannel.open();
        ss.socket().setReuseAddress(true);
        LOG.info("binding to port " + addr);
        ss.socket().bind(addr);
        ss.configureBlocking(false);
        acceptThread = new AcceptThread(ss, addr, selectorThreads);
    }

    public void startup(ZooKeeperServer zks, boolean startServer) throws IOException, InterruptedException {
        start();
        setZooKeeperServer(zks);
        if (startServer) {
            zks.startdata();
            zks.startup();
        }
    }

    public void start() {
        stopped = false;
        if (workerPool == null) {
            workerPool = new WorkerService(
                "NIOWorker", numWorkerThreads, false);
        }
        for(SelectorThread thread : selectorThreads) {
            if (thread.getState() == Thread.State.NEW) {
                thread.start();
            }
        }
        // ensure thread is started once and only once
        if (acceptThread.getState() == Thread.State.NEW) {
            acceptThread.start();
        }
        if (expirerThread.getState() == Thread.State.NEW) {
            expirerThread.start();
        }
    }

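Based on the source, server startup consists of the following steps:
① Read the server configuration, i.e. the familiar zoo.cfg file.
② Construct the ZooKeeperServer instance.
③ Construct the NIOServerCnxnFactory instance, then call its configure and startup methods in that order.
④ NIOServerCnxnFactory.configure sets the server-side parameters, chiefly the maximum number of connections per client IP address (maxClientCnxns), the timeout for connections that have not yet established a session (sessionlessCnxnTimeout, default 10000 ms), the number of selector threads, and the number of worker threads (default 2 × the number of cores). It also binds the listening port and constructs the SelectorThread instances and the AcceptThread.
⑤ NIOServerCnxnFactory.startup ends up calling org.apache.zookeeper.server.NIOServerCnxnFactory.start, which creates the worker pool (WorkerService) with the configured number of worker threads, then starts the selector threads, the single accept thread, and the connection expirer thread.
PS: The number of selector threads is computed as numSelectorThreads = Integer.getInteger(ZOOKEEPER_NIO_NUM_SELECTOR_THREADS, Math.max((int) Math.sqrt((float) numCores/2), 1)). If the system property zookeeper.nio.numSelectorThreads is set, that value is used. Otherwise the default is the square root of half the number of cores, truncated to an integer, with a minimum of 1.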
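
To make the defaults concrete, the sketch below evaluates the same expressions for a few core counts. The class and method names (SelectorThreadDefaults, defaultSelectorThreads, defaultWorkerThreads) are hypothetical helpers written for this post. Only the two formulas and the property name zookeeper.nio.numSelectorThreads come from the source above.

```java
/** Hypothetical helper that replays the default thread-count formulas. */
final class SelectorThreadDefaults {
    // Property name as given in the text above.
    static final String NUM_SELECTOR_THREADS_PROPERTY = "zookeeper.nio.numSelectorThreads";

    /** Square root of half the cores, truncated, but never less than 1. */
    static int defaultSelectorThreads(int numCores) {
        return Math.max((int) Math.sqrt((float) numCores / 2), 1);
    }

    /** Worker threads default to twice the number of cores. */
    static int defaultWorkerThreads(int numCores) {
        return 2 * numCores;
    }

    /** The system property, if set, overrides the computed default. */
    static int effectiveSelectorThreads(int numCores) {
        return Integer.getInteger(NUM_SELECTOR_THREADS_PROPERTY,
                defaultSelectorThreads(numCores));
    }

    public static void main(String[] args) {
        for (int cores : new int[] {1, 2, 4, 8, 16, 32, 64}) {
            System.out.println(cores + " cores: "
                    + defaultSelectorThreads(cores) + " selector thread(s), "
                    + defaultWorkerThreads(cores) + " worker threads");
        }
    }
}
```

For 32 cores the formula gives 4 selector threads, which agrees with the "sweet spot" comment in configure. Machines with up to 7 cores get a single selector thread, because the square root of half the core count stays below 2.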

2) Connection handling

Class: org.apache.zookeeper.server.NIOServerCnxnFactory.AcceptThread

    public AcceptThread(ServerSocketChannel ss, InetSocketAddress addr, Set<SelectorThread> selectorThreads) throws IOException {
        super("NIOServerCxnFactory.AcceptThread:" + addr);
        this.acceptSocket = ss;
        this.acceptKey =
            acceptSocket.register(selector, SelectionKey.OP_ACCEPT);
        this.selectorThreads = Collections.unmodifiableList(
            new ArrayList<SelectorThread>(selectorThreads));
        selectorIterator = this.selectorThreads.iterator();
    }

    public void run() {
        try {
            while (!stopped && !acceptSocket.socket().isClosed()) {
                try {
                    select();
                } catch (RuntimeException e) {
                    LOG.warn("Ignoring unexpected runtime exception", e);
                } catch (Exception e) {
                    LOG.warn("Ignoring unexpected exception", e);
                }
            }
        } finally {
            closeSelector();
            // This will wake up the selector threads, and tell the
            // worker thread pool to begin shutdown.
            if (!reconfiguring) {
                NIOServerCnxnFactory.this.stop();
            }
            LOG.info("accept thread exitted run method");
        }
    }

    private void select() {
        try {
            selector.select();
            Iterator<SelectionKey> selectedKeys =
                selector.selectedKeys().iterator();
            while (!stopped && selectedKeys.hasNext()) {
                SelectionKey key = selectedKeys.next();
                selectedKeys.remove();
                if (!key.isValid()) {
                    continue;
                }
                if (key.isAcceptable()) {
                    if (!doAccept()) { // handle the connection request
                        // If unable to pull a new connection off the accept
                        // queue, pause accepting to give us time to free
                        // up file descriptors and so the accept thread
                        // doesn't spin in a tight loop.
                        pauseAccept(10);
                    }
                } else {
                    LOG.warn("Unexpected ops in accept select "
                             + key.readyOps());
                }
            }
        } catch (IOException e) {
            LOG.warn("Ignoring IOException while selecting", e);
        }
    }

    private boolean doAccept() {
        boolean accepted = false;
        SocketChannel sc = null;
        try {
            sc = acceptSocket.accept();
            accepted = true;
            InetAddress ia = sc.socket().getInetAddress();
            int cnxncount = getClientCnxnCount(ia);
            if (maxClientCnxns > 0 && cnxncount >= maxClientCnxns){
                throw new IOException("Too many connections from " + ia
                                      + " - max is " + maxClientCnxns );
            } // this check is skipped when maxClientCnxns is set to 0
            LOG.info("Accepted socket connection from "
                     + sc.socket().getRemoteSocketAddress());
            sc.configureBlocking(false);
            // Round-robin assign this connection to a selector thread
            if (!selectorIterator.hasNext()) {
                selectorIterator = selectorThreads.iterator();
            }
            SelectorThread selectorThread = selectorIterator.next();
            if (!selectorThread.addAcceptedConnection(sc)) { // hand the new connection to the SelectorThread's acceptedQueue
                throw new IOException(
                    "Unable to add connection to selector queue"
                    + (stopped ? " (shutdown in progress)" : ""));
            }
            acceptErrorLogger.flush();
        } catch (IOException e) {
            // accept, maxClientCnxns, configureBlocking
            acceptErrorLogger.rateLimitLog(
                "Error accepting new connection: " + e.getMessage());
            fastCloseSock(sc);
        }
        return accepted;
    }

Class: org.apache.zookeeper.server.NIOServerCnxnFactory.SelectorThread

    public void run() {
        try {
            while (!stopped) {
                try {
                    select();
                    processAcceptedConnections();
                    processInterestOpsUpdateRequests();
                } catch (RuntimeException e) {
                    LOG.warn("Ignoring unexpected runtime exception", e);
                } catch (Exception e) {
                    LOG.warn("Ignoring unexpected exception", e);
                }
            }
            ......
        } finally {
            closeSelector();
            // This will wake up the accept thread and the other selector
            // threads, and tell the worker thread pool to begin shutdown.
            NIOServerCnxnFactory.this.stop();
            LOG.info("selector thread exitted run method");
        }
    }

    private void select() {
        try {
            selector.select();
            Set<SelectionKey> selected = selector.selectedKeys();
            ArrayList<SelectionKey> selectedList =
                new ArrayList<SelectionKey>(selected);
            Collections.shuffle(selectedList);
            Iterator<SelectionKey> selectedKeys = selectedList.iterator();
            while(!stopped && selectedKeys.hasNext()) {
                SelectionKey key = selectedKeys.next();
                selected.remove(key);
                if (!key.isValid()) {
                    cleanupSelectionKey(key);
                    continue;
                }
                if (key.isReadable() || key.isWritable()) {
                    handleIO(key); // handle the I/O operation
                } else {
                    LOG.warn("Unexpected ops in select " + key.readyOps());
                }
            }
        } catch (IOException e) {
            LOG.warn("Ignoring IOException while selecting", e);
        }
    }

    private void handleIO(SelectionKey key) {
        IOWorkRequest workRequest = new IOWorkRequest(this, key);
        NIOServerCnxn cnxn = (NIOServerCnxn) key.attachment();
        // Stop selecting this key while processing on its
        // connection
        cnxn.disableSelectable();
        key.interestOps(0);
        touchCnxn(cnxn);
        workerPool.schedule(workRequest); // process the request on a worker thread
    }

    private void processAcceptedConnections() {
        SocketChannel accepted;
        while (!stopped && (accepted = acceptedQueue.poll()) != null) {
            SelectionKey key = null;
            try {
                key = accepted.register(selector, SelectionKey.OP_READ);
                NIOServerCnxn cnxn = createConnection(accepted, key, this);
                key.attach(cnxn);
                addCnxn(cnxn);
            } catch (IOException e) {
                // register, createConnection
                cleanupSelectionKey(key);
                fastCloseSock(accepted);
            }
        }
    }

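From the source, connection handling breaks down into the following steps:
① When the AcceptThread instance is constructed, its constructor stores the list of SelectorThreads and registers the ServerSocketChannel with a multiplexer (Selector) for OP_ACCEPT events.
② While AcceptThread runs, it calls AcceptThread.select. Whenever a connection is waiting to be accepted, select calls AcceptThread.doAccept, which checks the per-IP connection limit and then uses round-robin scheduling to place the new connection on a SelectorThread's acceptedQueue, after which that SelectorThread is notified to process it.
③ SelectorThread.processAcceptedConnections takes each accepted connection off acceptedQueue, registers it with the SelectorThread's own Selector, and listens for read events (OP_READ).
④ The SelectorThread's Selector keeps running. When a connection registered with it becomes readable or writable (see SelectorThread.select), SelectorThread.handleIO is called, which dispatches the connection to a worker thread by calling workerPool.schedule.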
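
The per-IP check in step ② can be illustrated with a small counter keyed by address. This is a sketch under assumptions, not the ZooKeeper implementation: PerAddressLimiter, tryAdmit, and release are hypothetical names, and the only behavior taken from doAccept is the condition maxClientCnxns > 0 && count >= maxClientCnxns.

```java
import java.net.InetAddress;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-address connection limiter; not ZooKeeper code. */
final class PerAddressLimiter {
    private final int maxClientCnxns;
    private final Map<InetAddress, Integer> counts = new HashMap<InetAddress, Integer>();

    PerAddressLimiter(int maxClientCnxns) {
        this.maxClientCnxns = maxClientCnxns;
    }

    /** Returns true and counts the connection unless the address is at its limit. */
    synchronized boolean tryAdmit(InetAddress address) {
        int current = counts.getOrDefault(address, 0);
        // A limit of 0 (or less) disables the check, as in the doAccept condition.
        if (maxClientCnxns > 0 && current >= maxClientCnxns) {
            return false;
        }
        counts.put(address, current + 1);
        return true;
    }

    /** Call when a connection from this address closes. */
    synchronized void release(InetAddress address) {
        // Returning null from the remapping function removes the entry.
        counts.computeIfPresent(address, (a, n) -> n > 1 ? n - 1 : null);
    }
}
```

With a limit of 2, the third simultaneous connection from one address is refused until an earlier one is released. With a limit of 0, every connection is admitted.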

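That concludes the analysis of the ZooKeeper server-side thread model. As the walkthrough shows, it is a textbook Reactor pattern. For a detailed introduction to the Reactor model, see the link below.
Introduction to the Reactor pattern: http://blog.csdn.net/zccracker/article/details/38686339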
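
To make the Reactor hand-off concrete, here is a heavily simplified sketch of a selector thread that takes channels from an accepted queue, registers them for reads, and passes ready keys to a worker pool. It is an illustration under assumptions, not ZooKeeper code: MiniSelectorThread and all of its members are hypothetical, a java.nio.channels.Pipe stands in for a client socket so that no network is needed, and only read events are handled. The updateQueue loosely mirrors the role of processInterestOpsUpdateRequests: workers never touch interest ops themselves and instead ask the selector thread to re-enable the key.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.*;

/** Hypothetical miniature of the SelectorThread hand-off; not ZooKeeper code. */
final class MiniSelectorThread extends Thread {
    private final Selector selector;
    private final Queue<Pipe.SourceChannel> acceptedQueue = new ConcurrentLinkedQueue<>();
    private final Queue<SelectionKey> updateQueue = new ConcurrentLinkedQueue<>();
    private final ExecutorService workerPool = Executors.newFixedThreadPool(2);
    final BlockingQueue<String> received = new LinkedBlockingQueue<>();
    private volatile boolean stopped = false;

    MiniSelectorThread() throws IOException {
        selector = Selector.open();
    }

    /** Accepting side: enqueue the new channel, then wake the selector. */
    void addAcceptedConnection(Pipe.SourceChannel channel) {
        acceptedQueue.add(channel);
        selector.wakeup();
    }

    @Override
    public void run() {
        try {
            while (!stopped) {
                selector.select();
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isValid() && key.isReadable()) {
                        key.interestOps(0); // stop selecting while a worker owns the key
                        workerPool.execute(() -> handleRead(key));
                    }
                }
                for (Pipe.SourceChannel c; (c = acceptedQueue.poll()) != null; ) {
                    c.configureBlocking(false);
                    c.register(selector, SelectionKey.OP_READ);
                }
                for (SelectionKey k; (k = updateQueue.poll()) != null; ) {
                    if (k.isValid()) k.interestOps(SelectionKey.OP_READ);
                }
            }
        } catch (IOException e) {
            stopped = true; // give up on I/O failure; a real server would log this
        }
    }

    /** Worker side: read once, hand the key back, then publish the payload. */
    private void handleRead(SelectionKey key) {
        try {
            ByteBuffer buffer = ByteBuffer.allocate(64);
            int n = ((Pipe.SourceChannel) key.channel()).read(buffer);
            if (n < 0) {
                key.channel().close(); // closing the channel also cancels its key
                return;
            }
            updateQueue.add(key);
            selector.wakeup();
            if (n > 0) received.add(new String(buffer.array(), 0, n, StandardCharsets.UTF_8));
        } catch (IOException e) {
            key.cancel();
        }
    }

    void shutdown() throws InterruptedException, IOException {
        stopped = true;
        selector.wakeup();
        join(); // no more work is submitted after the selector loop exits
        workerPool.shutdown();
        workerPool.awaitTermination(5, TimeUnit.SECONDS);
        selector.close();
    }

    /** Sends one message through a pipe and returns what the worker read. */
    static String demo() {
        try {
            MiniSelectorThread thread = new MiniSelectorThread();
            thread.start();
            Pipe pipe = Pipe.open();
            thread.addAcceptedConnection(pipe.source());
            pipe.sink().write(ByteBuffer.wrap("ping".getBytes(StandardCharsets.UTF_8)));
            String message = thread.received.poll(5, TimeUnit.SECONDS);
            thread.shutdown();
            pipe.sink().close();
            pipe.source().close();
            return message;
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling MiniSelectorThread.demo() pushes the string "ping" through the pipe and returns whatever the worker thread read. Setting the interest ops to 0 before scheduling the worker is the same trick handleIO uses: it keeps the selector from reporting the key again while a worker is still reading from it.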





