ZMQ Source Code Analysis (6): Encoders and Decoders

Source: Internet  Editor: 程序博客网  Time: 2024/06/05 06:23

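zmq's encoders and decoders work together with stream_engine to send and receive network data. ZMTP 3.0 uses v2_decoder and v2_encoder for this, so this article analyzes only that version.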

Decoder

In zmq, the v1 and v2 decoders both inherit from decoder_base_t, while raw_decoder inherits directly from i_decoder:

    template <typename T> class decoder_base_t : public i_decoder
    {
    public:

        inline decoder_base_t (size_t bufsize_) :
            next (NULL),
            read_pos (NULL),
            to_read (0),
            bufsize (bufsize_)
        {
            buf = (unsigned char*) malloc (bufsize_);
            alloc_assert (buf);
        }

        //  The destructor doesn't have to be virtual. It is made virtual
        //  just to keep ICC and code checking tools from complaining.
        inline virtual ~decoder_base_t ()
        {
            free (buf);
        }

        //  Returns a buffer to be filled with binary data.
        inline void get_buffer (unsigned char **data_, size_t *size_)
        {
            //  If we are expected to read large message, we'll opt for zero-
            //  copy, i.e. we'll ask caller to fill the data directly to the
            //  message. Note that subsequent read(s) are non-blocking, thus
            //  each single read reads at most SO_RCVBUF bytes at once not
            //  depending on how large is the chunk returned from here.
            //  As a consequence, large messages being received won't block
            //  other engines running in the same I/O thread for excessive
            //  amounts of time.
            if (to_read >= bufsize) {
                *data_ = read_pos;
                *size_ = to_read;
                return;
            }

            *data_ = buf;
            *size_ = bufsize;
        }

        //  Processes the data in the buffer previously allocated using
        //  get_buffer function. size_ argument specifies number of bytes
        //  actually filled into the buffer. Function returns 1 when the
        //  whole message was decoded or 0 when more data is required.
        //  On error, -1 is returned and errno set accordingly.
        //  Number of bytes processed is returned in bytes_used_.
        inline int decode (const unsigned char *data_, size_t size_,
                           size_t &bytes_used_)
        {
            bytes_used_ = 0;

            //  In case of zero-copy simply adjust the pointers, no copying
            //  is required. Also, run the state machine in case all the data
            //  were processed.
            if (data_ == read_pos) {
                zmq_assert (size_ <= to_read);
                read_pos += size_;
                to_read -= size_;
                bytes_used_ = size_;

                while (!to_read) {
                    const int rc = (static_cast <T*> (this)->*next) ();
                    if (rc != 0)
                        return rc;
                }
                return 0;
            }

            while (bytes_used_ < size_) {
                //  Copy the data from buffer to the message.
                const size_t to_copy = std::min (to_read, size_ - bytes_used_);
                memcpy (read_pos, data_ + bytes_used_, to_copy);
                read_pos += to_copy;
                to_read -= to_copy;
                bytes_used_ += to_copy;
                //  Try to get more space in the message to fill in.
                //  If none is available, return.
                while (to_read == 0) {
                    const int rc = (static_cast <T*> (this)->*next) ();
                    if (rc != 0)
                        return rc;
                }
            }

            return 0;
        }

    protected:

        //  Prototype of state machine action. Action should return false if
        //  it is unable to push the data to the system.
        typedef int (T::*step_t) ();

        //  This function should be called from derived class to read data
        //  from the buffer and schedule next state machine action.
        inline void next_step (void *read_pos_, size_t to_read_, step_t next_)
        {
            read_pos = (unsigned char*) read_pos_;
            to_read = to_read_;
            next = next_;
        }

    private:

        //  Next step. If set to NULL, it means that associated data stream
        //  is dead. Note that there can be still data in the process in such
        //  case.
        step_t next;

        //  Where to store the read data.
        unsigned char *read_pos;

        //  How much data to read before taking next step.
        size_t to_read;

        //  The buffer for data to decode.
        size_t bufsize;
        unsigned char *buf;

        decoder_base_t (const decoder_base_t&);
        const decoder_base_t &operator = (const decoder_base_t&);
    };

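The decoder's next member-function pointer drives a state machine. Each state action calls next_step, which resets read_pos and to_read: where the next bytes should be stored and how many bytes are needed. get_buffer returns a buffer that data can be read into, together with its size. For small reads it hands out the decoder's own buffer buf, whose size is bufsize. For large reads (to_read >= bufsize) it hands out read_pos itself, so data is read straight into the destination set by the last state action, which saves one copy. decode handles the same two cases. If the internal buffer was not used (data_ == read_pos), it only advances the pointers. Otherwise it copies the data from the buffer to read_pos. Whenever to_read reaches 0, all the data for the current state has arrived, so decode calls next to move to the following state, which resets read_pos and to_read.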
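The state-machine mechanism can be illustrated with a much smaller decoder. The following is a minimal sketch, not libzmq code: toy_decoder_base_t and toy_record_decoder_t are hypothetical names, the record format (one length byte followed by the payload) is invented for the example, and only the copy path of decode is reproduced (no internal buffer, no zero-copy branch).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

//  Toy counterpart of decoder_base_t (hypothetical): copy path only.
template <typename T> class toy_decoder_base_t
{
public:
    toy_decoder_base_t () : next (NULL), read_pos (NULL), to_read (0) {}

    //  Returns 1 when a whole record was decoded, 0 when more data is needed.
    int decode (const unsigned char *data_, size_t size_, size_t &bytes_used_)
    {
        assert (data_ != NULL);
        bytes_used_ = 0;
        while (bytes_used_ < size_) {
            const size_t to_copy = std::min (to_read, size_ - bytes_used_);
            std::memcpy (read_pos, data_ + bytes_used_, to_copy);
            read_pos += to_copy;
            to_read -= to_copy;
            bytes_used_ += to_copy;
            //  The current state has all its bytes: run the next action,
            //  which re-arms read_pos and to_read through next_step.
            while (to_read == 0) {
                const int rc = (static_cast<T *> (this)->*next) ();
                if (rc != 0)
                    return rc;
            }
        }
        return 0;
    }

protected:
    typedef int (T::*step_t) ();

    void next_step (void *read_pos_, size_t to_read_, step_t next_)
    {
        read_pos = static_cast<unsigned char *> (read_pos_);
        to_read = to_read_;
        next = next_;
    }

private:
    step_t next;
    unsigned char *read_pos;
    size_t to_read;
};

//  Invented record format: 1 length byte, then that many payload bytes.
class toy_record_decoder_t : public toy_decoder_base_t<toy_record_decoder_t>
{
public:
    toy_record_decoder_t () : len (0)
    {
        next_step (&len, 1, &toy_record_decoder_t::len_ready);
    }

    const std::string &record () const { return body; }

private:
    int len_ready ()
    {
        //  The length is known: size the payload and read directly into it.
        body.assign (len, '\0');
        next_step (&body[0], len, &toy_record_decoder_t::body_ready);
        return 0;
    }

    int body_ready ()
    {
        //  Record complete: go back to the initial state and report it.
        next_step (&len, 1, &toy_record_decoder_t::len_ready);
        return 1;
    }

    unsigned char len;
    std::string body;
};
```

Feeding the bytes 3, 'a', 'b', 'c' in two separate decode calls still yields the single record "abc", because read_pos and to_read persist between calls.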
Now let's look at the implementation of v2_decoder_t:

    //  Decoder for ZMTP/2.x framing protocol. Converts data stream into messages.
    class v2_decoder_t : public decoder_base_t <v2_decoder_t>
    {
    public:

        v2_decoder_t (size_t bufsize_, int64_t maxmsgsize_);
        virtual ~v2_decoder_t ();

        //  i_decoder interface.
        virtual msg_t *msg () { return &in_progress; }

    private:

        int flags_ready ();
        int one_byte_size_ready ();
        int eight_byte_size_ready ();
        int message_ready ();

        unsigned char tmpbuf [8];
        unsigned char msg_flags;
        msg_t in_progress;

        const int64_t maxmsgsize;

        v2_decoder_t (const v2_decoder_t&);
        void operator = (const v2_decoder_t&);
    };

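v2_decoder_t has four state-machine methods, one for each of its four states, plus an 8-byte scratch buffer, tmpbuf. in_progress is the message the decoder is currently working on; every msg the decoder parses is stored there. maxmsgsize is the upper limit on message length. The transitions between the four states look like this: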

    zmq::v2_decoder_t::v2_decoder_t (size_t bufsize_, int64_t maxmsgsize_) :
        decoder_base_t <v2_decoder_t> (bufsize_),
        msg_flags (0),
        maxmsgsize (maxmsgsize_)
    {
        int rc = in_progress.init ();
        errno_assert (rc == 0);

        //  At the beginning, read one byte and go to flags_ready state.
        next_step (tmpbuf, 1, &v2_decoder_t::flags_ready);
    }

    zmq::v2_decoder_t::~v2_decoder_t ()
    {
        int rc = in_progress.close ();
        errno_assert (rc == 0);
    }

    int zmq::v2_decoder_t::flags_ready ()
    {
        msg_flags = 0;
        if (tmpbuf [0] & v2_protocol_t::more_flag)
            msg_flags |= msg_t::more;
        if (tmpbuf [0] & v2_protocol_t::command_flag)
            msg_flags |= msg_t::command;

        //  The payload length is either one or eight bytes,
        //  depending on whether the 'large' bit is set.
        if (tmpbuf [0] & v2_protocol_t::large_flag)
            next_step (tmpbuf, 8, &v2_decoder_t::eight_byte_size_ready);
        else
            next_step (tmpbuf, 1, &v2_decoder_t::one_byte_size_ready);

        return 0;
    }

    int zmq::v2_decoder_t::one_byte_size_ready ()
    {
        //  Message size must not exceed the maximum allowed size.
        if (maxmsgsize >= 0)
            if (unlikely (tmpbuf [0] > static_cast <uint64_t> (maxmsgsize))) {
                errno = EMSGSIZE;
                return -1;
            }

        //  in_progress is initialised at this point so in theory we should
        //  close it before calling zmq_msg_init_size, however, it's a 0-byte
        //  message and thus we can treat it as uninitialised...
        int rc = in_progress.init_size (tmpbuf [0]);
        if (unlikely (rc)) {
            errno_assert (errno == ENOMEM);
            rc = in_progress.init ();
            errno_assert (rc == 0);
            errno = ENOMEM;
            return -1;
        }

        in_progress.set_flags (msg_flags);
        next_step (in_progress.data (), in_progress.size (),
            &v2_decoder_t::message_ready);

        return 0;
    }

    int zmq::v2_decoder_t::eight_byte_size_ready ()
    {
        //  The payload size is encoded as 64-bit unsigned integer.
        //  The most significant byte comes first.
        const uint64_t msg_size = get_uint64 (tmpbuf);

        //  Message size must not exceed the maximum allowed size.
        if (maxmsgsize >= 0)
            if (unlikely (msg_size > static_cast <uint64_t> (maxmsgsize))) {
                errno = EMSGSIZE;
                return -1;
            }

        //  Message size must fit into size_t data type.
        if (unlikely (msg_size != static_cast <size_t> (msg_size))) {
            errno = EMSGSIZE;
            return -1;
        }

        //  in_progress is initialised at this point so in theory we should
        //  close it before calling init_size, however, it's a 0-byte
        //  message and thus we can treat it as uninitialised.
        int rc = in_progress.init_size (static_cast <size_t> (msg_size));
        if (unlikely (rc)) {
            errno_assert (errno == ENOMEM);
            rc = in_progress.init ();
            errno_assert (rc == 0);
            errno = ENOMEM;
            return -1;
        }

        in_progress.set_flags (msg_flags);
        next_step (in_progress.data (), in_progress.size (),
            &v2_decoder_t::message_ready);

        return 0;
    }

    int zmq::v2_decoder_t::message_ready ()
    {
        //  Message is completely read. Signal this to the caller
        //  and prepare to decode next message.
        next_step (tmpbuf, 1, &v2_decoder_t::flags_ready);
        return 1;
    }

The constructor calls

next_step (tmpbuf, 1, &v2_decoder_t::flags_ready)

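which means that the next step is to read one byte into tmpbuf, and that the next state action is flags_ready. flags_ready checks whether the frame is a long message. If it is, the next eight bytes hold the message length; if not, the next single byte holds it. This is the frame format defined by ZMTP.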

    if (tmpbuf [0] & v2_protocol_t::large_flag)
        next_step (tmpbuf, 8, &v2_decoder_t::eight_byte_size_ready);
    else
        next_step (tmpbuf, 1, &v2_decoder_t::one_byte_size_ready);
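To make the header layout concrete, here is a small standalone sketch that parses a frame header from a byte array in one go, rather than through the state machine. It is an illustration only: frame_header_t and parse_v2_header are hypothetical names, and the flag values (more = 1, large = 2, command = 4) are what I recall from libzmq's v2_protocol.hpp, so treat them as an assumption and check them against your libzmq version.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

//  Assumed flag bits, mirroring v2_protocol_t in libzmq.
enum { more_flag = 1, large_flag = 2, command_flag = 4 };

//  Hypothetical result type for this example.
struct frame_header_t
{
    bool more;         //  more frames follow in this message
    bool command;      //  frame is a command, not a message part
    uint64_t size;     //  payload length in bytes
    size_t header_len; //  2 for short frames, 9 for long frames
};

//  Returns false when fewer bytes than a complete header are available.
bool parse_v2_header (const unsigned char *data, size_t len,
                      frame_header_t &out)
{
    assert (data != NULL);
    if (len < 1)
        return false;

    const bool large = (data[0] & large_flag) != 0;
    const size_t need = large ? 9 : 2;
    if (len < need)
        return false;

    uint64_t size = 0;
    if (large) {
        //  Eight-byte length, most significant byte first.
        for (size_t i = 1; i <= 8; ++i)
            size = (size << 8) | data[i];
    }
    else
        size = data[1];

    out.more = (data[0] & more_flag) != 0;
    out.command = (data[0] & command_flag) != 0;
    out.size = size;
    out.header_len = need;
    return true;
}
```

The real decoder reaches the same result incrementally: flags_ready looks at the first byte, and one_byte_size_ready or eight_byte_size_ready consumes the length once it has arrived.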

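Taking a long message as an example: the decoder next reads the 8-byte length into tmpbuf and then enters the eight_byte_size_ready state. At that point the message length is known, so in_progress is initialized with that size, and the next step is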

next_step (in_progress.data (), in_progress.size (),&v2_decoder_t::message_ready)

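which reads exactly that many bytes into in_progress and sets the next state to message_ready. When message_ready is called, a complete msg has been decoded. message_ready resets the state machine to its initial state so that the next msg can be read. message_ready returns 1 to signal that a complete message has been read; the other states return 0 on success.
v2_decoder_t is mainly used in stream_engine's in_event method: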

    void zmq::stream_engine_t::in_event ()
    {
        zmq_assert (!io_error);

        //  If still handshaking, receive and process the greeting message.
        if (unlikely (handshaking))
            if (!handshake ())
                return;

        zmq_assert (decoder);

        //  If there has been an I/O error, stop polling.
        if (input_stopped) {
            rm_fd (handle);
            io_error = true;
            return;
        }

        //  If there's no data to process in the buffer...
        if (!insize) {

            //  Retrieve the buffer and read as much data as possible.
            //  Note that buffer can be arbitrarily large. However, we assume
            //  the underlying TCP layer has fixed buffer size and thus the
            //  number of bytes read will be always limited.
            size_t bufsize = 0;
            decoder->get_buffer (&inpos, &bufsize);

            const int rc = tcp_read (s, inpos, bufsize);
            if (rc == 0) {
                error (connection_error);
                return;
            }
            if (rc == -1) {
                if (errno != EAGAIN)
                    error (connection_error);
                return;
            }

            //  Adjust input size
            insize = static_cast <size_t> (rc);
        }

        int rc = 0;
        size_t processed = 0;

        while (insize > 0) {
            rc = decoder->decode (inpos, insize, processed);
            zmq_assert (processed <= insize);
            inpos += processed;
            insize -= processed;
            if (rc == 0 || rc == -1)
                break;
            rc = (this->*process_msg) (decoder->msg ());
            if (rc == -1)
                break;
        }

        //  Tear down the connection if we have failed to decode input data
        //  or the session has rejected the message.
        if (rc == -1) {
            if (errno != EAGAIN) {
                error (protocol_error);
                return;
            }
            input_stopped = true;
            reset_pollin (handle);
        }

        session->flush ();
    }

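If insize is 0, in_event calls get_buffer, which points inpos either at v2_decoder_t's internal buffer or directly at the data of in_progress (when the amount of data still to be read is at least as large as the decoder's buffer, 8192 bytes by default), and then calls tcp_read to read data. The while loop processes the data that was just read. Each time a complete message has been decoded, it is handed to process_msg. If the remaining data does not make up a full msg, the loop exits and waits for the next in_event call. On error, the engine stops polling for input.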
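The shape of that while loop can be tried out in isolation. The sketch below is not libzmq code: line_decoder_t is a hypothetical stand-in for the decoder that treats '\n' as the message terminator, and drain mirrors only the loop structure of in_event (decode, advance inpos, hand over the completed message, stop when data runs out).

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

//  Hypothetical stand-in for a decoder: messages end with '\n'.
struct line_decoder_t
{
    std::string in_progress; //  message currently being assembled
    std::string completed;   //  last fully decoded message

    //  Returns 1 when a message was completed, 0 when more data is needed.
    int decode (const unsigned char *data, size_t size, size_t &used)
    {
        used = 0;
        while (used < size) {
            const char c = static_cast<char> (data[used++]);
            if (c == '\n') {
                completed.swap (in_progress);
                in_progress.clear ();
                return 1;
            }
            in_progress.push_back (c);
        }
        return 0;
    }
};

//  Mirrors the loop in in_event: consume the buffer message by message.
//  Returns the number of unprocessed bytes (0 in this toy version).
size_t drain (line_decoder_t &decoder, const unsigned char *inpos,
              size_t insize, std::vector<std::string> &out)
{
    size_t processed = 0;
    while (insize > 0) {
        const int rc = decoder.decode (inpos, insize, processed);
        assert (processed <= insize);
        inpos += processed;
        insize -= processed;
        if (rc == 0)
            break; //  partial message: wait for the next read
        out.push_back (decoder.completed);
    }
    return insize;
}
```

A buffer holding "ab\ncd\nef" yields two messages and leaves "ef" inside the decoder; a later buffer "g\n" completes it as "efg". This is the same reason a msg can span several in_event calls.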

Encoder

In zmq, the v1 and v2 encoders both inherit from encoder_base_t, while raw_encoder inherits directly from i_encoder:

    template <typename T> class encoder_base_t : public i_encoder
    {
    public:

        inline encoder_base_t (size_t bufsize_) :
            bufsize (bufsize_),
            in_progress (NULL)
        {
            buf = (unsigned char*) malloc (bufsize_);
            alloc_assert (buf);
        }

        //  The destructor doesn't have to be virtual. It is made virtual
        //  just to keep ICC and code checking tools from complaining.
        inline virtual ~encoder_base_t ()
        {
            free (buf);
        }

        //  The function returns a batch of binary data. The data
        //  are filled to a supplied buffer. If no buffer is supplied (data_
        //  points to NULL) encoder object will provide buffer of its own.
        inline size_t encode (unsigned char **data_, size_t size_)
        {
            unsigned char *buffer = !*data_ ? buf : *data_;
            size_t buffersize = !*data_ ? bufsize : size_;

            if (in_progress == NULL)
                return 0;

            size_t pos = 0;
            while (pos < buffersize) {

                //  If there are no more data to return, run the state machine.
                //  If there are still no data, return what we already have
                //  in the buffer.
                if (!to_write) {
                    if (new_msg_flag) {
                        int rc = in_progress->close ();
                        errno_assert (rc == 0);
                        rc = in_progress->init ();
                        errno_assert (rc == 0);
                        in_progress = NULL;
                        break;
                    }
                    (static_cast <T*> (this)->*next) ();
                }

                //  If there are no data in the buffer yet and we are able to
                //  fill whole buffer in a single go, let's use zero-copy.
                //  There's no disadvantage to it as we cannot stuck multiple
                //  messages into the buffer anyway. Note that subsequent
                //  write(s) are non-blocking, thus each single write writes
                //  at most SO_SNDBUF bytes at once not depending on how large
                //  is the chunk returned from here.
                //  As a consequence, large messages being sent won't block
                //  other engines running in the same I/O thread for excessive
                //  amounts of time.
                if (!pos && !*data_ && to_write >= buffersize) {
                    *data_ = write_pos;
                    pos = to_write;
                    write_pos = NULL;
                    to_write = 0;
                    return pos;
                }

                //  Copy data to the buffer. If the buffer is full, return.
                size_t to_copy = std::min (to_write, buffersize - pos);
                memcpy (buffer + pos, write_pos, to_copy);
                pos += to_copy;
                write_pos += to_copy;
                to_write -= to_copy;
            }

            *data_ = buffer;
            return pos;
        }

        void load_msg (msg_t *msg_)
        {
            zmq_assert (in_progress == NULL);
            in_progress = msg_;
            (static_cast <T*> (this)->*next) ();
        }

    protected:

        //  Prototype of state machine action.
        typedef void (T::*step_t) ();

        //  This function should be called from derived class to write the data
        //  to the buffer and schedule next state machine action.
        inline void next_step (void *write_pos_, size_t to_write_,
            step_t next_, bool new_msg_flag_)
        {
            write_pos = (unsigned char*) write_pos_;
            to_write = to_write_;
            next = next_;
            new_msg_flag = new_msg_flag_;
        }

    private:

        //  Where to get the data to write from.
        unsigned char *write_pos;

        //  How much data to write before next step should be executed.
        size_t to_write;

        //  Next step. If set to NULL, it means that associated data stream
        //  is dead.
        step_t next;

        bool new_msg_flag;

        //  The buffer for encoded data.
        size_t bufsize;
        unsigned char *buf;

        encoder_base_t (const encoder_base_t&);
        void operator = (const encoder_base_t&);

    protected:

        msg_t *in_progress;
    };

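The logic of encoder_base_t is slightly more complex than that of decoder_base_t, but it is also implemented as a state machine. Its most important method is encode. Before analyzing encode, let's look at how encoder_base_t is used, which is mainly in stream_engine's out_event: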

    void zmq::stream_engine_t::out_event ()
    {
        zmq_assert (!io_error);

        //  If write buffer is empty, try to read new data from the encoder.
        if (!outsize) {

            //  Even when we stop polling as soon as there is no
            //  data to send, the poller may invoke out_event one
            //  more time due to 'speculative write' optimisation.
            if (unlikely (encoder == NULL)) {
                zmq_assert (handshaking);
                return;
            }

            outpos = NULL;
            outsize = encoder->encode (&outpos, 0);

            while (outsize < out_batch_size) {
                if ((this->*next_msg) (&tx_msg) == -1)
                    break;
                encoder->load_msg (&tx_msg);
                unsigned char *bufptr = outpos + outsize;
                size_t n = encoder->encode (&bufptr, out_batch_size - outsize);
                zmq_assert (n > 0);
                if (outpos == NULL)
                    outpos = bufptr;
                outsize += n;
            }

            //  If there is no data to send, stop polling for output.
            if (outsize == 0) {
                output_stopped = true;
                reset_pollout (handle);
                return;
            }
        }

        //  If there are any data to write in write buffer, write as much as
        //  possible to the socket. Note that amount of data to write can be
        //  arbitrarily large. However, we assume that underlying TCP layer has
        //  limited transmission buffer and thus the actual number of bytes
        //  written should be reasonably modest.
        const int nbytes = tcp_write (s, outpos, outsize);

        //  IO error has occurred. We stop waiting for output events.
        //  The engine is not terminated until we detect input error;
        //  this is necessary to prevent losing incoming messages.
        if (nbytes == -1) {
            reset_pollout (handle);
            return;
        }

        outpos += nbytes;
        outsize -= nbytes;

        //  If we are still handshaking and there are no data
        //  to send, stop polling for output.
        if (unlikely (handshaking))
            if (outsize == 0)
                reset_pollout (handle);
    }

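Each call to this method first checks whether outsize is 0. If it is, all previously encoded data has been sent. Inside the if statement, the engine first calls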

    outpos = NULL;
    outsize = encoder->encode (&outpos, 0);

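which continues encoding any msg that is still in progress and points outpos at the resulting data (the encoder's own buffer, or the msg data itself in the zero-copy case). If no msg is in progress, encode returns 0 and outpos stays NULL. The engine then repeatedly fetches the next msg to send through next_msg, calls the encoder's load_msg to hand the new msg to the encoder, and finally calls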

 size_t n = encoder->encode (&bufptr, out_batch_size - outsize);

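which writes the msg that was just loaded into the buffer. encode does not necessarily process the whole message; if there is not enough space, it encodes only part of it. Once the buffer is full or there is no new msg to write, tcp_write is called. This design lets a single tcp_write send several msgs, which reduces the number of system calls and improves efficiency. If a msg was not fully encoded, then the next time the if statement is entered,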

outsize = encoder->encode (&outpos, 0);

continues encoding the remaining data.
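The batching behaviour can be sketched without sockets or msg_t. The following is a simplified illustration, not libzmq code: toy_sender_t is a hypothetical type, messages are plain strings with no framing, and next_batch plays the role of the if block in out_event by packing as many queued messages as fit into one batch and carrying a partially written message over to the next call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

//  Hypothetical stand-in for the encoder plus its message source.
struct toy_sender_t
{
    std::deque<std::string> queue; //  pending messages (role of next_msg)
    std::string current;           //  message currently being written out
    size_t offset;                 //  bytes of current already emitted

    toy_sender_t () : offset (0) {}

    //  Builds one batch of at most batch_size bytes. Several messages may
    //  share a batch, and one message may be split across batches.
    std::string next_batch (size_t batch_size)
    {
        assert (batch_size > 0);
        std::string batch;
        while (batch.size () < batch_size) {
            if (offset == current.size ()) {
                //  Current message fully emitted: load the next one.
                if (queue.empty ())
                    break;
                current = queue.front ();
                queue.pop_front ();
                offset = 0;
                continue;
            }
            const size_t n = std::min (current.size () - offset,
                                       batch_size - batch.size ());
            batch.append (current, offset, n);
            offset += n;
        }
        return batch;
    }
};
```

With messages "hello", "world!" and "x" queued and a batch size of 8, the first batch is "hellowor" and the second is "ld!x", so three messages go out in two writes instead of three.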

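Having seen how stream_engine uses the encoder, let's return to the encode method. On each call it points buffer either at the encoder's own buffer or at the pointer passed in, and then writes data into it. It first checks whether to_write is 0 and, if so, runs the state machine. There is a copy-avoiding optimization here as well: when to_write is at least as large as the buffer, the *data_ passed in is NULL, and pos is 0 (nothing has been written into this batch yet, so data cannot get mixed up), the output pointer can be set to point directly at the data part of the msg. This does not introduce any thread-safety problem either.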
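The zero-copy decision boils down to one condition. Here is a minimal sketch under simplifying assumptions: chunk_t and emit are hypothetical names, the caller-supplied-buffer case is left out (the function always behaves as if *data_ were NULL), and only a single step of the encode loop is modelled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

//  Hypothetical description of what the caller should send.
struct chunk_t
{
    const unsigned char *data; //  where to send from
    size_t size;               //  number of bytes available at data
    bool zero_copy;            //  true if data points into the payload
};

//  One step of the encode loop: pos is the number of bytes already placed
//  in buf for this batch, to_write the number of payload bytes pending.
chunk_t emit (const unsigned char *payload, size_t to_write,
              unsigned char *buf, size_t bufsize, size_t pos)
{
    assert (pos <= bufsize);
    chunk_t c;
    if (pos == 0 && to_write >= bufsize) {
        //  Batch is empty and the payload alone would fill the buffer:
        //  hand out the payload itself and skip the memcpy.
        c.data = payload;
        c.size = to_write;
        c.zero_copy = true;
        return c;
    }
    //  Otherwise copy as much as fits behind what is already batched.
    const size_t to_copy = std::min (to_write, bufsize - pos);
    std::memcpy (buf + pos, payload, to_copy);
    c.data = buf;
    c.size = pos + to_copy;
    c.zero_copy = false;
    return c;
}
```

The pos == 0 requirement is what keeps ordering intact: if earlier bytes are already sitting in the buffer, they must be sent first, so the large payload is copied in behind them instead.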
Compared with v2_decoder, the state machine of v2_encoder is simpler, with only two states:

    zmq::v2_encoder_t::v2_encoder_t (size_t bufsize_) :
        encoder_base_t <v2_encoder_t> (bufsize_)
    {
        //  Write 0 bytes to the batch and go to message_ready state.
        next_step (NULL, 0, &v2_encoder_t::message_ready, true);
    }

    zmq::v2_encoder_t::~v2_encoder_t ()
    {
    }

    void zmq::v2_encoder_t::message_ready ()
    {
        //  Encode flags.
        unsigned char &protocol_flags = tmpbuf [0];
        protocol_flags = 0;
        if (in_progress->flags () & msg_t::more)
            protocol_flags |= v2_protocol_t::more_flag;
        if (in_progress->size () > 255)
            protocol_flags |= v2_protocol_t::large_flag;
        if (in_progress->flags () & msg_t::command)
            protocol_flags |= v2_protocol_t::command_flag;

        //  Encode the message length. For messages less than 256 bytes,
        //  the length is encoded as 8-bit unsigned integer. For larger
        //  messages, 64-bit unsigned integer in network byte order is used.
        const size_t size = in_progress->size ();
        if (unlikely (size > 255)) {
            put_uint64 (tmpbuf + 1, size);
            next_step (tmpbuf, 9, &v2_encoder_t::size_ready, false);
        }
        else {
            tmpbuf [1] = static_cast <uint8_t> (size);
            next_step (tmpbuf, 2, &v2_encoder_t::size_ready, false);
        }
    }

    void zmq::v2_encoder_t::size_ready ()
    {
        //  Write message body into the buffer.
        next_step (in_progress->data (), in_progress->size (),
            &v2_encoder_t::message_ready, true);
    }
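The header that message_ready builds in tmpbuf can be reproduced as a standalone function. This is a sketch for illustration: encode_v2_header is a hypothetical helper, and the flag values (more = 1, large = 2, command = 4) are assumed to match libzmq's v2_protocol.hpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

//  Assumed flag bits, mirroring v2_protocol_t in libzmq.
enum { more_flag = 1, large_flag = 2, command_flag = 4 };

//  Builds the 2-byte (short) or 9-byte (long) frame header for a payload
//  of the given size.
std::vector<unsigned char> encode_v2_header (uint64_t size, bool more,
                                             bool command)
{
    unsigned char flags = 0;
    if (more)
        flags |= more_flag;
    if (size > 255)
        flags |= large_flag;
    if (command)
        flags |= command_flag;

    std::vector<unsigned char> out;
    out.push_back (flags);
    if (size > 255) {
        //  64-bit length in network byte order (most significant first).
        for (int shift = 56; shift >= 0; shift -= 8)
            out.push_back (static_cast<unsigned char> ((size >> shift) & 0xff));
    }
    else
        out.push_back (static_cast<unsigned char> (size));

    assert (out.size () == (size > 255 ? 9u : 2u));
    return out;
}
```

A 5-byte payload with the more flag set produces the header bytes 0x01 0x05, while a 256-byte payload produces 0x02 followed by the eight length bytes 00 00 00 00 00 00 01 00. size_ready then schedules the payload itself as the next chunk to write.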

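That is how the v2 encoder and decoder work.
Besides the v1 and v2 codecs, zmq also provides a raw mode (raw_decoder/raw_encoder). It is fairly simple, so it is not analyzed here.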

1 0