ziplist in Redis

Source: Internet | Editor: 程序博客网 | Time: 2024/05/21 14:56

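ziplist is a compressed doubly linked list. In a conventional doubly linked list, every node carries a pointer to the next node and a pointer to the previous one, and those pointers take space. Such a list also stores node values as strings, which is wasteful for integer values. ziplist optimizes both of these.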

Let's walk through the source code:

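Layout: <zlbytes><zltail><zllen><entry><entry><zlend>

zlbytes: the total number of bytes the whole list occupies.

zltail: the byte offset of the last entry, which makes it easy to traverse the list from back to front.

zllen: the number of entries in the list.

entry: an entry (node).

zlend: the end-of-list marker.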
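To make the header layout concrete, here is a minimal sketch in C that writes and reads the three header fields of an empty list. The field widths (4-byte zlbytes, 4-byte zltail, 2-byte zllen), the 0xFF end marker, and host byte order are assumptions that the description above does not state, and the `zl_*` names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed widths (not stated in the text): 4-byte zlbytes, 4-byte zltail,
 * 2-byte zllen, and a single 0xFF byte as zlend. */
enum { ZL_HEADER_SIZE = 10, ZL_END = 0xFF };

/* Field readers; memcpy avoids unaligned access. Host byte order assumed. */
static uint32_t zl_bytes(const unsigned char *zl) {
    uint32_t v;
    memcpy(&v, zl, sizeof v);
    return v;
}

static uint32_t zl_tail(const unsigned char *zl) {
    uint32_t v;
    memcpy(&v, zl + 4, sizeof v);
    return v;
}

static uint16_t zl_len(const unsigned char *zl) {
    uint16_t v;
    memcpy(&v, zl + 8, sizeof v);
    return v;
}

/* Write an empty list into buf (capacity cap) and return its size in bytes. */
static size_t zl_init_empty(unsigned char *buf, size_t cap) {
    uint32_t bytes = ZL_HEADER_SIZE + 1; /* header plus the end marker */
    uint32_t tail = ZL_HEADER_SIZE;      /* no entries yet: tail points at zlend */
    uint16_t len = 0;

    assert(cap >= bytes);
    memcpy(buf, &bytes, sizeof bytes);
    memcpy(buf + 4, &tail, sizeof tail);
    memcpy(buf + 8, &len, sizeof len);
    buf[ZL_HEADER_SIZE] = ZL_END;
    return bytes;
}
```

Under these assumptions an empty list costs 11 bytes, and the last byte of the buffer can always be found at offset zlbytes - 1.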

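Each entry is laid out as <pre_node_len><node_encode><node>

pre_node_len is the number of bytes the previous entry occupies, which is what allows the list to be traversed from back to front.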

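node_encode holds the encoding and the length information of the data. The encodings are:

1. 00pppppp: the first two bits are 00, meaning a string shorter than 64 bytes; the remaining 6 bits hold the string length [0, 63].
2. 01pppppp|qqqqqqqq: the first two bits are 01, so the header takes two bytes; the remaining 14 bits hold the string length [64, 2^14 - 1].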
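The two string headers listed above can be sketched as a small encoder and decoder. This is only an illustration: it covers just these two encodings, it assumes the high 6 bits of a 14-bit length go in the first byte, and `str_header_encode` / `str_header_decode` are hypothetical names, not Redis functions.

```c
#include <assert.h>
#include <stddef.h>

/* Encode a string length as 00pppppp or 01pppppp|qqqqqqqq.
 * When p is NULL only the header size is computed.
 * Returns the header size in bytes, or 0 if len needs an encoding
 * that this sketch does not cover. */
static size_t str_header_encode(unsigned char *p, size_t len) {
    if (len <= 0x3f) {
        if (p != NULL) p[0] = (unsigned char)len;          /* 00pppppp */
        return 1;
    }
    if (len <= 0x3fff) {
        if (p != NULL) {
            p[0] = (unsigned char)(0x40 | (len >> 8));     /* 01pppppp */
            p[1] = (unsigned char)(len & 0xff);            /* qqqqqqqq */
        }
        return 2;
    }
    return 0;
}

/* Decode a header written by str_header_encode.
 * Returns the header size in bytes, or 0 for any other encoding. */
static size_t str_header_decode(const unsigned char *p, size_t *len) {
    assert(p != NULL && len != NULL);
    switch (p[0] & 0xc0) {
    case 0x00:
        *len = p[0] & 0x3f;
        return 1;
    case 0x40:
        *len = ((size_t)(p[0] & 0x3f) << 8) | p[1];
        return 2;
    default:
        return 0;
    }
}
```

For example, a 300-byte string (0x12C) would get the two header bytes 0x41 0x2C under this scheme.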

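With that in place, a simple example: a string shorter than 64 bytes, whose previous entry is shorter than 254 bytes, needs 1 + 1 = 2 bytes of extra (non-content) information.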
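The 254 threshold in this example comes from how pre_node_len is stored. Below is a minimal sketch, assuming that lengths below 254 take one byte and that larger lengths take a marker byte of 254 followed by a 4-byte length in host byte order. The function names are hypothetical; the listing further down uses `zipPrevEncodeLength` for this job.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: lengths below 254 fit in one byte; anything larger is stored
 * as the marker byte 254 followed by a 4-byte length in host byte order. */
enum { PREVLEN_BIG = 254 };

/* Encode len at p (or only measure when p is NULL); returns bytes used. */
static size_t prevlen_encode(unsigned char *p, uint32_t len) {
    if (len < PREVLEN_BIG) {
        if (p != NULL) p[0] = (unsigned char)len;
        return 1;
    }
    if (p != NULL) {
        p[0] = PREVLEN_BIG;
        memcpy(p + 1, &len, sizeof len);
    }
    return 1 + sizeof len;
}

/* Decode the field at p into *len; returns bytes consumed. */
static size_t prevlen_decode(const unsigned char *p, uint32_t *len) {
    assert(p != NULL && len != NULL);
    if (p[0] < PREVLEN_BIG) {
        *len = p[0];
        return 1;
    }
    memcpy(len, p + 1, sizeof *len);
    return 1 + sizeof *len;
}

/* Per-entry overhead: encoding header plus the previous-length field. */
static size_t entry_overhead(size_t enc_header_size, uint32_t prev_entry_len) {
    return enc_header_size + prevlen_encode(NULL, prev_entry_len);
}
```

With these helpers, `entry_overhead(1, 10)` reproduces the 1 + 1 = 2 bytes from the example, and stepping backward from an entry at `p` is just `p - len` once `prevlen_decode` has filled in `len`.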

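The table below lists the overhead of each encoding in bytes (encoding header + pre_node_len field):

    Encoding   Previous entry < 254 bytes   Previous entry >= 254 bytes
    1          1 + 1 = 2                    1 + 5 = 6
    2          2 + 1 = 3                    2 + 5 = 7
    3          5 + 1 = 6                    5 + 5 = 10
    4          1 + 1 = 2                    1 + 5 = 6
    5          1 + 1 = 2                    1 + 5 = 6
    6          1 + 1 = 2                    1 + 5 = 6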

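As the table shows, in most cases this is less than the 8 bytes a linked list spends on pointers (4 bytes for the prev pointer and 4 bytes for the next pointer, with 32-bit pointers).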

This comes at a cost, though: the implementation is complex, and insertion and deletion in particular have to move memory around.
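The cost of a contiguous layout is easiest to see on a plain byte buffer. The sketch below is not ziplist code; `buf_insert` is a hypothetical helper that shows the grow, shift, and copy sequence any insert in the middle of a contiguous buffer has to perform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Insert n bytes from src at offset off of a heap buffer of *len bytes.
 * Returns the (possibly moved) buffer and updates *len, or returns NULL
 * on allocation failure, in which case the original buffer is untouched. */
static unsigned char *buf_insert(unsigned char *buf, size_t *len, size_t off,
                                 const unsigned char *src, size_t n) {
    unsigned char *nb;

    assert(len != NULL && off <= *len);
    nb = realloc(buf, *len + n);
    if (nb == NULL) return NULL;

    /* Shift the tail to open a gap; memmove handles the overlap. */
    memmove(nb + off + n, nb + off, *len - off);
    memcpy(nb + off, src, n);
    *len += n;
    return nb;
}
```

Every byte after the insertion point is copied once, so the cost of an insert grows with the amount of data that follows it. Note also that realloc may move the buffer, which is why the insertion code saves an offset rather than a pointer before resizing.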

Here is the insertion path:

    /* Insert item at "p". */
    static unsigned char *__ziplistInsert(unsigned char *zl, unsigned char *p, unsigned char *s, unsigned int slen) {
        size_t curlen = ZIPLIST_BYTES(zl), reqlen, prevlen = 0;
        size_t offset;
        int nextdiff = 0;
        unsigned char encoding = 0;
        long long value;
        zlentry entry, tail;

        // Work out the length of the entry that will precede the new one:
        // 1) when not inserting at the end, read it from the entry currently at p;
        // 2) when inserting at the end, compute the length of the current tail entry.
        /* Find out prevlen for the entry that is inserted. */
        if (p[0] != ZIP_END) {
            entry = zipEntry(p);
            prevlen = entry.prevrawlen;
        } else {
            unsigned char *ptail = ZIPLIST_ENTRY_TAIL(zl);
            if (ptail[0] != ZIP_END) {
                prevlen = zipRawEntryLength(ptail);
            }
        }

        // Compute the space the payload needs.
        /* See if the entry can be encoded */
        if (zipTryEncoding(s,slen,&value,&encoding)) {
            /* 'encoding' is set to the appropriate integer encoding */
            reqlen = zipIntSize(encoding);
        } else {
            /* 'encoding' is untouched, however zipEncodeLength will use the
             * string length to figure out how to encode it. */
            reqlen = slen;
        }

        // Compute the space the whole entry needs (pre_len, slen, content).
        /* We need space for both the length of the previous entry and
         * the length of the payload. */
        reqlen += zipPrevEncodeLength(NULL,prevlen);
        reqlen += zipEncodeLength(NULL,encoding,slen);

        // When not inserting at the end, check whether the next entry has
        // enough room to record the length of the new entry.
        /* When the insert position is not equal to the tail, we need to
         * make sure that the next entry can hold this entry's length in
         * its prevlen field. */
        nextdiff = (p[0] != ZIP_END) ? zipPrevLenByteDiff(p,reqlen) : 0;

        /* Store offset because a realloc may change the address of zl. */
        offset = p-zl;
        zl = ziplistResize(zl,curlen+reqlen+nextdiff);
        p = zl+offset;

        /* Apply memory move when necessary and update tail offset. */
        if (p[0] != ZIP_END) {
            /* Subtract one because of the ZIP_END bytes */
            // (p-nextdiff) --> (p+reqlen)
            memmove(p+reqlen,p-nextdiff,curlen-offset-1+nextdiff);

            /* Encode this entry's raw length in the next entry. */
            zipPrevEncodeLength(p+reqlen,reqlen);

            // If the next entry is the tail, the tail offset moves by reqlen;
            // otherwise nextdiff has to be added as well.
            /* Update offset for tail */
            ZIPLIST_TAIL_OFFSET(zl) += reqlen;

            /* When the tail contains more than one entry, we need to take
             * "nextdiff" in account as well. Otherwise, a change in the
             * size of prevlen doesn't have an effect on the *tail* offset. */
            tail = zipEntry(p+reqlen);
            if (p[reqlen+tail.headersize+tail.len] != ZIP_END)
                ZIPLIST_TAIL_OFFSET(zl) += nextdiff;
        } else {
            /* This element will be the new tail. */
            ZIPLIST_TAIL_OFFSET(zl) = p-zl;
        }

        // If the next entry changed size, the matching field in the entry after
        // it must be updated, which may change that entry's size too, so keep
        // updating forward until an entry's size stays the same.
        /* When nextdiff != 0, the raw length of the next entry has changed, so
         * we need to cascade the update throughout the ziplist */
        if (nextdiff != 0) {
            offset = p-zl;
            zl = __ziplistCascadeUpdate(zl,p+reqlen);
            p = zl+offset;
        }

        /* Write the entry */
        // Write the fields of the new entry.
        p += zipPrevEncodeLength(p,prevlen);
        p += zipEncodeLength(p,encoding,slen);
        if (ZIP_IS_STR(encoding)) {
            memcpy(p,s,slen);
        } else {
            zipSaveInteger(p,value,encoding);
        }
        ZIPLIST_INCR_LENGTH(zl,1);
        return zl;
    }

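A few points to note:

1. Because the list lives in contiguous memory, inserting in the middle means shifting all the entries after the insertion point backward.

2. The next entry stores the length of the inserted entry, and that length is recorded with a variable-size encoding, so an insert can change how much space the next entry occupies. If it does, the next entry has to be adjusted, which can in turn affect the entry after it, so the change has to be propagated forward.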
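The second point comes down to a small piece of arithmetic. The sketch below is a rough guess at what `zipPrevLenByteDiff` computes in the listing, under the assumption that the previous-length field is either 1 or 5 bytes wide; `prevlen_size` and `prevlen_size_diff` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: previous-entry lengths below 254 take 1 byte, others take 5. */
enum { PREVLEN_BIG = 254 };

/* Bytes needed to record a previous-entry length. */
static size_t prevlen_size(size_t len) {
    return len < PREVLEN_BIG ? 1 : 5;
}

/* Signed change in the next entry's size if its previous-length field,
 * currently cur_field_size bytes wide, had to record new_len instead. */
static int prevlen_size_diff(size_t cur_field_size, size_t new_len) {
    assert(cur_field_size == 1 || cur_field_size == 5);
    return (int)prevlen_size(new_len) - (int)cur_field_size;
}
```

A result of +4 means the next entry must grow, 0 means nothing changes, and -4 means the field is wider than it needs to be.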

Deletion is the reverse of insertion and has to deal with the same two points. The second one is handled by a dedicated function:

    static unsigned char *__ziplistCascadeUpdate(unsigned char *zl, unsigned char *p) {
        size_t curlen = ZIPLIST_BYTES(zl), rawlen, rawlensize;
        size_t offset, noffset, extra;
        unsigned char *np;
        zlentry cur, next;

        // As long as entries keep changing size, keep going until the end.
        while (p[0] != ZIP_END) {
            cur = zipEntry(p);
            rawlen = cur.headersize + cur.len;
            rawlensize = zipPrevEncodeLength(NULL,rawlen);

            /* Abort if there is no next entry. */
            if (p[rawlen] == ZIP_END) break;
            next = zipEntry(p+rawlen);

            /* Abort when "prevlen" has not changed. */
            if (next.prevrawlen == rawlen) break;

            // The field needs more bytes than it currently has.
            if (next.prevrawlensize < rawlensize) {
                /* The "prevlen" field of "next" needs more bytes to hold
                 * the raw length of "cur". */
                offset = p-zl;
                extra = rawlensize-next.prevrawlensize;
                zl = ziplistResize(zl,curlen+extra);
                p = zl+offset;

                /* Current pointer and offset for next element. */
                np = p+rawlen;
                noffset = np-zl;

                /* Update tail offset when next element is not the tail element. */
                if ((zl+ZIPLIST_TAIL_OFFSET(zl)) != np)
                    ZIPLIST_TAIL_OFFSET(zl) += extra;

                /* Move the tail to the back. */
                memmove(np+rawlensize,
                    np+next.prevrawlensize,
                    curlen-noffset-next.prevrawlensize-1);
                zipPrevEncodeLength(np,rawlen);

                /* Advance the cursor */
                p += rawlen;
                curlen += extra;
            } else {
                if (next.prevrawlensize > rawlensize) {
                    /* This would result in shrinking, which we want to avoid.
                     * So, set "rawlen" in the available bytes. */
                    // Waste 4 bytes to avoid moving memory.
                    zipPrevEncodeLengthForceLarge(p+rawlen,rawlen);
                } else {
                    zipPrevEncodeLength(p+rawlen,rawlen);
                }

                /* Stop here, as the raw length of "next" has not changed. */
                break;
            }
        }
        return zl;
    }

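In short: loop forward from the current position. If an entry has to grow, move the data by the amount of growth and fix up the offsets. If it would shrink, the code avoids the move with a trick: it stores the small value in the large field, wasting 4 bytes.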
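The cascade can be modelled without any byte buffers by tracking only entry sizes. The simulation below is a simplified sketch, not the Redis routine: `sizes[i]` is the total size of entry i, `fields[i]` is the width (1 or 5) of its previous-length field, and all names are hypothetical. It assumes the same 254-byte threshold as the overhead table.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: previous-entry lengths below 254 take 1 byte, others take 5. */
static size_t prevlen_size(size_t len) {
    return len < 254 ? 1 : 5;
}

/* Starting at entry `start`, grow the previous-length field of each
 * following entry until one is already wide enough.
 * Returns how many entries had to grow. */
static size_t cascade_update(size_t *sizes, size_t *fields, size_t n,
                             size_t start) {
    size_t grown = 0;

    assert(sizes != NULL && fields != NULL);
    for (size_t i = start; i + 1 < n; i++) {
        size_t need = prevlen_size(sizes[i]);

        /* Wide enough already (possibly wider than needed): stop here.
         * A too-wide field is kept as is, so nothing ever shrinks. */
        if (fields[i + 1] >= need) break;

        sizes[i + 1] += need - fields[i + 1];
        fields[i + 1] = need;
        grown++;
    }
    return grown;
}
```

With entry sizes {300, 253, 253, 100} and all fields 1 byte wide, the 300-byte entry pushes the second entry to 257 bytes, which pushes the third to 257 bytes, which widens the fourth to 104 bytes. Entries sitting just under the threshold are what turn a single insert into a chain of moves.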