深入理解linux内核list_head的实现

来源：互联网发布：h3c cas云计算管理平台编辑：程序博客网时间：2024/05/21 22:39

转自：http://www.54xue.com/w/04/n-29304.html

注：文章很长，比喻恰当比较好理解。感谢原创作者。---》蓝色为本人增加。

前言：在linux源代码中有个头文件为list.h.很多linux下的源代码都会使用这个头文件，它里面定义了一个结构，以及定义了和其相关的一组函数，这个结构是这样的：

    struct list_head{

    struct list_head *next, *prev;

    };

    那么这个头文件又是有什么样的作用呢，这篇文章就是用来解释它的作用，虽然这是linux下的源代码，但对于学习C语言的人来说，这是算法和平台没有什么关系。
    采用这种定义有以下两种好处：1是可扩展性，2是封装。内核一直都是在发展中的，所以代码都不能写成死代码，要方便修改和追加。将链表常见的操作都进行封装，使用者只关注接口，不需关注实现。
    一、双向链表

    学习计算机的人都会开一门课程《数据结构》，里面都会有讲解双向链表的内容。

    什么是双向链表，它看起来是这样的：

    struct dlist

    {

    int no;

    void* data;

    struct dlist *prev, *next;

    };

他的图形结构图：

现在有几个结构体，它们是：

表示人的：

    struct person

    {

    int age;

    int weight;

    };

    表示动物的：

    struct animal

    {

    int age;

    int weight;

    };

    如果有一组filename变量和filedata变量，把它们存起来，我们会怎么做，当然就用数组了，但我们想使用双向链表，让它们链接起来，那该怎么做，唯一可以做的就是给每个结构加如两个成员，如下：

    表示人的：

    struct person

    {

    int age;

    int weight;

    struct person *next, *prev;

    };

    表示动物的：

    struct animal

    {

    int age;

    int weight;

    struct animal *next, *prev;

    };

    现在有一个人的一个链表的链头指针person_head （循环双向链表）和动物的链表的链头指针inimal_head ,我们要获得特定年龄和特定体重的人或动物（假设不考虑重叠），那么代码看起来可能是这样：

    struct person * get_percent（int age, int weight）

    {

    ……

    struct person *p;

    for（p = person_head->next; p != person_head; p=p->next）

    {

    if（p->age == age && p->weight == weight）

    return p;

    }

    ……

    }

    那同理，要获得一个特定年龄和重量的动物的函数get_animal（int age, int weight）的代码也是和上面的类似。我们再回过头来看这两个结构，它们的指向前和指向后的指针其实都差不多，那把它们综合起来吧，所以看起来如下面：

    struct list_head{

    struct list_head *next, *prev;

    };

    表示人的：

    struct person{

    int age;

    int weight;

    struct list_head list;

    };

    动物的：

    struct animal

    {

    int age;

    int weight;

    struct list_head list;

    };

    可能又会有些人会问了，struct list_head都不是struct persion和struct animal类型，怎么可以做链表的指针呢？其实，无论是什么样的指针，它的大小都是一样的，32位的系统中，指针的大小都是32位（即4个字节），只是不同类型的指针在解释的时候不一样而已，那么这个struct list_head又是怎么去做这些结构的链表指针呢，那么就请看下一节吧：）。
      二、struct list_head结构的操作

    首先，让我们来看下和struct list_head有关的两个宏，它们定义在list.h文件中。

    #define LIST_HEAD_INIT（name） { &（name）， &（name） }

    #define LIST_HEAD（name） struct list_head name = LIST_HEAD_INIT（name）

    #define INIT_LIST_HEAD（ptr） do { \

    （ptr）->next = （ptr）；（ptr）->prev = （ptr）； \

    } while （0）

    这两个宏是用了定义双向链表的头节点的，定义一个双向链表的头节点，我们可以这样：

    struct list_head head;

    LIST_HEAD_INIT（head）；

    又或者直接这样：

    LIST_HEAD（head）；

    这样，我们就定义并初始化了一个头节点。

    #define LIST_HEAD_INIT（name

    { &（name），

    &（name）

    }

    就是用head的地址初始化其两个成员next和prev ,使其都指向自己。我们再看下和其相关的几个函数，这些函数都作为内联函数也都定义list.h中，这里要说明一下linux源的一个风格，在下面的这些函数中以下划线开始的函数是给内部调用的函数，而以符开始的函数就是对外使用的函数，这些函数一般都是调用以下划线开始的函数，或是说是对下划线开始的函数的封装。

    2.1 增加节点的函数

    static inline void __list_add（）；

    static inline void list_add（）；

    static inline void list_add_tail（）；

    其实看源代码是最好的讲解了，这里我再简单的讲一下。

    /**

    * __list_add - Insert a new entry between two known consecutive entries.

    * @new:

    * @prev:

    * @next:

    *

    * This is only for internal list manipulation where we know the prev/next

    * entries

    */

    static __inline__ void __list_add（struct list_head * new,

    struct list_head * prev, struct list_head * next）

    {

    next->prev = new;

    new->next = next;

    new->prev = prev;

    prev->next = new;

    }

    //这个函数在prev和next间插入一个节点new.

    /**

    * list_add - add a new entry

    * @new: new entry to be added

    * @head: list head to add it after

    *

    * Insert a new entry after the specified head.

    * This is good for implementing stacks.

    */

    static __inline__ void list_add（struct list_head *new, struct list_head *head）

    {

    __list_add（new, head, head->next）；

    }

    //这个函数在head节点后面插入new节点。

    /**

    * list_add_tail - add a new entry

    * @new: new entry to be added

    * @head: list head to add it before

    *

    * Insert a new entry before the specified head.

    * This is useful for implementing queues.

    */

    static __inline__ void list_add_tail（struct list_head *new, struct list_head *head）

    {

    __list_add（new, head->prev, head）；

    }

    这个函数和上面的那个函数相反，它在head节点的前面插入new节点。

list_add()函数中的__list_add(new, head, head->next)表示的在head和head->next之间加入一个新的节点。即表头插入法（即先插入的后输出，可以用来实现一个栈）。此处是表头后的第一个位置。
list_add_tail()中的__list_add(new, head->prev, head)表示在head->prev（双向循环链表的最后一个结点）和head之间添加一个新的结点。即表尾插入法（先插入的先输出，可以用来实现一个队列）。此处为表头前的第一个位置，由于是双向链表，故而可知为表尾。----》可能大家这里会没想明白。

    2.2 从链表中删除节点的函数

    /**

    * __list_del -

    * @prev:

    * @next:

    *

    * Delete a list entry by making the prev/next entries point to each other.

    *

    * This is only for internal list manipulation where we know the prev/next

    * entries

    */

    static __inline__ void __list_del（struct list_head * prev,

    struct list_head * next）

    {

    next->prev = prev;

    prev->next = next;

    }

    /**

    * list_del - deletes entry from list.

    * @entry: the element to delete from the list.

    * * Note: list_empty on entry does not return true after this, the entry is in

    * an undefined state.

    */

    static __inline__ void list_del（struct list_head *entry）

    {

    __list_del（entry->prev, entry->next）；

    }

    /**

    * list_del_init - deletes entry from list and reinitialize it.

    * @entry: the element to delete from the list.

    */

    static __inline__ void list_del_init（struct list_head *entry）

    {

    __list_del（entry->prev, entry->next）；

    INIT_LIST_HEAD（entry）；

    }

    这里简单说一下，list_del（struct list_head *entry）是从链表中删除entry节点。list_del_init（struct list_head *entry）不但从链表中删除节点，还把这个节点的向前向后指针都指向自己，即初始化。

    那么，我们怎么判断这个链表是不是空的呢！上面我说了，这里的双向链表都是有一个头节点，而我们上面看到，定义一个头节点时我们就初始化了，即它的prev和next指针都指向自己。所以这个函数是这样的。

    /**

    * list_empty - tests whether a list is empty

    * @head: the list to test.

    */

    static __inline__ int list_empty（struct list_head *head）

    {

    return head->next == head;

    }

    讲了这几个函数后，这又到了关键了，下面讲解的一个宏的定义就是对第一节中，我们所要说的为什么在一个结构中加入struct list_head变量就把这个结构变成了双向链表呢，这其中的关键就是怎么通过这个struct list_head变量来获取整个结构的变量，下面这个宏就为你解开答案：

    /**

    * list_entry - get the struct for this entry

    * @ptr: the &struct list_head pointer.

    * @type: the type of the struct this is embedded in.

    * @member: the name of the list_struct within the struct.

    */

    #define list_entry（ptr, type, member） \

    （（type *）（（char *）（ptr）-（unsigned long）（&（（type *）0）->member）））

    乍一看下，不知道这个宏在说什么，没关系，我举个例子来为你一一解答 :）
注：此处的实现与container_of 很像，看完下文还没理解的话，请看我的转载《container_of 理解》
    首先，我们还是用上面的结构：

    struct person

    {

    int age;

    int weight;

    struct list_head list;

    };

    我们一看到这样的结构就应该知道它定义了一个双向链表，下面来看下。

    我们有一个指针：

    struct list_head *pos;

    现在有这个指针，我们怎么去获得这个指针所在的结构的变量（即是struct person变量，其实是structperson指针）呢？看下面这样使用：

    struct person *one = list_entry（pos, struct person, list）；

    不明白是吧，展开一下 list_entry结构如下：

    （（struct person *

    （（char *）

    （pos） -

    （unsigned long）

    （&（（struct person *）0）->list）

    ）

    ）

    我慢慢的分解，首先分成两部分（char *）（pos）减去（unsigned long）（&（（struct person *）0）->list）然后转成（struct person *）类型的指针。（char *）（pos）：是将pos由struct list_head*转成char* ,这个好理解。（unsigned long）（&（（struct person *）0）->list）：先看最里面的（struct person *）0），它是把0地址转成struct person指针，然后（struct person *）0）->list就是指向list变量，之后是&（（struct person *）0）->list是取这个变量的地址，最后是（unsigned long）（&（（struct person*）0）->list）把这个变量的地址值变成一个整形数！

    这么复杂啊，其实说白了，这个（unsigned long）（&（（struct person *）0）->list）的意思就是取list变量在struct person结构中的偏移量。用个图形来说（unsigned long）（&（（struct person *）0）->list,如下：

    而（unsigned long）

    （&（（struct person *）0）->]list就是获取这个offset的值。

    （（char *）（pos） - （unsigned long）（&（（struct person *）0）->list））

    就是将pos指针往前移动offset位置，即是本来pos是struct list_head类型，它即是list.即是把pos指针往struct person结构的头地址位置移动过去，如上图的pos和虚箭头。

    当pos移到struct person结构头后就转成（struct person *）指针，这样就可以得到struct person*变量了。

    所以我们再回到前面的句子

    struct person *one = list_entry（pos, struct person, list）；

    //就是由pos得到pos所在的结构的指针，动物就可以这样：

    struct animal *one = list_entry（pos, struct animal, list）；

    下面我们再来看下和struct list_head相关的最后一个宏。

    2.3 list_head 的遍历的宏

    /**

    * list_for_each - iterate over a list

    * @pos: the &struct list_head to use as a loop counter.

    * @head: the head for your list.

    */

    #define list_for_each（pos, head） \

    for （pos = （head）->next; pos != （head）； pos = pos->next）

    /**

    * list_for_each_safe - iterate over a list safe against removal of list entry

    * @pos: the &struct list_head to use as a loop counter.

    * @n: another &struct list_head to use as temporary storage

    * @head: the head for your list.

    */

    #define list_for_each_safe（pos, n, head） \

    for （pos = （head）->next, n = pos->next; pos != （head）； \

    pos = n, n = pos->next）

    list_for_each（pos, head）是遍历整个head链表中的每个元素，每个元素都用pos指向。

    list_for_each_safe（pos, n, head）是用于删除链表head中的元素，不是上面有删除链表元素的函数了吗，为什么这里又要定义一个这样的宏呢。看下这个宏后面有个safe字，就是说用这个宏来删除是安全的，直接用前面的那些删除函数是不安全的。这个怎么说呢，我们看下下面这个图，有三个元素a ,b ,c.

    list_for_each（pos, myhead）

    {

    if （pos == b）

    {

    list_del_init（pos）；

    //break;

    }

    …

    }

    上面的算法是不安全的，因为当我们删除b后，如下图这样：

    上删除pos即b后，list_for_each要移到下一个元素，还需要用pos来取得下一个元素，但pos的指向已经改变，如果不直接退出而是在继续操作的话，就会出错了。而 list_for_each_safe就不一样了，如果上面的代码改成这样：

    struct list_head *pos, *n;

    list_for_each_safe（pos, n, myhead）

    {

    if （pos == b）

    {

    list_del_init（pos）；

    //break;

    }

    …

    }

    这里我们使用了n作为一个临时的指针，当pos被删除后，还可以用n来获得下一个元素的位置。

    说了那么多关于list_head的东西，下面应该总结一下，总结一下第一节想要解决的问题。

    三、总例

    我用一个程序来说明在struct person中增加了struct list_head变量后怎么来操作这样的双向链表。

    #include <stdio.h>

    #include "list.h"

    struct person

    {

    int age;

    int weight;

    struct list_head list;

    };

    int main（int argc, char* argv[]）

    {

    struct person *tmp;

    struct list_head *pos, *n;

    int age_i, weight_j;

    // 定义并初始化一个链表头

    struct person person_head;

    INIT_LIST_HEAD（&person_head.list）；

    for（age_i = 10, weight_j = 35; age_i < 40; age_i += 5, weight_j += 5）

    {

    tmp =（struct person*）malloc（sizeof（struct person））；

    tmp->age = age_i;

    tmp->weight = weight_j;

    // 把这个节点链接到链表后面

    // 这里因为每次的节点都是加在person_head的后面，所以先加进来的节点就在链表里的最

    后面

    // 打印的时候看到的顺序就是先加进来的就在最后面打印

    list_add（&（tmp->list）， &（person_head.list））；

    }

    // 下面把这个链表中各个节点的值打印出来

    printf（"\n"）；

    printf（"=========== print the list ===============\n"）；

    list_for_each（pos, &person_head.list）

    {

    // 这里我们用list_entry来取得pos所在的结构的指针

    tmp = list_entry（pos, struct person, list）；

    printf（"age:%d, weight: %d \n", tmp->age, tmp->weight）；

    }

    printf（"\n"）；

    // 下面删除一个节点中，age为20的节点

    printf（"========== print list after delete a node which age is 20

    ==========\n"）；

    list_for_each_safe（pos, n, &person_head.list）

    {

    tmp = list_entry（pos, struct person, list）；

    if（tmp->age == 20）

    {

    list_del_init（pos）；

    free（tmp）；

    }

    }

    list_for_each（pos, &person_head.list）

    {

    tmp = list_entry（pos, struct person, list）；

    printf（"age:%d, weight: %d \n", tmp->age, tmp->weight）；

    }

    // 释放资源

    list_for_each_safe（pos, n, &person_head.list）

    {

    tmp = list_entry（pos, struct person, list）；

    list_del_init（pos）；

    free（tmp）；

    }

    return 0;

    }

最后：此篇文章基本上是关于双向链表，所有最好有一定数据结构基础再看，会比较好，加深对数据结构知识运用。