Linux 2.6 Kernel: A Discussion of Block I/O Layer Operations

Source: Internet. Editor: 程序博客网. Time: 2024/05/20 20:01

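When a block is brought into memory (that is, after a read or while awaiting a write), it is stored in a buffer. Each buffer corresponds to exactly one block and serves as the in-memory representation of that disk block. A block consists of one or more sectors but cannot be larger than a page, so a single page can hold one or more blocks. Because the kernel needs some control information to go with the data (such as which device the block belongs to and which buffer the block corresponds to), every buffer has an associated descriptor. The descriptor is represented by the buffer_head structure, called the buffer head, which is defined in <linux/buffer_head.h> and holds all the information the kernel needs to manipulate the buffer.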
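The size rules above can be sketched in a few lines of user-space C. This is only an illustration: the 512-byte sector and 4096-byte page are assumed values, the demo_* names are hypothetical, and the power-of-two check is an extra constraint the kernel imposes that the paragraph above does not mention.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for illustration; real hardware and kernels may differ. */
#define DEMO_SECTOR_SIZE 512u
#define DEMO_PAGE_BYTES  4096u

/* A block must be a whole number of sectors, a power of two,
 * and no larger than one page. */
bool demo_block_size_is_valid(size_t block_size)
{
    return block_size >= DEMO_SECTOR_SIZE &&
           block_size <= DEMO_PAGE_BYTES &&
           (block_size & (block_size - 1)) == 0;
}

/* Number of sectors that make up one block. */
size_t demo_sectors_per_block(size_t block_size)
{
    assert(demo_block_size_is_valid(block_size));
    return block_size / DEMO_SECTOR_SIZE;
}

/* Number of blocks (and therefore buffers) that fit in one page. */
size_t demo_blocks_per_page(size_t block_size)
{
    assert(demo_block_size_is_valid(block_size));
    return DEMO_PAGE_BYTES / block_size;
}
```

With these assumed sizes, a 1024-byte block spans two sectors, and one page holds four such blocks.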

The buffer head structure is shown below, with a comment describing each field:

struct buffer_head {
    unsigned long b_state;              /* buffer state bitmap (BH_* bits) */
    struct buffer_head *b_this_page;    /* circular list of page's buffers */
    struct page *b_page;                /* the page this bh is mapped to */

    sector_t b_blocknr;                 /* start block number */
    size_t b_size;                      /* size of mapping */
    char *b_data;                       /* pointer to data within the page */

    struct block_device *b_bdev;        /* associated block device */
    bh_end_io_t *b_end_io;              /* I/O completion */
    void *b_private;                    /* reserved for b_end_io */
    struct list_head b_assoc_buffers;   /* associated with another mapping */
    struct address_space *b_assoc_map;  /* mapping this buffer is associated with */
    atomic_t b_count;                   /* users using this buffer_head */
};

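The purpose of the buffer head is to describe the mapping between an on-disk block and the buffer in physical memory. Within the kernel the structure acts purely as a descriptor that records the buffer-to-block mapping.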
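To make the mapping concrete, here is a minimal user-space sketch, not kernel code. The demo_* types and functions are hypothetical stand-ins that keep only a few of the buffer_head fields, and the sketch assumes that the buffers on one page map consecutive disk blocks, which the real kernel does not require.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE   4096u
#define DEMO_MAX_BUFFERS 8u    /* 4096 / 512: most buffers one page can hold */

/* Stand-in for struct page: just the page's bytes. */
struct demo_page {
    char data[DEMO_PAGE_SIZE];
};

/* Reduced model of buffer_head with only the mapping fields. */
struct demo_buffer_head {
    struct demo_buffer_head *b_this_page; /* circular list of the page's buffers */
    struct demo_page *b_page;             /* page this buffer lives in */
    unsigned long b_blocknr;              /* disk block number */
    size_t b_size;                        /* block size in bytes */
    char *b_data;                         /* start of the buffer inside the page */
};

/* Split one page into block-sized buffers and fill in a descriptor for each.
 * Assumption: blocks are consecutive on disk, starting at first_block.
 * Returns the number of descriptors initialized. */
size_t demo_map_page(struct demo_page *page, struct demo_buffer_head *bhs,
                     size_t block_size, unsigned long first_block)
{
    assert(block_size > 0 && block_size <= DEMO_PAGE_SIZE);
    assert(DEMO_PAGE_SIZE % block_size == 0);

    size_t n = DEMO_PAGE_SIZE / block_size;
    assert(n <= DEMO_MAX_BUFFERS);

    for (size_t i = 0; i < n; i++) {
        bhs[i].b_page = page;
        bhs[i].b_blocknr = first_block + i;
        bhs[i].b_size = block_size;
        bhs[i].b_data = page->data + i * block_size;
        bhs[i].b_this_page = &bhs[(i + 1) % n]; /* last one links back to first */
    }
    return n;
}

/* Walk the circular b_this_page list and add up the bytes it describes. */
size_t demo_bytes_on_page(const struct demo_buffer_head *head)
{
    size_t total = 0;
    const struct demo_buffer_head *bh = head;

    do {
        total += bh->b_size;
        bh = bh->b_this_page;
    } while (bh != head);
    return total;
}
```

Each descriptor carries no data of its own: b_data points into the page, and b_blocknr names the disk block that region represents.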

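Before the 2.6 kernel, the buffer head played a much more important role. It was the unit of I/O in the kernel: it not only described the mapping from a disk block to physical memory but also served as the container for all block I/O operations. The 2.6 kernel changed this strategy and introduced a new structure, bio, as the container for I/O operations.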

The bio structure is defined in <linux/bio.h>. It is shown below with a description of each field:

struct bio {
    sector_t bi_sector;                 /* associated sector on disk */
    struct bio *bi_next;                /* list of requests */
    struct block_device *bi_bdev;       /* associated block device */
    unsigned long bi_flags;             /* status and command flags */
    unsigned long bi_rw;                /* read or write? */
    unsigned short bi_vcnt;             /* number of bio_vecs in bi_io_vec */
    unsigned short bi_idx;              /* current index in bi_io_vec */
    unsigned short bi_phys_segments;    /* number of segments after coalescing */
    unsigned short bi_hw_segments;      /* number of segments after remapping */
    unsigned int bi_size;               /* residual I/O count, in bytes */
    unsigned int bi_hw_front_size;      /* size of the first mergeable segment */
    unsigned int bi_hw_back_size;       /* size of the last mergeable segment */
    unsigned int bi_max_vecs;           /* maximum bio_vecs possible */
    struct bio_vec *bi_io_vec;          /* bio_vec list */
    bio_end_io_t *bi_end_io;            /* I/O completion method */
    atomic_t bi_cnt;                    /* usage counter */
    void *bi_private;                   /* owner-private data */
    bio_destructor_t *bi_destructor;    /* destructor method */
};

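The main purpose of the bio structure is to represent an in-flight (active) I/O operation, so most of its fields hold bookkeeping information for that operation. The most important fields are bi_io_vec, bi_vcnt, and bi_idx. The figure below shows the relationship between the bio structure and the other structures.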

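Note: every block I/O request is represented by a bio structure. Each request consists of one or more blocks, which are stored as segments in an array of bio_vec structures. These structures describe where each segment actually lives within a physical page, and they are organized like a vector. The first segment of the I/O operation is pointed to by bi_io_vec, the remaining segments follow it in order, and there are bi_vcnt segments in total. As the block I/O layer carries out the request and works through the segments, the bi_idx field is updated so that it always points to the current segment.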
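The interplay of bi_io_vec, bi_vcnt, and bi_idx can be modeled in user space as follows. This is a simplified sketch, not the kernel's implementation: the demo_* names are hypothetical, the bio_vec fields (bv_page, bv_len, bv_offset) follow the kernel of that era but are not listed in this article, and a plain pointer stands in for struct page.

```c
#include <assert.h>
#include <stddef.h>

/* One segment: a run of bytes inside a single page. */
struct demo_bio_vec {
    const void *bv_page;    /* stand-in for struct page * */
    unsigned int bv_len;    /* length of the segment in bytes */
    unsigned int bv_offset; /* offset of the segment within the page */
};

/* Reduced model of struct bio with only the segment bookkeeping. */
struct demo_bio {
    unsigned short bi_vcnt;         /* number of segments in bi_io_vec */
    unsigned short bi_idx;          /* index of the current segment */
    unsigned int bi_size;           /* bytes still to be transferred */
    struct demo_bio_vec *bi_io_vec; /* array of segments */
};

/* Attach a segment array and compute the total transfer size. */
void demo_bio_init(struct demo_bio *bio, struct demo_bio_vec *vecs,
                   unsigned short count)
{
    bio->bi_io_vec = vecs;
    bio->bi_vcnt = count;
    bio->bi_idx = 0;
    bio->bi_size = 0;
    for (unsigned short i = 0; i < count; i++)
        bio->bi_size += vecs[i].bv_len;
}

/* Mark the current segment as done: shrink bi_size and advance bi_idx.
 * Returns 1 while segments remain and 0 once the whole bio is finished. */
int demo_bio_advance(struct demo_bio *bio)
{
    assert(bio->bi_idx < bio->bi_vcnt);
    bio->bi_size -= bio->bi_io_vec[bio->bi_idx].bv_len;
    bio->bi_idx++;
    return bio->bi_idx < bio->bi_vcnt;
}
```

A caller would initialize the bio once and then call demo_bio_advance() after each segment is transferred; the segments may come from different pages and start at arbitrary offsets within them.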

Comparing the old and new approaches:

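The buffer head and the new bio structure differ significantly. The bio structure represents an I/O operation, which may span one or more pages in memory. The buffer_head structure, by contrast, represents a single buffer and describes only a single block on disk, so it can cause unnecessary splitting: a request gets divided into block-sized pieces that can only be reassembled later. The bio structure is lightweight, the blocks it describes do not need to be contiguous in memory, and it does not require the I/O operation to be split.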
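A rough count shows the cost of the splitting. The helpers below are hypothetical and rest on simplifying assumptions: the I/O starts on a page boundary, the old approach needs one buffer head per block, and the new approach uses a single bio with at most one segment per page.

```c
#include <assert.h>
#include <stddef.h>

/* Round-up integer division. */
static size_t demo_div_round_up(size_t n, size_t d)
{
    assert(d > 0);
    return (n + d - 1) / d;
}

/* Old approach: one buffer_head per block touched by the I/O. */
size_t demo_buffer_heads_needed(size_t io_bytes, size_t block_size)
{
    return demo_div_round_up(io_bytes, block_size);
}

/* New approach: one bio whose segments each cover up to a whole page.
 * Assumes the I/O is page-aligned. */
size_t demo_bio_segments_needed(size_t io_bytes, size_t page_size)
{
    return demo_div_round_up(io_bytes, page_size);
}
```

Under these assumptions, a 16 KB transfer with 1 KB blocks is described by 16 separate buffer heads in the old scheme, while a single bio with 4 page-sized segments covers the same transfer.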