[Learning xv6: Lab 3] User Environments


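Today is January 20, 2016, less than ten days before I go home for the New Year. I need to finish this lab quickly, and it is a big job! (So much for that: it is now February 21 and I am still writing...)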

Note: in some places this post uses "trap" to mean exception, especially where it appears alongside "interrupt".

Part A: User Environments and Exception Handling

First, the following three variables:

kern/env.c

Next we need a fairly detailed understanding of struct Env:

inc/env.h

Allocating the Environments Array

This part simply mirrors what was done for pages, as follows:

kern/pmap.c

Creating and Running Environments

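Keep in mind that JOS has no file system so far, so how do we run a user program on it? By linking it directly into the kernel image at build time. A later lab (lab 5) has us build the file system ourselves.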

First, the env_init() function:

Next, the env_setup_vm() function:

Points to note:

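1) The code converts the address with page2kva(), which assumes the page p lies in the 0 to 256 MB range of physical memory (recall that mem_init() maps the first 256 MB of physical memory at virtual addresses 0xf0000000 to 0xffffffff).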

2) It copies PGSIZE bytes starting at virtual address kern_pgdir into the new page directory e->env_pgdir.
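To make the two notes concrete, here is a minimal stand-alone sketch, not the lab's code. The constants follow the values I believe JOS uses (KERNBASE, UVPT, PTE_P, PTE_U), while pa2kva() and setup_pgdir() are hypothetical helper names that model the direct-map conversion and the copy of the kernel page directory on plain arrays.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PGSIZE     4096u
#define KERNBASE   0xF0000000u  /* start of the kernel's direct map */
#define UVPT       0xEF400000u  /* user read-only view of the page tables */
#define PTE_P      0x001u       /* present */
#define PTE_U      0x004u       /* user-accessible */
#define PDX(va)    (((uint32_t)(va)) >> 22)  /* page directory index */

/* Physical -> kernel virtual; only valid for the low 256 MB that is mapped. */
uint32_t pa2kva(uint32_t pa)
{
    assert(pa < 0x10000000u);
    return pa + KERNBASE;
}

/* Start a new page directory as a copy of the kernel's, then point the
 * UVPT slot at the new directory itself (assumed layout, 1024 entries). */
void setup_pgdir(uint32_t *new_pgdir, const uint32_t *kern_pgdir,
                 uint32_t new_pgdir_pa)
{
    memcpy(new_pgdir, kern_pgdir, PGSIZE);
    new_pgdir[PDX(UVPT)] = new_pgdir_pa | PTE_P | PTE_U;
}
```

Calling setup_pgdir(dst, kern, pa) on two 1024-entry arrays leaves every kernel entry identical in dst and changes only the UVPT slot.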

Then the region_alloc() function:

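This is how I first wrote it, but looking back there is a problem. Suppose va is not page-aligned and len is exactly one page. Then ROUNDUP(len, PGSIZE) = PGSIZE bytes are allocated, starting from ROUNDDOWN(va, PGSIZE), yet the memory is actually used starting at va, so the physical memory allocated is not enough to cover the whole range. We should therefore follow the approach described in the comment above, as follows: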
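The arithmetic behind that bug can be checked in isolation. The sketch below is only an illustration: pages_naive() and pages_needed() are hypothetical names, and the rounding macros are written in portable C rather than copied from JOS.

```c
#include <assert.h>
#include <stdint.h>

#define PGSIZE 4096u
#define ROUNDDOWN(a, n) ((a) - (a) % (n))
#define ROUNDUP(a, n)   ROUNDDOWN((a) + (n) - 1, (n))

/* First attempt: round only the length. */
uint32_t pages_naive(uint32_t va, uint32_t len)
{
    (void)va;  /* the starting offset is ignored, which is the bug */
    return ROUNDUP(len, PGSIZE) / PGSIZE;
}

/* Round the start down and the end up, then count the pages in between. */
uint32_t pages_needed(uint32_t va, uint32_t len)
{
    uint32_t start = ROUNDDOWN(va, PGSIZE);
    uint32_t end = ROUNDUP(va + len, PGSIZE);
    assert(end >= start);  /* va + len must not wrap around */
    return (end - start) / PGSIZE;
}
```

For va = 0x00800800 and len = 0x1000, the first version asks for one page while the range actually touches two.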

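Note the distinction from a static mapping here. A static mapping is set up with boot_map_region() rather than page_insert(), and it does not allocate any physical pages: boot_map_region() operates on the kernel's virtual address space. page_alloc(), by contrast, allocates real physical pages, which are then mapped into the current user environment's virtual address space. (Looking at the code, the difference seems small: mainly, page_insert() increments pp_ref for the newly mapped page, whereas boot_map_region() does not.)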

Next is load_icode():

This code is less familiar because it involves ELF files. I mostly copied it, and there is still a lot I do not understand.

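JOS still has no file system at this point, so to test that we can run user programs, the current approach compiles each user program and links it together with the kernel (that is, the user program is placed right after the kernel). This function's job is therefore to take the user program embedded in the kernel and unpack it into the user virtual addresses chosen by the linker. The binary pointer is the virtual address at which the user program starts inside the kernel.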

As the comment suggests, we can follow boot/main.c to do the loading, but a few points need attention:

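1. For each program header ph in the user program's ELF file, ph->p_memsz and ph->p_filesz are different things: the former is the size the segment should occupy in memory, the latter is the size it actually occupies in the file. The difference comes from the uninitialized static variables in the ELF file's BSS section: they take up no space in the file, but once loaded they need memory, all initialized to 0. Concretely, each segment occupies p_memsz bytes of memory; the first p_filesz bytes are copied from the corresponding part of binary and the rest is zeroed.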

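2. ph->p_va is the virtual address at which the segment should be placed, but note that this address belongs to the user environment's address space. load_icode() is entered in kernel mode, so the active address space is still the kernel's. We therefore use lcr3(PADDR(e->env_pgdir)) to load the user environment's page directory; e->env_pgdir was set up in env_setup_vm(). Be careful, though: once the ELF image is loaded we no longer need to touch user space, so the function must switch back to the kernel address space at the end.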

3. The comment also mentions setting the program's entry point. The corresponding operation is e->env_tf.tf_eip = elfhdr->e_entry. This involves struct Trapframe, which is covered in detail with the next function, env_create().
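Point 1 is the part that is easiest to get wrong, so here is a small sketch of just the copy-then-zero step. It is an illustration, not the lab's load_icode(): load_segment() is a hypothetical helper working on ordinary buffers, with filesz and memsz standing in for p_filesz and p_memsz.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy the bytes that exist in the file image, then zero the remainder of
 * the in-memory segment (the part that corresponds to .bss). */
void load_segment(uint8_t *dst, const uint8_t *src, size_t filesz, size_t memsz)
{
    assert(filesz <= memsz);
    memcpy(dst, src, filesz);
    memset(dst + filesz, 0, memsz - filesz);
}
```

With filesz = 4 and memsz = 6, the first four destination bytes come from the image, the next two become 0, and anything beyond memsz is left untouched.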

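Pointer arithmetic also needs care. Adding one to a void * pointer increases its value by one (this is a GCC extension; standard C does not define arithmetic on void *), and the same is true of uint8_t *. So the version above gives the same result as the one below, but the one below is the right way to write it because it is genuine pointer arithmetic, whereas in the version above ph->p_va is an integer type when the sum is computed.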

Alternatively, we can write it this way:

I struggled with this part for a long time. Be very careful with pointer arithmetic!
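A quick way to convince yourself of the scaling rule is to compare byte-wise and element-wise arithmetic on the same buffer. This is a generic C sketch with hypothetical helper names, unrelated to the actual load_icode() code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte-wise: the result is exactly off bytes past base. */
const uint8_t *advance_bytes(const void *base, size_t off)
{
    assert(base != NULL);
    return (const uint8_t *)base + off;
}

/* Element-wise: the result is off * sizeof(uint32_t) bytes past base. */
const uint32_t *advance_words(const uint32_t *base, size_t off)
{
    assert(base != NULL);
    return base + off;
}
```

Casting to uint8_t * before adding an offset, as in advance_bytes(), is the portable way to express "move by this many bytes", which is what an ELF offset means.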

Next is the env_create() function:

This one is relatively simple. Note that env_alloc() has already run env_setup_vm().

Finally, the env_run() function:

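env_pop_tf here performs the actual switch to the environment. It takes the trapframe that was previously set up and saved for this environment and pops it off the stack into the registers, replacing the processor state. That includes eip, so when the iret inside env_pop_tf returns, execution continues in the environment described by the structure e.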

After finishing the code above and running make, the kernel hits a triple fault. Don't panic, this is expected...

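The MIT course material explains why: we have not set up the interrupt table yet, so the user program faults when it makes the system call that prints characters. We still want to confirm by hand that the failure really comes from the interrupt and not from some other setup mistake, so we start GDB and stop at env_pop_tf():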

Wow, I learned a new trick today. A breakpoint can be set like this: b function-name

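Very handy. Here are the two ways of setting a breakpoint (my fault for not reading the instructions carefully in the first place):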

From here we single-step and stop just before the IRET instruction, then inspect the registers to check that they are all set up:

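EAX, ECX and the other general registers have been cleared to 0, and DS and ES hold 0x23, consistent with the setup in env_alloc(). CS and EIP are not visible yet before IRET executes, but that is fine: the next three DWORDs on top of the stack are EIP, CS and EFLAGS, so we look at those three DWORDs: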

EIP is 0x00800020, the user program's entry address. We can open user/user.ld to check:

It matches, which shows that the entry address was loaded correctly. Next we check whether the user program's ELF file was loaded correctly:

The assembly of the actual user program hello can be found in obj/user/hello.asm:

They agree, so our load_icode() loads the program correctly.

We locate, within the user program, the interrupt instruction of the sys_cputs() function mentioned in the MIT material:

The interrupt is invoked at address 0x800bdd. We set a breakpoint there to see whether JOS gets that far:

JOS reaches the breakpoint. After executing one more instruction EIP does not change, and the QEMU output shows that a triple fault has occurred:

So up to this point our JOS is behaving as expected.

OK, time to fill in more background...

I read through it once and still feel dizzy. Let's keep reading for now.

Basics of Protected Control Transfer

Exceptions and interrupts are both "protected control transfers," which cause the processor to switch from user to kernel mode (CPL=0) without giving the user-mode code any opportunity to interfere with the functioning of the kernel or other environments. In Intel's terminology, an interrupt is a protected control transfer that is caused by an asynchronous event usually external to the processor, such as notification of external device I/O activity. An exception, in contrast, is a protected control transfer caused synchronously by the currently running code, for example due to a divide by zero or an invalid memory access.

In order to ensure that these protected control transfers are actually protected, the processor's interrupt/exception mechanism is designed so that the code currently running when the interrupt or exception occurs does not get to choose arbitrarily where the kernel is entered or how. Instead, the processor ensures that the kernel can be entered only under carefully controlled conditions. On the x86, two mechanisms work together to provide this protection:

1. The Interrupt Descriptor Table. The processor ensures that interrupts and exceptions can only cause the kernel to be entered at a few specific, well-defined entry-points determined by the kernel itself, and not by the code running when the interrupt or exception is taken.

The x86 allows up to 256 different interrupt or exception entry points into the kernel, each with a different interrupt vector. A vector is a number between 0 and 255. An interrupt's vector is determined by the source of the interrupt: different devices, error conditions, and application requests to the kernel generate interrupts with different vectors. The CPU uses the vector as an index into the processor's interrupt descriptor table (IDT), which the kernel sets up in kernel-private memory, much like the GDT. From the appropriate entry in this table the processor loads:

the value to load into the instruction pointer (EIP) register, pointing to the kernel code designated to handle that type of exception.
the value to load into the code segment (CS) register, which includes in bits 0-1 the privilege level at which the exception handler is to run. (In JOS, all exceptions are handled in kernel mode, privilege level 0.)

2. The Task State Segment. The processor needs a place to save the old processor state before the interrupt or exception occurred, such as the original values of EIP and CS before the processor invoked the exception handler, so that the exception handler can later restore that old state and resume the interrupted code from where it left off. But this save area for the old processor state must in turn be protected from unprivileged user-mode code; otherwise buggy or malicious user code could compromise the kernel.

For this reason, when an x86 processor takes an interrupt or trap that causes a privilege level change from user to kernel mode, it also switches to a stack in the kernel's memory. A structure called the task state segment (TSS) specifies the segment selector and address where this stack lives. The processor pushes (on this new stack) SS, ESP, EFLAGS, CS, EIP, and an optional error code. Then it loads the CS and EIP from the interrupt descriptor, and sets the ESP and SS to refer to the new stack.

The TSS definition can be found in inc/mmu.h:

The corresponding structure diagram is as follows:

Although the TSS is large and can potentially serve a variety of purposes, JOS only uses it to define the kernel stack that the processor should switch to when it transfers from user to kernel mode. Since "kernel mode" in JOS is privilege level 0 on the x86, the processor uses the ESP0 and SS0 fields of the TSS to define the kernel stack when entering kernel mode. JOS doesn't use any other TSS fields.

Types of Exceptions and Interrupts

All of the synchronous exceptions that the x86 processor can generate internally use interrupt vectors between 0 and 31, and therefore map to IDT entries 0-31. For example, a page fault always causes an exception through vector 14. Interrupt vectors greater than 31 are only used by software interrupts, which can be generated by the int instruction, or asynchronous hardware interrupts, caused by external devices when they need attention.

Then work through the exception example carefully until it makes sense.

What is typically pushed (on the kernel stack) when an interrupt or exception occurs in user mode:

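In some cases an error code is pushed as well (as we will see later, the error code slot is always present: when the processor does not push one, a 0 is pushed in its place), as follows: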
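Since the stack pictures did not survive here, the same layout can be written down as a struct, lowest address first. This is a sketch based on the push order quoted above from the lab text (SS, ESP, EFLAGS, CS, EIP, optional error code); struct hw_frame and frame_base() are hypothetical names, not JOS definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* What sits on the kernel stack after a trap from user mode, in ascending
 * address order (the last field pushed is at the lowest address). */
struct hw_frame {
    uint32_t err;     /* error code, or the 0 pushed in its place */
    uint32_t eip;
    uint32_t cs;      /* only the low 16 bits are meaningful */
    uint32_t eflags;
    uint32_t esp;     /* pushed only on a user -> kernel stack switch */
    uint32_t ss;      /* pushed only on a user -> kernel stack switch */
};

/* Address of the lowest word when the frame ends exactly at stack_top. */
uint32_t frame_base(uint32_t stack_top)
{
    assert(stack_top >= sizeof(struct hw_frame));
    return stack_top - (uint32_t)sizeof(struct hw_frame);
}
```

With a stack top of 0xF0000000 the error code lands at stack top minus 24 and the old EIP at stack top minus 20.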

Nested Exceptions and Interrupts

The processor can take exceptions and interrupts both from kernel and user mode. It is only when entering the kernel from user mode, however, that the x86 processor automatically switches stacks before pushing its old register state onto the stack and invoking the appropriate exception handler through the IDT. If the processor is already in kernel mode when the interrupt or exception occurs (the low 2 bits of the CS register are already zero), then the CPU just pushes more values on the same kernel stack. In this way, the kernel can gracefully handle nested exceptions caused by code within the kernel itself. This capability is an important tool in implementing protection, as we will see later in the section on system calls.

If the processor is already in kernel mode and takes a nested exception, since it does not need to switch stacks, it does not save the old SS or ESP registers. For exception types that do not push an error code, the kernel stack therefore looks like the following on entry to the exception handler:

There is one important caveat to the processor's nested exception capability. If the processor takes an exception while already in kernel mode, and cannot push its old state onto the kernel stack for any reason such as lack of stack space, then there is nothing the processor can do to recover, so it simply resets itself. Needless to say, the kernel should be designed so that this can't happen.

Setting Up the IDT

The header files inc/trap.h and kern/trap.h contain important definitions related to interrupts and exceptions that you will need to become familiar with. The file kern/trap.h contains definitions that are strictly private to the kernel, while inc/trap.h contains definitions that may also be useful to user-level programs and libraries.

First, the IDT data structures:

kern/trap.c

Here idt is the IDT itself, and idt_pd is the structure that corresponds to the IDTR system register.

The two structures are defined as follows:

inc/mmu.h

Note the bit-field syntax (the colons) used here.
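For readers who have not used bit-fields before, here is a stand-alone sketch of a 32-bit gate descriptor. The field widths follow the x86 gate format, but struct gate, set_gate() and gate_offset() are hypothetical names written for this illustration rather than copied from inc/mmu.h, and the 8-byte size relies on how GCC and Clang pack bit-fields.

```c
#include <assert.h>
#include <stdint.h>

#define STS_IG32 0xE  /* 32-bit interrupt gate */
#define STS_TG32 0xF  /* 32-bit trap gate */

struct gate {
    unsigned off_15_0  : 16;  /* low 16 bits of the handler offset */
    unsigned sel       : 16;  /* code segment selector */
    unsigned args      : 5;   /* unused for interrupt/trap gates */
    unsigned rsv1      : 3;   /* reserved, should be zero */
    unsigned type      : 4;   /* STS_IG32 or STS_TG32 */
    unsigned s         : 1;   /* 0 = system descriptor */
    unsigned dpl       : 2;   /* privilege needed to use `int n` on it */
    unsigned p         : 1;   /* present */
    unsigned off_31_16 : 16;  /* high 16 bits of the handler offset */
};

void set_gate(struct gate *g, int istrap, uint16_t sel, uint32_t off, unsigned dpl)
{
    assert(dpl <= 3);
    g->off_15_0 = off & 0xFFFF;
    g->sel = sel;
    g->args = 0;
    g->rsv1 = 0;
    g->type = istrap ? STS_TG32 : STS_IG32;
    g->s = 0;
    g->dpl = dpl;
    g->p = 1;
    g->off_31_16 = off >> 16;
}

/* Reassemble the 32-bit handler address from its two halves. */
uint32_t gate_offset(const struct gate *g)
{
    return ((uint32_t)g->off_31_16 << 16) | g->off_15_0;
}
```

The point of the split offset is visible in gate_offset(): the handler address is stored in two 16-bit halves at opposite ends of the descriptor.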

One macro defined here will come in handy:

OK, this exercise breaks down into two steps:
1. Define a handler for each interrupt in kern/trapentry.S
2. In trap_init() in kern/trap.c, install the handlers defined in step 1 into the IDT

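We start by defining the handlers. According to the MIT material, every interrupt handler must lay out a Trapframe on the kernel stack and pass it to trap() for further processing; trap_dispatch() then dispatches to the specific handling code.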

kern/trapentry.S gives us two handy macros:

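Each takes a function name and the interrupt vector number it handles, and defines a handler with that name. Such a handler pushes the error code (where needed) and the vector number onto the stack, then jumps to _alltraps to run the common part.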

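_alltraps is the code shared by all trap handlers. Before trap() is called, a struct Trapframe * must be pushed onto the stack, so the registers are pushed in the reverse of their order in Trapframe. A function addresses structure members as base address plus offset, with offsets growing in declaration order, while the stack grows from high addresses to low ones, so pushing in reverse order produces exactly the Trapframe layout the function expects. (All of this pushing happens on the kernel stack.)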

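This raises an important issue: the error code. For interrupts raised while the system is running, the processor pushes an error code after switching stacks, depending on the interrupt type. Vector 8 (Double Fault) gets one, for example, while vector 0 (Divide Error) does not. Note in particular that when the user invokes an interrupt by hand with the int instruction, the processor does not push an error code (obviously, nobody sets out to invoke an interrupt by mistake). This detail matters later.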

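So when the processor pushes no error code, our handler has to fill the slot itself (with 0). The TRAPHANDLER_NOEC macro does exactly that; just follow the comments and the material. For which interrupts carry an error code, see the figure in section 5.4.2 of Chapter 05 of Prof. Shao's lecture notes (Huazhong University of Science and Technology), shown below. When pushing, pick the instruction that matches the data size; beyond that there is nothing special to watch for.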
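In case the figure is not available, the same information fits in a few lines of C. This sketch lists the vectors for which the x86 processors targeted by JOS push an error code; pushes_error_code() is a hypothetical helper for illustration, useful when deciding between TRAPHANDLER and TRAPHANDLER_NOEC.

```c
#include <assert.h>
#include <stdbool.h>

/* True for the exception vectors where the processor itself pushes an
 * error code: #DF, #TS, #NP, #SS, #GP, #PF and #AC. */
bool pushes_error_code(int vector)
{
    assert(vector >= 0 && vector < 256);
    switch (vector) {
    case 8:   /* double fault */
    case 10:  /* invalid TSS */
    case 11:  /* segment not present */
    case 12:  /* stack fault */
    case 13:  /* general protection */
    case 14:  /* page fault */
    case 17:  /* alignment check */
        return true;
    default:
        return false;
    }
}
```

Every vector for which this returns false needs the padding 0, including the system call vector, because software interrupts never get an error code.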

The code that generates the individual handlers is as follows:

kern/trapentry.S

Next we fill in trap_init(), which initializes the handling of the interrupt vectors:

We use the SETGATE(gate, istrap, sel, off, dpl) macro for this, paying particular attention to the istrap and dpl parameters. (istrap = 1 marks an exception, that is a trap gate; istrap = 0 marks an interrupt gate.)

About sel:

We set sel to the kernel code segment, GD_KT.

About istrap:

We know that:

The NMI and the exceptions recognized by the processor are assigned predetermined identifiers in the range 0 through 31. Not all of these numbers are currently used by the 80386; unassigned identifiers in this range are reserved by Intel for possible future expansion. The identifiers of the maskable interrupts are determined by external interrupt controllers and communicated to the processor during the processor's interrupt-acknowledge sequence.

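That is, vectors 0 to 31 are the NMI and the exceptions, though not every number in that range is used; some are reserved. From the comments in JOS we can tell that 9, 15 and 20 to 31 are reserved. Everything from 32 up is an interrupt.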

The vector numbers JOS defines are as follows:

inc/trap.h

From the listing above we can see in more detail:

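"Group 1": predefined by the processor, vectors 0 to 19

"Group 2": defines T_SYSCALL (system call) and T_DEFAULT
This post counts both of these groups as software interrupts (they are raised synchronously by the running code); what follows are interrupts triggered by hardware
"Group 3": the hardware interrupts, numbered IRQ_OFFSET + IRQ number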

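About dpl: we set it to 3 for T_BRKPT and T_SYSCALL and to 0 for everything else. Following Zhang Chi's write-up: take divide-by-zero as an example. You certainly do not want users making a pointless call like int 0, so vector 0 should only be raised at level 0 by the kernel side (thrown by the processor at run time). Debugging is the exception: users should be able to invoke it themselves. The same reasoning covers T_SYSCALL, since user programs issue int $0x30 directly and the gate must therefore be usable from ring 3.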
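The dpl rule can be summarised as a tiny model. This is a simplified sketch of what happens for a software int n instruction only (hardware-raised exceptions skip this check); delivered_vector() is a hypothetical name and T_GPFLT is vector 13.

```c
#include <assert.h>

#define T_GPFLT 13  /* general protection fault */

/* Which vector the kernel actually sees when code running at privilege
 * level cpl executes `int vector` through a gate with the given dpl. */
int delivered_vector(int vector, int gate_dpl, int cpl)
{
    assert(gate_dpl >= 0 && gate_dpl <= 3);
    assert(cpl >= 0 && cpl <= 3);
    /* The gate may be used only if the caller is at least as privileged
     * as the gate's dpl (numerically cpl <= dpl). */
    return (cpl <= gate_dpl) ? vector : T_GPFLT;
}
```

So int $3 from user mode arrives as vector 3 when the gate's dpl is 3, while int $14 through a dpl 0 gate arrives as a general protection fault instead.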

The block of function declarations at the top follows the comments in kern/trapentry.S.

The finished trap_init() function is as follows:

kern/trap.c

At this point the interrupt mechanism is in place. According to the code, when a divide error is caught, control goes to routine_divide() in kern/trapentry.S, jumps to _alltraps, and then reaches trap() in kern/trap.c:

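As the code shows, execution ends up in trap_dispatch(), which prints the register state. Let's try running a user program with a divide-by-zero bug by making user_divzero the first program loaded in kern/init.c: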

Then build and start QEMU; the fault is handled correctly:

Now let's test by running the grading script with make grade:

(N lines omitted here)

All the Part A points are there, so Part A is completely done. Hooray!

Collecting my thoughts, the overall process is:

The TRAPHANDLER and TRAPHANDLER_NOEC macros define the individual handlers in trapentry.S.

The SETGATE macro fills in the IDT, mapping each IDT entry to the address of its handler.

1. This is because different interrupts have different vector numbers.

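2. This happens because the page fault gate currently has privilege level 0, so only the kernel side can invoke it. Invoking it directly from the user program softint therefore raises a General Protection fault, as shown below: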

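If we set the page fault gate's privilege level to 3 instead, we get the result below. Note, however, that if user code were allowed to invoke int 14 (page fault) directly, it could make the kernel run its page fault handling, and possibly allocate memory, without any kernel-side permission check, which is a serious hole: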

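The description of the interrupt vectors tells us that Page fault pushes an error code! But as noted earlier, invoking an interrupt with the int instruction does not push one. The handler we defined for Page fault in kern/trapentry.S assumes the processor has already pushed the error code, so it does not pad the slot. So what happens when we reach the handler through int? The stack has no error code on it!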

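Look at the output printed above: from err onward everything is shifted by one slot. err holds what was originally eip, 0x00800038 (look familiar? compare it with the earlier figure), and every field after it is shifted in turn, because no error code was pushed. You may wonder why the earlier Page fault and Divide Error runs were correct. Divide Error was correct because that vector has no error code to begin with and the handler adds one by hand. Page fault was correct because executing int $14 raised a privilege violation, so the processor generated a General Protection fault and pushed an error code itself. In short, for two different reasons the error code slot on the stack was filled in both cases.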

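Remember that the layout pushed on the kernel stack has to match Trapframe? If one member is missing and we pass this Trapframe to trap(), then reading the last DWORD of the Trapframe (that is, the ss register) reaches past KSTACKTOP! inc/memlayout.h shows that the region above KSTACKTOP is KERNBASE, which maps physical address 0x00000000, so the ss value printed is really the contents of physical address 0x00000000.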
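The shift is easy to reproduce with a toy stack. The sketch below is a simulation, not kernel code: struct kstack, push() and take_int14() are hypothetical, the eip value 0x00800038 and the selector 0x23 come from the text, and the esp and eflags values are made up for illustration. The handler stub always pushes only trapno, as the page fault stub does.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Tail of a trap frame, lowest address first. */
struct frame_tail {
    uint32_t trapno, err, eip, cs, eflags, esp, ss;
};

#define STACK_WORDS 16

struct kstack {
    uint32_t words[STACK_WORDS + 1]; /* last word models memory above the stack top */
    int sp;                          /* index of the current top of stack */
};

void push(struct kstack *s, uint32_t v)
{
    assert(s->sp > 0);
    s->words[--s->sp] = v;  /* the stack grows toward lower indices */
}

/* Build the frame the kernel reads after a trap on vector 14. */
struct frame_tail take_int14(int cpu_pushes_err, uint32_t above_top)
{
    struct kstack s;
    struct frame_tail f;

    memset(&s, 0, sizeof s);
    s.words[STACK_WORDS] = above_top;
    s.sp = STACK_WORDS;

    push(&s, 0x23);         /* ss */
    push(&s, 0xEEBFDFD0u);  /* esp (illustrative) */
    push(&s, 0x00000092u);  /* eflags (illustrative) */
    push(&s, 0x1B);         /* cs */
    push(&s, 0x00800038u);  /* eip */
    if (cpu_pushes_err)
        push(&s, 0);        /* error code from the processor */
    push(&s, 14);           /* trapno, pushed by the stub */

    memcpy(&f, &s.words[s.sp], sizeof f);
    return f;
}
```

Calling take_int14(0, x) models the int $14 case: err comes back as 0x00800038 and ss comes back as x, the word that lies above the top of the stack.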

Part B: Page Faults, Breakpoints Exceptions, and System Calls

Handling Page Faults

When the processor takes a page fault, it stores the linear (i.e., virtual) address that caused the fault in a special processor control register, CR2. In trap.c we have provided the beginnings of a special function, page_fault_handler(), to handle page fault exceptions.

The Breakpoint Exception

The breakpoint exception, interrupt vector 3 (T_BRKPT), is normally used to allow debuggers to insert breakpoints in a program's code by temporarily replacing the relevant program instruction with the special 1-byte int3 software interrupt instruction. The user-mode implementation of panic() in lib/panic.c, for example, performs an int3 after displaying its panic message.

Both are fairly simple. The code is as follows:

3. With the dpl of the breakpoint handler's gate set to 3 we get a breakpoint exception; if we set it to 0, we get a general protection fault instead.

4. I did not understand this one ｏ（╯□╰）ｏ

System calls

User processes ask the kernel to do things for them by invoking system calls. When the user process invokes a system call, the processor enters kernel mode, the processor and the kernel cooperate to save the user process's state, the kernel executes appropriate code in order to carry out the system call, and then resumes the user process.

In the JOS kernel, we will use the int instruction, which causes a processor interrupt. In particular, we will use int $0x30 as the system call interrupt.

The application will pass the system call number and the system call arguments in registers. This way, the kernel won't need to grub around in the user environment's stack or instruction stream. The system call number will go in %eax, and the arguments (up to five of them) will go in %edx, %ecx, %ebx, %edi, and %esi, respectively. The kernel passes the return value back in %eax.
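The register convention described above can be modelled without any kernel at all. In this sketch struct regs, do_syscall() and handle_syscall_trap() are hypothetical names, the two system call numbers are invented for the example, and E_INVAL is given an illustrative value; only the register-to-argument mapping is taken from the lab text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The saved user registers that matter for a system call. */
struct regs {
    uint32_t eax, edx, ecx, ebx, edi, esi;
};

enum { SYS_add = 0, SYS_answer = 1 };  /* invented numbers for this sketch */
#define E_INVAL 3                      /* illustrative error value */

int32_t do_syscall(uint32_t no, uint32_t a1, uint32_t a2,
                   uint32_t a3, uint32_t a4, uint32_t a5)
{
    int32_t ret = -E_INVAL;  /* initialised, so unknown numbers report an error */

    (void)a3; (void)a4; (void)a5;
    switch (no) {
    case SYS_add:
        ret = (int32_t)(a1 + a2);
        break;
    case SYS_answer:
        ret = 42;
        break;
    default:
        break;
    }
    return ret;
}

/* Number in eax; arguments in edx, ecx, ebx, edi, esi; result back in eax. */
void handle_syscall_trap(struct regs *r)
{
    assert(r != NULL);
    r->eax = (uint32_t)do_syscall(r->eax, r->edx, r->ecx, r->ebx, r->edi, r->esi);
}
```

Writing the result into the saved eax is what makes it visible to the user program once its registers are restored.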

This exercise gets us started on handling system calls. Following the hints, first modify trap_init() in kern/trap.c and kern/trapentry.S to add the corresponding handler and vector, then modify kern/trap.c:

Finally, modify the code in kern/syscall.c (not to be confused with lib/syscall.c!)

Remember to initialize ret here: it is the value that gets returned, so it matters for everything that follows.

make run-hello and make grade both report correct results.

User-mode startup

A user program starts running at the top of lib/entry.S. After some setup, this code calls libmain(), in lib/libmain.c. You should modify libmain() to initialize the global pointer thisenv to point at this environment's struct Env in the envs[] array. (Note that lib/entry.S has already defined envs to point at the UENVS mapping you set up in Part A.) Hint: look in inc/env.h and use sys_getenvid.

The code is a single line:

lib/libmain.c

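Do as the exercise says. The one thing to watch is the ENVX macro, because what sys_getenvid() returns is not the plain index into envs[]; it carries some extra bits, as shown below: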

inc/env.h
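As a stand-alone illustration of how the index is recovered, here is a sketch modelled on the definitions I believe inc/env.h uses (LOG2NENV, NENV, ENVX); it is not a verbatim copy of that file, and struct env and find_self() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define LOG2NENV 10
#define NENV (1 << LOG2NENV)
#define ENVX(envid) ((envid) & (NENV - 1))  /* low 10 bits: index into envs[] */

struct env {
    int32_t env_id;
};

/* Map an environment id back to its slot in the envs array. */
struct env *find_self(struct env *envs, int32_t envid)
{
    assert(envid > 0);
    return &envs[ENVX(envid)];
}
```

For example, the id 0x1000 maps to slot 0 and 0x2005 maps to slot 5, because the bits above the low ten are not part of the index.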

After finishing, run make run-hello:

Onward, victory is in sight!

Page faults and memory protection

Memory protection is a crucial feature of an operating system, ensuring that bugs in one program cannot corrupt other programs or corrupt the operating system itself.

System calls present an interesting problem for memory protection. Most system call interfaces let user programs pass pointers to the kernel. These pointers point at user buffers to be read or written. The kernel then dereferences these pointers while carrying out the system call. There are two problems with this:

1. A page fault in the kernel is potentially a lot more serious than a page fault in a user program. If the kernel page-faults while manipulating its own data structures, that's a kernel bug, and the fault handler should panic the kernel (and hence the whole system). But when the kernel is dereferencing pointers given to it by the user program, it needs a way to remember that any page faults these dereferences cause are actually on behalf of the user program.
2. The kernel typically has more memory permissions than the user program. The user program might pass a pointer to a system call that points to memory that the kernel can read or write but that the program cannot. The kernel must be careful not to be tricked into dereferencing such a pointer, since that might reveal private information or destroy the integrity of the kernel.

For both of these reasons the kernel must be extremely careful when handling pointers presented by user programs.

You will now solve these two problems with a single mechanism that scrutinizes all pointers passed from userspace into the kernel. When a program passes the kernel a pointer, the kernel will check that the address is in the user part of the address space, and that the page table would allow the memory operation.

Thus, the kernel will never suffer a page fault due to dereferencing a user-supplied pointer. If the kernel does page fault, it should panic and terminate.
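The check described above has two parts: the range must lie below the user/kernel boundary, and every page it touches must carry the required permissions. Here is a toy sketch of that idea over an eight-page address space; user_range_ok(), TOY_ULIM and the flat page_perms array are assumptions made for the example and do not reflect how JOS walks its page tables.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PGSIZE   4096u
#define PTE_P    0x1u               /* present */
#define PTE_W    0x2u               /* writable */
#define PTE_U    0x4u               /* user-accessible */
#define TOY_ULIM (6u * PGSIZE)      /* hypothetical user/kernel boundary */

/* page_perms[i] holds the permission bits of page i in the toy space. */
bool user_range_ok(const uint32_t *page_perms, uint32_t va, uint32_t len,
                   uint32_t perm)
{
    uint32_t need = perm | PTE_P;
    uint64_t end = (uint64_t)va + len;  /* 64-bit so va + len cannot wrap */

    assert(page_perms != NULL);
    if (end > TOY_ULIM)
        return false;                   /* reaches into kernel territory */
    for (uint64_t a = va - va % PGSIZE; a < end; a += PGSIZE)
        if ((page_perms[a / PGSIZE] & need) != need)
            return false;               /* some page lacks a required bit */
    return true;
}
```

A system call that reads a user buffer would pass PTE_U as perm, and one that writes to it would pass PTE_U | PTE_W.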