Demystifying Process Switching


Student ID: SA12**6112

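The previous post analyzed the main work the kernel does when a process switches from user mode to kernel mode. This post examines what the kernel does during a process switch.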

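In kernel mode, a process switch consists of two main steps:

1: Switch the page global directory.

2: Switch the kernel stack and the hardware context.

Here prev points to the descriptor of the process being switched out, and next points to the descriptor of the process being activated.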
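The order of the two steps can be pictured with a toy model. This is a user-space sketch, not kernel code: struct toy_cpu, struct toy_task, toy_context_switch and the resume_ip parameter are hypothetical names invented for illustration, and the "registers" are plain struct fields.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the real kernel structures. */
struct toy_thread { uintptr_t sp; uintptr_t ip; };
struct toy_task   { uintptr_t pgd; struct toy_thread thread; };
struct toy_cpu    { uintptr_t cr3; uintptr_t esp; uintptr_t eip; };

/* Models the two steps in order; returns the task that was switched out. */
struct toy_task *toy_context_switch(struct toy_cpu *cpu,
                                    struct toy_task *prev,
                                    struct toy_task *next,
                                    uintptr_t resume_ip)
{
    /* Step 1: switch the page global directory (the address space). */
    cpu->cr3 = next->pgd;

    /* Step 2a: save where prev will resume the next time it runs. */
    prev->thread.sp = cpu->esp;
    prev->thread.ip = resume_ip;

    /* Step 2b: load the values next saved when it was switched out. */
    cpu->esp = next->thread.sp;
    cpu->eip = next->thread.ip;

    return prev;
}
```

The return value mirrors the role of the last argument of switch_to: after the switch, the code that is now running can still find out which task was running before it.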

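The rest of this post analyzes the second step.

The second step is implemented mainly by the switch_to macro:

In the 3.3 kernel on x86, it starts at line 48 of /arch/x86/include/asm/system.h: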

#define switch_to(prev, next, last)                                     \
do {                                                                    \
        /*                                                              \
         * Context-switching clobbers all registers, so we clobber      \
         * them explicitly, via unused output variables.                \
         * (EAX and EBP is not listed because EBP is saved/restored     \
         * explicitly for wchan access and EAX is the return value of   \
         * __switch_to())                                               \
         */                                                             \
        unsigned long ebx, ecx, edx, esi, edi;                          \
                                                                        \
        asm volatile("pushfl\n\t"               /* save    flags */     \
                     "pushl %%ebp\n\t"          /* save    EBP   */     \
                     "movl %%esp,%[prev_sp]\n\t"        /* save    ESP   */ \
                     "movl %[next_sp],%%esp\n\t"        /* restore ESP   */ \
                     "movl $1f,%[prev_ip]\n\t"  /* save    EIP   */     \
                     "pushl %[next_ip]\n\t"     /* restore EIP   */     \
                     __switch_canary                                    \
                     "jmp __switch_to\n"        /* regparm call  */     \
                     "1:\t"                                             \
                     "popl %%ebp\n\t"           /* restore EBP   */     \
                     "popfl\n"                  /* restore flags */     \
                                                                        \
                     /* output parameters */                            \
                     : [prev_sp] "=m" (prev->thread.sp),                \
                       [prev_ip] "=m" (prev->thread.ip),                \
                       "=a" (last),                                     \
                                                                        \
                       /* clobbered output registers: */                \
                       "=b" (ebx), "=c" (ecx), "=d" (edx),              \
                       "=S" (esi), "=D" (edi)                           \
                                                                        \
                       __switch_canary_oparam                           \
                                                                        \
                       /* input parameters: */                          \
                     : [next_sp]  "m" (next->thread.sp),                \
                       [next_ip]  "m" (next->thread.ip),                \
                                                                        \
                       /* regparm parameters for __switch_to(): */      \
                       [prev]     "a" (prev),                           \
                       [next]     "d" (next)                            \
                                                                        \
                       __switch_canary_iparam                           \
                                                                        \
                     : /* reloaded segment registers */                 \
                        "memory");                                      \
} while (0)

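I: From the code above, switching the kernel stack involves the following main tasks:

1: Push the eflags and ebp registers onto prev's kernel stack.

2: Save esp in prev->thread.sp, and save the resume address (label 1) in prev->thread.ip.

3: Load next->thread.sp into esp, then push next->thread.ip onto the new stack; when __switch_to returns, that value is popped into eip.

At this point the kernel stack has been switched.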
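The same save-and-restore pattern can be tried in user space. The sketch below is an analogy, not the kernel mechanism: it assumes Linux with glibc and uses the ucontext API (getcontext, makecontext, swapcontext) to switch between two stacks. The names task_body, run_switch_demo and trace are made up for this example.

```c
#include <ucontext.h>

#define TOY_STACK_SIZE (64 * 1024)

static ucontext_t main_ctx, task_ctx;   /* saved contexts, like prev and next */
static char task_stack[TOY_STACK_SIZE]; /* private stack for the second task  */
int trace[4];                           /* records the order of execution     */
int trace_len;

static void task_body(void)
{
    trace[trace_len++] = 2;             /* now running on task_stack */
    swapcontext(&task_ctx, &main_ctx);  /* save our SP/IP, resume main */
    trace[trace_len++] = 4;             /* resumed right after the swap */
}

/* Runs the demo and returns the number of recorded steps. */
int run_switch_demo(void)
{
    trace_len = 0;

    /* Build a context that starts task_body on its own stack. */
    getcontext(&task_ctx);
    task_ctx.uc_stack.ss_sp = task_stack;
    task_ctx.uc_stack.ss_size = sizeof task_stack;
    task_ctx.uc_link = &main_ctx;       /* resumed when task_body returns */
    makecontext(&task_ctx, task_body, 0);

    trace[trace_len++] = 1;
    swapcontext(&main_ctx, &task_ctx);  /* switch to the task's stack */
    trace[trace_len++] = 3;
    swapcontext(&main_ctx, &task_ctx);  /* task continues mid-function */
    return trace_len;
}
```

Calling run_switch_demo() from main fills trace with 1, 2, 3, 4: each swapcontext stores the current stack pointer and resume address in its first argument and loads them from the second, which is roughly what steps 2 and 3 do with thread.sp and thread.ip.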

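II: After the kernel stack is switched, the TSS must be updated accordingly:

This is because on Linux all processes on the same CPU share a single TSS, so its contents must change when the running process changes.

Linux uses the TSS in two main ways:

(1) Whenever a process traps from user mode into kernel mode, the kernel stack pointer is fetched from the TSS.

(2) I/O port access from user mode is checked against the I/O permission bitmap in the TSS.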
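To make point (2) concrete, here is a small model of how such a bitmap is consulted. It follows the x86 rule that each port has one bit, that a clear bit grants access, and that a multi-byte access needs every covered bit clear. The struct and function names are hypothetical and this is plain user-space C, not the kernel's own data structure.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_IO_PORTS 65536

/* Hypothetical model of the bitmap kept in the TSS: one bit per port. */
struct toy_io_bitmap { uint8_t bits[TOY_IO_PORTS / 8]; };

/* Deny every port: a set bit means "access not allowed". */
void toy_io_deny_all(struct toy_io_bitmap *bm)
{
    memset(bm->bits, 0xff, sizeof bm->bits);
}

/* Grant access to one port by clearing its bit. */
void toy_io_allow(struct toy_io_bitmap *bm, uint16_t port)
{
    bm->bits[port / 8] &= (uint8_t)~(1u << (port % 8));
}

/* An access of `width` bytes is allowed only if every covered bit is clear. */
bool toy_io_permitted(const struct toy_io_bitmap *bm, uint16_t port,
                      unsigned width)
{
    for (unsigned i = 0; i < width; i++) {
        uint32_t p = (uint32_t)port + i;
        if (p >= TOY_IO_PORTS || (bm->bits[p / 8] & (1u << (p % 8))))
            return false;
    }
    return true;
}
```

Because the bitmap describes what one particular process may touch, the copy seen by the CPU has to follow the running process, which is why it is part of the state handled at switch time.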

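A process switch must therefore also update esp0 and the I/O permission bitmap in the TSS, which is done mainly in the __switch_to function:

In the 3.3 kernel on x86, it starts at line 296 of /arch/x86/kernel/process_32.c: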

__notrace_funcgraph struct task_struct *
__switch_to(struct task_struct *prev_p, struct task_struct *next_p)
{
        struct thread_struct *prev = &prev_p->thread,
                                 *next = &next_p->thread;
        int cpu = smp_processor_id();
        struct tss_struct *tss = &per_cpu(init_tss, cpu);
        fpu_switch_t fpu;

        /* never put a printk in __switch_to... printk() calls wake_up*() indirectly */

        fpu = switch_fpu_prepare(prev_p, next_p, cpu);

        /*
         * Reload esp0.
         */
        load_sp0(tss, next);

        /*
         * Save away %gs. No need to save %fs, as it was saved on the
         * stack on entry.  No need to save %es and %ds, as those are
         * always kernel segments while inside the kernel.  Doing this
         * before setting the new TLS descriptors avoids the situation
         * where we temporarily have non-reloadable segments in %fs
         * and %gs.  This could be an issue if the NMI handler ever
         * used %fs or %gs (it does not today), or if the kernel is
         * running inside of a hypervisor layer.
         */
        lazy_save_gs(prev->gs);

        /*
         * Load the per-thread Thread-Local Storage descriptor.
         */
        load_TLS(next, cpu);

        /*
         * Restore IOPL if needed.  In normal use, the flags restore
         * in the switch assembly will handle this.  But if the kernel
         * is running virtualized at a non-zero CPL, the popf will
         * not restore flags, so it must be done in a separate step.
         */
        if (get_kernel_rpl() && unlikely(prev->iopl != next->iopl))
                set_iopl_mask(next->iopl);

        /*
         * Now maybe handle debug registers and/or IO bitmaps
         */
        if (unlikely(task_thread_info(prev_p)->flags & _TIF_WORK_CTXSW_PREV ||
                     task_thread_info(next_p)->flags & _TIF_WORK_CTXSW_NEXT))
                __switch_to_xtra(prev_p, next_p, tss);

        /*
         * Leave lazy mode, flushing any hypercalls made here.
         * This must be done before restoring TLS segments so
         * the GDT and LDT are properly updated, and must be
         * done before math_state_restore, so the TS bit is up
         * to date.
         */
        arch_end_context_switch(next_p);

        /*
         * Restore %gs if needed (which is common)
         */
        if (prev->gs | next->gs)
                lazy_load_gs(next->gs);

        switch_fpu_finish(next_p, fpu);

        percpu_write(current_task, next_p);

        return prev_p;
}

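As the code above shows, the TSS is updated mainly by:

1: load_sp0(tss, next); reads sp0 from the next process's thread field and uses it to update sp0 in the TSS.

2: __switch_to_xtra(prev_p, next_p, tss); updates the I/O permission bitmap when necessary.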
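The shape of these two updates, one unconditional and one conditional, can be sketched as follows. This is a heavily simplified model under stated assumptions: struct toy_tss, struct toy_thread_state, the uses_io_bitmap flag and the io_bitmap_owner field are hypothetical, and the real __switch_to_xtra handles more than the bitmap.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for tss_struct and thread_struct. */
struct toy_tss          { uintptr_t sp0; int io_bitmap_owner; };
struct toy_thread_state { uintptr_t sp0; bool uses_io_bitmap; int id; };

/* Models only the TSS-related part of a switch from prev to next. */
void toy_update_tss(struct toy_tss *tss,
                    const struct toy_thread_state *prev,
                    const struct toy_thread_state *next)
{
    /* Always done: the next trap from user mode must use next's stack. */
    tss->sp0 = next->sp0;

    /* Done only when either side uses an I/O bitmap (the rare case). */
    if (prev->uses_io_bitmap || next->uses_io_bitmap)
        tss->io_bitmap_owner = next->uses_io_bitmap ? next->id : -1;
}
```

Keeping the bitmap work behind a flag check matches the unlikely() hint in the listing: most processes never ask for port access, so the common path only pays for the sp0 store.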