Linux: Process Switching

Source: Internet. Editor: 程序博客网. Time: 2024/06/01 14:51

Process Switching

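I. Process Scheduling

1. When scheduling happens

1) Voluntary scheduling can happen at any time

## Kernel space: a process can start a scheduling pass by calling schedule(). Before calling it, the process can set its own state to TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE, so that it gives up the CPU for a while and goes to sleep. To put a time limit on this voluntary sleep, the kernel provides the function schedule_timeout() (a kernel function, not a system call).

## User space: the same effect is achieved with the pause() system call. The time-limited variant is the nanosleep() system call.

The difference between the two: from the application's point of view, voluntarily giving up the CPU is visible only when it is done in user space. When it happens inside the kernel it is not visible, because it is hidden inside system calls that may block.

2) Involuntary: this kind of scheduling happens just before a process returns from kernel (system) space to user space.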
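The user-space case can be sketched in C. The helper name sleep_ms() is hypothetical and introduced only for illustration. It assumes a POSIX system that provides nanosleep() and clock_gettime(), and it restarts the sleep if a signal interrupts it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical helper: voluntarily give up the CPU for `ms` milliseconds.
 * Returns the elapsed time in milliseconds, measured on the monotonic clock. */
long sleep_ms(long ms)
{
    struct timespec start, end;
    struct timespec req = {
        .tv_sec  = ms / 1000,
        .tv_nsec = (ms % 1000) * 1000000L
    };

    clock_gettime(CLOCK_MONOTONIC, &start);

    /* If a signal interrupts the sleep, nanosleep() returns -1 with EINTR
     * and stores the remaining time in its second argument, so we retry. */
    while (nanosleep(&req, &req) == -1 && errno == EINTR)
        ;

    clock_gettime(CLOCK_MONOTONIC, &end);

    long long ns = (long long)(end.tv_sec - start.tv_sec) * 1000000000LL
                 + (end.tv_nsec - start.tv_nsec);
    return (long)(ns / 1000000LL);
}
```

While the process sleeps inside nanosleep(), the kernel is free to run other processes; the caller only observes that at least the requested time has passed.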

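II. Process Switching

1. Hardware support: in the i386 architecture Intel added the Task State Segment (TSS) to hold all the state of a task that is being switched. The intended design was one TSS per process, with the TR register selecting the TSS of the current task. The Linux kernel does not work this way: each CPU has a single TSS, and once it has been loaded it is never replaced. The reason is that changing SS0 and ESP0 inside the TSS costs far less than loading TR to switch to a different TSS. Because there is no per-process TSS, the hardware context of the process being switched out is saved in a field of type thread_struct in its process descriptor.

The TSS structure is defined in processor.h (include/asm-i386) as follows: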
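A minimal user-space model of this design follows. The types thread_model and tss_model and the function switch_kernel_stack() are simplified stand-ins invented for illustration, not the kernel's definitions, and the selector value is a placeholder. The sketch shows only the idea that one TSS stays loaded and a switch rewrites a single field in it.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the per-process saved hardware context. */
struct thread_model {
    uint32_t esp0;  /* top of this process's kernel stack */
    uint32_t esp;   /* saved kernel stack pointer */
    uint32_t eip;   /* saved resume address */
};

/* Simplified stand-in for the part of the TSS that Linux keeps updating. */
struct tss_model {
    uint32_t esp0;  /* stack pointer the CPU loads on entry to ring 0 */
    uint16_t ss0;   /* stack segment the CPU loads on entry to ring 0 */
};

/* One TSS per CPU, set up once. 0x68 is only a placeholder selector. */
static struct tss_model cpu_tss = { .esp0 = 0, .ss0 = 0x68 };

/* On a switch, only esp0 changes: one memory store, no TR reload. */
void switch_kernel_stack(const struct thread_model *next)
{
    cpu_tss.esp0 = next->esp0;
}

uint32_t current_esp0(void)
{
    return cpu_tss.esp0;
}
```

In this model ss0 never changes after initialization, which mirrors the statement that the TSS, once loaded, stays in place.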

struct tss_struct {
    unsigned short back_link,__blh;
    unsigned long esp0;
    unsigned short ss0,__ss0h;
    unsigned long esp1;
    unsigned short ss1,__ss1h; /* ss1 is used to cache MSR_IA32_SYSENTER_CS */
    unsigned long esp2;
    unsigned short ss2,__ss2h;
    unsigned long __cr3;
    unsigned long eip;
    unsigned long eflags;
    unsigned long eax,ecx,edx,ebx;
    unsigned long esp;
    unsigned long ebp;
    unsigned long esi;
    unsigned long edi;
    unsigned short es, __esh;
    unsigned short cs, __csh;
    unsigned short ss, __ssh;
    unsigned short ds, __dsh;
    unsigned short fs, __fsh;
    unsigned short gs, __gsh;
    unsigned short ldt, __ldth;
    unsigned short trace, io_bitmap_base;
    /*
     * The extra 1 is there because the CPU will access an
     * additional byte beyond the end of the IO permission
     * bitmap. The extra byte must be all 1 bits, and must
     * be within the limit.
     */
    unsigned long io_bitmap[IO_BITMAP_LONGS + 1];
    /*
     * Cache the current maximum and the last task that used the bitmap:
     */
    unsigned long io_bitmap_max;
    struct thread_struct *io_bitmap_owner;
    /*
     * pads the TSS to be cacheline-aligned (size is 0x100)
     */
    unsigned long __cacheline_filler[35];
    /*
     * .. and then another 0x100 bytes for emergency kernel stack
     */
    unsigned long stack[64];
} __attribute__((packed));
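The hardware-defined part of this layout (everything up to io_bitmap_base) can be checked in user space. The sketch below is a replica, not the kernel header. It uses fixed-width types because unsigned long is 8 bytes on a 64-bit host, and it renames the padding fields to avoid reserved identifiers. The name tss_hw is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Replica of the hardware-defined 32-bit TSS fields, for layout checks only. */
struct tss_hw {
    uint16_t back_link, pad_blh;
    uint32_t esp0;
    uint16_t ss0, pad_ss0h;
    uint32_t esp1;
    uint16_t ss1, pad_ss1h;
    uint32_t esp2;
    uint16_t ss2, pad_ss2h;
    uint32_t cr3;
    uint32_t eip;
    uint32_t eflags;
    uint32_t eax, ecx, edx, ebx;
    uint32_t esp;
    uint32_t ebp;
    uint32_t esi;
    uint32_t edi;
    uint16_t es, pad_esh;
    uint16_t cs, pad_csh;
    uint16_t ss, pad_ssh;
    uint16_t ds, pad_dsh;
    uint16_t fs, pad_fsh;
    uint16_t gs, pad_gsh;
    uint16_t ldt, pad_ldth;
    uint16_t trace, io_bitmap_base;
} __attribute__((packed));

/* Offsets that the CPU relies on when it enters ring 0 or checks I/O rights. */
_Static_assert(offsetof(struct tss_hw, esp0) == 4, "esp0 must be at offset 4");
_Static_assert(offsetof(struct tss_hw, ss0) == 8, "ss0 must be at offset 8");
_Static_assert(offsetof(struct tss_hw, io_bitmap_base) == 102,
               "I/O map base must be at offset 102");
_Static_assert(sizeof(struct tss_hw) == 104, "hardware TSS is 104 bytes");
```

The fields after io_bitmap_base in tss_struct (the bitmap itself, the cache fields, the filler and the emergency stack) are software additions placed behind this 104-byte hardware area.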

processor.h (include/asm-i386) also defines the initial contents of the TSS:

#define INIT_TSS { \
    .esp0 = sizeof(init_stack) + (long)&init_stack, \
    .ss0 = __KERNEL_DS, \
    .ss1 = __KERNEL_CS, \
    .ldt = GDT_ENTRY_LDT, \
    .io_bitmap_base = INVALID_IO_BITMAP_OFFSET, \
    .io_bitmap = { [ 0 ... IO_BITMAP_LONGS] = ~0 }, \
}
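The io_bitmap initializer fills every bit with 1. In the I/O permission bitmap a set bit means the port is denied to user mode, so the initial state allows nothing. The sketch below models that rule in user space. The names MODEL_IO_BITMAP_BITS, io_port_allowed() and io_port_allow() are hypothetical, and the range designator is the same GNU C extension that INIT_TSS uses, so GCC or Clang is assumed.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MODEL_IO_BITMAP_BITS  65536
#define MODEL_BITS_PER_LONG   (sizeof(unsigned long) * CHAR_BIT)
#define MODEL_IO_BITMAP_LONGS (MODEL_IO_BITMAP_BITS / MODEL_BITS_PER_LONG)

/* Same style as INIT_TSS: every bit, including the extra word, starts as 1. */
static unsigned long io_bitmap[MODEL_IO_BITMAP_LONGS + 1] = {
    [0 ... MODEL_IO_BITMAP_LONGS] = ~0UL
};

/* In this model a port is accessible only if its bit is 0. */
bool io_port_allowed(unsigned int port)
{
    unsigned long word = io_bitmap[port / MODEL_BITS_PER_LONG];
    return ((word >> (port % MODEL_BITS_PER_LONG)) & 1UL) == 0;
}

/* Grant access to one port by clearing its bit. */
void io_port_allow(unsigned int port)
{
    io_bitmap[port / MODEL_BITS_PER_LONG] &=
        ~(1UL << (port % MODEL_BITS_PER_LONG));
}
```

Port numbers are assumed to be below 65536; the model does no range checking.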

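2. Performing the process switch

A process switch is carried out by the schedule() function and consists of two main steps:

1) Switch the Page Global Directory to install a new address space. In practice this means loading the new process's value into the cr3 register.

2) Switch the kernel stack and the hardware context. Together these hold all the information the kernel needs to run the new process, including the CPU registers.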
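The two steps can be sketched as a small simulation. Nothing here touches real registers: cpu_model and task_model are hypothetical structures that stand for the CPU state and for the per-process saved state, and context_switch_model() only copies values in the order described above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-process state relevant to a switch. */
struct task_model {
    uint32_t pgd;  /* address of the process's page global directory */
    uint32_t esp;  /* saved kernel stack pointer (thread.esp in the kernel) */
};

/* Hypothetical CPU state relevant to a switch. */
struct cpu_model {
    uint32_t cr3;  /* currently installed page global directory */
    uint32_t esp;  /* current kernel stack pointer */
};

void context_switch_model(struct cpu_model *cpu,
                          struct task_model *prev,
                          const struct task_model *next)
{
    /* Step 1: install the new address space. */
    cpu->cr3 = next->pgd;

    /* Step 2: save prev's stack pointer, then continue on next's stack. */
    prev->esp = cpu->esp;
    cpu->esp = next->esp;
}
```

After the call, the model CPU translates addresses through next's page tables and runs on next's kernel stack, while prev keeps what it needs to be resumed later.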

3. The Linux process switch sequence

The most general case: a running user-mode process X is switched out and user-mode process Y runs instead.

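- User-mode process X is running

- An interrupt occurs: the CPU switches ss:esp to the kernel stack, saves the interrupted ss:esp, eflags and cs:eip there, then loads cs:eip with the entry point of the specific ISR

- SAVE_ALL // save the register context

- schedule() is called during interrupt handling or just before the interrupt returns; inside it, switch_to performs the key process context switch

- Execution continues after label 1 in the context of process Y (Y was itself switched out earlier by the same steps, so it can resume from label 1)

- restore_all // restore the register context

- iret: pops cs:eip, eflags and ss:esp from the kernel stack

- User-mode process Y continues running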
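The idea that a switched-out task later resumes exactly where it left off can be tried in user space with the ucontext functions, which save and restore a stack pointer and resume address in a similar spirit. This is only an analogy, not kernel code. It assumes a Linux/glibc system where getcontext(), makecontext() and swapcontext() are available, and the names task_y() and run_trace() are hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <ucontext.h>

static ucontext_t ctx_x, ctx_y;
static char stack_y[64 * 1024];  /* private stack for task Y */
static char trace[32];           /* records the order of execution */

static void log_step(const char *s)
{
    strcat(trace, s);
}

static void task_y(void)
{
    log_step("y1 ");
    swapcontext(&ctx_y, &ctx_x);  /* Y is switched out here ... */
    log_step("y2 ");              /* ... and resumes here, like label 1 */
}

/* Task X is the caller's own context. Returns the recorded trace. */
const char *run_trace(void)
{
    trace[0] = '\0';

    getcontext(&ctx_y);
    ctx_y.uc_stack.ss_sp = stack_y;
    ctx_y.uc_stack.ss_size = sizeof stack_y;
    ctx_y.uc_link = &ctx_x;       /* when task_y returns, continue in X */
    makecontext(&ctx_y, task_y, 0);

    log_step("x1 ");
    swapcontext(&ctx_x, &ctx_y);  /* X -> Y: Y starts from its beginning */
    log_step("x2 ");
    swapcontext(&ctx_x, &ctx_y);  /* X -> Y: Y resumes after its swapcontext */
    log_step("x3");
    return trace;
}
```

Each swapcontext() call plays the role of switch_to: the caller's state is stored, the other task's state is loaded, and the caller continues from the same point when it is switched back in.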

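4. Source code analysis

Inside schedule(), the function that actually performs the process switch is context_switch(). It relies on an important macro, switch_to(), so the switch ultimately hinges on that macro. Here is what switch_to() really looks like: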

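#define switch_to(prev,next,last) do { \
    unsigned long esi,edi; \
    /* Before the assembly runs, prev is already in eax and next in edx. */ \
    asm volatile( \
        /* Save eflags and ebp on the kernel stack. They must be saved \
         * because the compiler assumes their values are unchanged when \
         * switch_to finishes. */ \
        "pushfl\n\t" \
        "pushl %%ebp\n\t" \
        /* Save esp in prev->thread.esp, so that this field points to \
         * the top of prev's kernel stack. */ \
        "movl %%esp,%0\n\t" /* save ESP */ \
        /* Load next->thread.esp into esp. From here on the kernel works \
         * on next's stack; this instruction is the actual switch from \
         * prev to next. The process descriptor is located through the \
         * kernel stack address, so changing the kernel stack changes \
         * the current process. */ \
        "movl %5,%%esp\n\t" /* restore ESP */ \
        /* Store the address of label 1 in prev->thread.eip. When prev \
         * is resumed later, it continues at the instruction labelled 1. */ \
        "movl $1f,%1\n\t" /* save EIP */ \
        /* Push next->thread.eip onto next's kernel stack, so that the \
         * ret at the end of __switch_to jumps to it. In most cases \
         * this address is label 1. */ \
        "pushl %6\n\t" /* restore EIP */ \
        /* This is a jmp, not a call: no return address is pushed, so \
         * the eip pushed above is where __switch_to returns to. */ \
        "jmp __switch_to\n" \
        /* Here the process that was switched out earlier has the CPU \
         * again. It pops ebp and eflags from its own stack. */ \
        "1:\t" \
        "popl %%ebp\n\t" \
        "popfl" \
        :"=m" (prev->thread.esp),"=m" (prev->thread.eip), \
        /* last is an output operand; it receives the value of eax. */ \
        "=a" (last),"=S" (esi),"=D" (edi) \
        :"m" (next->thread.esp),"m" (next->thread.eip), \
        "2" (prev), "d" (next)); \
} while (0)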

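The macro takes three parameters. prev is the process about to be switched out and next is the process about to run. Why is last needed as well? Consider how a switch unfolds. Suppose the kernel decides to suspend process A and run process B. In schedule(), prev holds the descriptor address of A and next holds that of B. Once switch_to suspends A, A is frozen. Later, when the kernel wants to run A again, it must suspend some process C (usually not B) through the switch_to macro, and in that invocation prev is C and next is A. When A resumes, it gets back its original kernel stack, and on that stack prev still means A and next still means B. At this point the kernel code running on behalf of A has no reference to C and cannot find it. This is what last is for: prev is carried in eax across the switch, so when A resumes, eax holds the descriptor address of C, and the macro writes that value into last.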
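The A, B, C scenario can be reproduced in user space as an analogy. The sketch assumes Linux/glibc ucontext support. The global variable handoff stands in for eax, and switch_to_model() and demo_last() are hypothetical names. It is not how the kernel implements the macro.

```c
#include <assert.h>
#include <ucontext.h>

struct task {
    const char *name;
    ucontext_t ctx;
    char stack[64 * 1024];  /* unused for A, which runs on the caller's stack */
};

static struct task A, B, C;
static struct task *handoff;  /* plays the role of eax across a switch */

/* Switch from prev to next. Returns whichever task switched back to us. */
static struct task *switch_to_model(struct task *prev, struct task *next)
{
    handoff = prev;
    swapcontext(&prev->ctx, &next->ctx);
    /* Execution resumes here later, after some other task switched to prev. */
    return handoff;
}

static void run_b(void) { switch_to_model(&B, &C); }  /* B hands over to C */
static void run_c(void) { switch_to_model(&C, &A); }  /* C hands back to A */

static void spawn(struct task *t, void (*fn)(void))
{
    getcontext(&t->ctx);
    t->ctx.uc_stack.ss_sp = t->stack;
    t->ctx.uc_stack.ss_size = sizeof t->stack;
    t->ctx.uc_link = &A.ctx;  /* fallback if fn ever returns */
    makecontext(&t->ctx, fn, 0);
}

/* Runs A -> B -> C -> A and returns the name A sees in `last`. */
const char *demo_last(void)
{
    A.name = "A";
    B.name = "B";
    C.name = "C";
    spawn(&B, run_b);
    spawn(&C, run_c);

    struct task *prev = &A, *next = &B;
    struct task *last = switch_to_model(prev, next);

    /* A's own variables still say prev = A and next = B. */
    assert(prev == &A && next == &B);
    return last->name;
}
```

When A resumes, its local variables still describe the switch it made long ago, and only the value handed back through last tells it that C was the task that just gave up the CPU.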
