Understanding the fork() Function from User Space



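In Linux programming, a child process is created with the fork() function, which wraps a system call. It is the classic way to create a new process, though not the only one: related calls such as vfork() and clone() exist as well.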

The Linux Programmer's Manual describes fork() as follows:

NAME

fork – create a child process

SYNOPSIS

#include <sys/types.h>

#include <unistd.h>

pid_t fork(void);

DESCRIPTION

fork creates a child process that differs from the parent process only in its PID and PPID, and in the fact that resource utilizations are set to 0. File locks and pending signals are not inherited.

Under Linux, fork is implemented using copy-on-write pages, so the only penalty incurred by fork is the time and memory required to duplicate the parent’s page tables, and to create a unique task structure for the child.

RETURN VALUE

On success, the PID of the child process is returned in the parent’s thread of execution, and a 0 is returned in the child’s thread of execution. On failure, a -1 will be returned in the parent’s context, no child process will be created, and errno will be set appropriately.

ERRORS

EAGAIN fork cannot allocate sufficient memory to copy the parent’s page tables and allocate a task structure for the child.

ENOMEM fork failed to allocate the necessary kernel structures because memory is tight.
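As a rough sketch of how the two error codes listed above might be told apart after a failed call, the helper below maps an errno value to a short hint. The name fork_error_hint and the wording of the hints are assumptions made for illustration; they are not part of the manual page.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical helper: turn the errno left by a failed fork() into a hint. */
const char *fork_error_hint(int err)
{
    switch (err) {
    case EAGAIN:
        /* Often temporary, so retrying later may succeed. */
        return "not enough resources right now; retrying later may work";
    case ENOMEM:
        return "kernel could not allocate memory for the new process";
    default:
        /* Fall back to the standard message for any other error. */
        return strerror(err);
    }
}
```

A caller would typically use it right after the failure, for example: if (pid < 0) fprintf(stderr, "fork: %s\n", fork_error_hint(errno));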

The following C program was written to illustrate how fork() executes:

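#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();   /* returns once in the parent and once in the child */

    if (pid < 0) {
        perror("fork error");
    } else if (pid == 0) {
        printf("This is child process.\n");
    } else {
        printf("This is father process.\n");
    }
    return 0;
}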

Running this code prints (the order of the two lines may vary):

This is child process.

This is father process.

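By ordinary C reasoning, this result looks very strange. The source is a typical branch structure, so only one line should ever be printed. Following that line of thought leads to a dead end. In one discussion, someone pretending to understand even began to doubt C's branch semantics, supposing that a branch structure might somehow execute all of its branches (not strictly impossible, if you design your own C compiler that breaks branching :-)). To show what is really going on, here is the assembly generated for this code, with comments added: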

main:
    pushl %ebp
    movl %esp, %ebp
    subl $8, %esp
    andl $-16, %esp
    movl $0, %eax
    subl %eax, %esp
    call fork               # invoke fork()
    movl %eax, -4(%ebp)     # store the return value in pid
    cmpl $0, -4(%ebp)       # compare the return value with 0
    jns .L2                 # if pid >= 0, jump to .L2; otherwise fall through
    subl $12, %esp
    pushl $.LC0
    call perror
    addl $16, %esp
    jmp .L3                 # pid < 0 branch done; jump to .L3
.L2:
    cmpl $0, -4(%ebp)
    jne .L4                 # if pid != 0, jump to .L4; otherwise fall through
    subl $12, %esp
    pushl $.LC1
    call printf             # print "This is child process."
    addl $16, %esp
    jmp .L3                 # pid == 0 branch done; jump to .L3
.L4:
    subl $12, %esp
    pushl $.LC2
    call printf             # print "This is father process."
    addl $16, %esp          # pid > 0 branch done; fall through to .L3
.L3:
    movl $0, %eax           # return value 0
    leave
    ret                     # return from main

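The assembly shows that the branches in this program are still mutually exclusive: after executing exactly one of them, the program ends. So what does fork() actually do to make the output look so strange?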

To understand how fork() executes, you first need a clear picture of what a "process" is in an operating system.

A process is just an executing program, including the current values of the program counter, registers, and variables.

cite: Andrew S. Tanenbaum, Modern Operating Systems (Second Edition), p. 72

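A process can thus be understood as the collection of state for one run of a program: the program itself, all of the data associated with it, and its execution context. The operating system manages processes through a process table, in which each entry corresponds to one process. On a single-CPU system, only one process occupies the CPU at any instant, while the others are waiting, blocked, and so on. So-called "multitasking" is an illusion produced by the CPU switching among processes at very short intervals.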

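When the CPU switches between processes, the operating system saves the register contents of the outgoing process into that process's entry in the process table, then reads the context of the incoming process from the table and loads it into the registers. This is the "context switch". A real context switch involves more data than this, but that data has nothing to do with fork().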
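The save-and-restore step described above can be pictured with a toy model. This is only a sketch: the structures cpu_state and proc_entry, the register count, and the function context_switch are all invented for illustration and do not correspond to any real kernel's data structures.

```c
#define NUM_REGS 4

/* Toy model only: a real kernel saves far more state than this. */
struct cpu_state {
    unsigned long pc;              /* program counter */
    unsigned long regs[NUM_REGS];  /* general-purpose registers */
};

struct proc_entry {
    int pid;
    struct cpu_state saved;  /* context stored while the process is off the CPU */
};

/* Save the outgoing process's registers, then load the incoming one's. */
void context_switch(struct cpu_state *cpu,
                    struct proc_entry *prev,
                    const struct proc_entry *next)
{
    prev->saved = *cpu;   /* outgoing context goes into its table entry */
    *cpu = next->saved;   /* incoming context is loaded onto the CPU */
}
```

Switching from process A to process B and back again leaves the CPU holding exactly the values A had when it was suspended, which is why each process can behave as if it had the CPU to itself.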

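After fork() is called, the operating system creates a new process (the "child") and a new entry for it in the process table. The new process (child) and the original process (parent) have the same program code, context, and so on, because for the child all of this is simply a copy of the parent's. The child shares the parent's code and starts with identical data, but its data space is its own independent copy (only the initial contents match the parent's), as the following example shows.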

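#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    int i = 0, j = 0;

    if (pid < 0) {
        perror("fork error");
        return 1;
    } else if (pid == 0) {
        j = 1;                      /* changes only the child's copy of j */
        printf("This is child process.\n");
        printf("i=%d,j=%d\n", i, j);
        fflush(stdout);             /* _exit() does not flush stdio buffers */
        _exit(0);
    } else {
        printf("This is father process.\n");
        printf("i=%d,j=%d\n", i, j);
        exit(0);
    }
}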

The program prints (in a typical run; the relative order of the two processes' output is not guaranteed):

This is child process.

i=0,j=1

This is father process.

i=0,j=0
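The same independence can be checked without relying on printed output. The sketch below is one possible way to do it, assuming a hypothetical helper named child_write_is_private: the child reports its value of j through its exit status, and the parent compares that with its own copy after waitpid() returns.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: the child sets its copy of j to 1 and reports it
 * through its exit status; the parent then checks its own copy.
 * Returns 1 if the child saw j == 1 while the parent still sees j == 0,
 * 0 if not, and -1 if fork() or waitpid() fails.
 */
int child_write_is_private(void)
{
    int j = 0;
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        j = 1;       /* modifies only the child's copy */
        _exit(j);    /* the exit status carries the child's value of j */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 1 && j == 0;
}
```

Because the parent blocks in waitpid() until the child has finished, the check does not depend on which process the scheduler happens to run first.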

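So once fork() succeeds in this program, there are two processes: the original parent and the newly created child. The two are independent of each other (from the CPU's point of view, neither is subordinate to the other). They start from the same point but do different work, and from there they go their separate ways. Before they run, however, neither the order in which they execute nor the order in which they finish can be known; that is decided by the operating system. If the two processes need to cooperate, some additional synchronization mechanism is required.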

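The following example uses sleep() to put the child to sleep so that the parent runs first (a crude approach that makes this order very likely rather than guaranteed):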

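#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define FATHER_FIRST

int main(void)
{
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork error");
        return 1;
    } else if (pid == 0) {
#ifdef FATHER_FIRST
        sleep(1);                   /* give the parent time to print first */
#endif
        printf("This is child process.\n");
        fflush(stdout);             /* _exit() does not flush stdio buffers */
        _exit(0);
    } else {
        printf("This is father process.\n");
        exit(0);
    }
}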

(1) With "#define FATHER_FIRST" in place, the program prints:

This is father process.

This is child process.

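(2) With "#define FATHER_FIRST" commented out, the program prints (in the run shown; without the sleep, the order is up to the scheduler):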

This is child process.

This is father process.
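Where the order really matters, a pipe can enforce it instead of a timed sleep. The following is a minimal sketch under stated assumptions: the helper name parent_runs_first and the single-letter log are invented for illustration. The child blocks in read() until the parent has done its work and written a release byte.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: forces "parent first" with a pipe instead of sleep().
 * Both processes append one letter to a shared log pipe; the child may not
 * write its letter until the parent has released it.
 * Returns 1 if the log reads "PC", 0 otherwise, -1 on error.
 */
int parent_runs_first(void)
{
    int go[2], log_fd[2];
    char token = 'x', order[3] = {0};
    pid_t pid;

    if (pipe(go) < 0 || pipe(log_fd) < 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        close(go[1]);
        /* Blocks here until the parent writes the release byte. */
        if (read(go[0], &token, 1) != 1)
            _exit(1);
        if (write(log_fd[1], "C", 1) != 1)
            _exit(1);
        _exit(0);
    }

    close(go[0]);
    if (write(log_fd[1], "P", 1) != 1)   /* the parent's work comes first */
        return -1;
    if (write(go[1], &token, 1) != 1)    /* now let the child proceed */
        return -1;
    if (waitpid(pid, NULL, 0) != pid)
        return -1;
    if (read(log_fd[0], order, 2) != 2)
        return -1;

    close(go[1]);
    close(log_fd[0]);
    close(log_fd[1]);
    return strcmp(order, "PC") == 0;
}
```

Unlike sleep(1), this does not depend on timing: however the scheduler interleaves the two processes, the child cannot get past read() before the parent has written.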

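So it is now clear that the two lines of output come from two different processes of the same program, not from a single flow of execution as it first appears.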

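Also, fork() returns different values (the variable pid in the program) in the two processes that share this code, because what it returns is the PID of the calling process's new child. In the original (parent) process, pid is the child's PID (pid > 0). The other process (the child) has not created any child of its own, so it gets 0 (pid == 0). This is what lets the code separate the work of the two processes.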
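Note that the 0 returned in the child is not the child's own PID. As a small sketch (the helper name child_knows_its_parent is an assumption for illustration), the child can obtain its real PID with getpid() and its parent's PID with getppid():

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if, inside the child, getppid() equals the
 * parent's PID and getpid() differs from it; 0 if not; -1 on error.
 */
int child_knows_its_parent(void)
{
    pid_t parent = getpid();   /* recorded before fork(), copied to the child */
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* fork() returned 0 here, yet the child has a real PID of its own. */
        int ok = (getppid() == parent) && (getpid() != parent);
        _exit(ok ? 0 : 1);
    }
    /* In the parent, pid is the child's PID, so it can be waited for. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The parent stays alive in waitpid() while the child runs, so the child's getppid() still refers to the original parent at the moment it is checked.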