APUE 8.3 fork function

来源：互联网发布：压缩文件加密破解软件编辑：程序博客网时间：2024/06/07 10:26

The new process created by fork is called the child process. This function is called once but returns twice. The only difference in the returns is that the return value in the child is 0, whereas the return value in the parent is the process ID of the new child. The reason the child's process ID is returned to the parent is that a process can have more than one child, and there is no function that allows a process to obtain the process IDs of its children. The reason fork returns 0 to the child is that a process can have only a single parent, and the child can always call getppid to obtain the process ID of its parent. (Process ID 0 is reserved for use by the kernel, so it's not possible for 0 to be the process ID of a child.)

Both the child and the parent continue executing with the instruction that follows the call to fork. The child is a copy of the parent. For example, the child gets a copy of the parent's data space, heap, and stack. Note that this is a copy for the child; the parent and the child do not share these portions of memory. The parent and the child share the text segment (Section 7.6).

Current implementations don't perform a complete copy of the parent's data, stack, and heap, since a fork is often followed by an exec. Instead, a technique called copy-on-write (COW) is used. These regions are shared by the parent and the child and have their protection changed by the kernel to read-only. If either process tries to modify these regions, the kernel then makes a copy of that piece of memory only, typically a "page" in a virtual memory system. Section 9.2 of Bach [1986] and Sections 5.6 and 5.7 of McKusick et al. [1996] provide more detail on this feature.

The program in Figure 8.1 demonstrates the fork function, showing how changes to variables in a child process do not affect the value of the variables in the parent process.If we execute this program, we get

$ ./a.outa write to stdoutbefore forkpid = 430, glob = 7, var = 89      child's variables were changedpid = 429, glob = 6, var = 88      parent's copy was not changed$ ./a.out > temp.out$ cat temp.outa write to stdoutbefore forkpid = 432, glob = 7, var = 89before forkpid = 431, glob = 6, var = 88

#include "apue.h"int     glob = 6;       /* external variable in initialized data */char    buf[] = "a write to stdout\n";intmain(void){    int       var;      /* automatic variable on the stack */    pid_t     pid;    var = 88;    if (write(STDOUT_FILENO, buf, sizeof(buf)-1) != sizeof(buf)-1)        err_sys("write error");    printf("before fork\n");    /* we don't flush stdout */    if ((pid = fork()) < 0) {        err_sys("fork error");    } else if (pid == 0) {      /* child */        glob++;                 /* modify variables */        var++;    } else {        sleep(2);               /* parent */    }    printf("pid = %d, glob = %d, var = %d\n", getpid(), glob, var);    exit(0);}

In general, we never know whether the child starts executing before the parent or vice versa. This depends on the scheduling algorithm used by the kernel. If it's required that the child and parent synchronize, some form of interprocess communication is required. In the program shown in Figure 8.1, we simply have the parent put itself to sleep for 2 seconds, to let the child execute. There is no guarantee that this is adequate, and we talk about this and other types of synchronization in Section 8.9 when we discuss race conditions. In Section 10.16, we show how to use signals to synchronize a parent and a child after a fork.

When we write to standard output, we subtract 1 from the size of buf to avoid writing the terminating null byte.Although strlen will calculate the length of a string not including the terminating null byte, sizeof calculates the size of the buffer, which does include the terminating null byte. Another difference is that using strlen requires a function call, whereas sizeof calculates the buffer length at compile time, as the buffer is initialized with a known string, and its size is fixed.

Note the interaction of fork with the I/O functions in the program in Figure 8.1. Recall from Chapter 3 that the write function is not buffered. Because write is called before the fork, its data is written once to standard output. The standard I/O library, however, is buffered. Recall from Section 5.12 that standard output is line buffered if it's connected to a terminal device; otherwise, it's fully buffered. When we run the program interactively, we get only a single copy of the printf line, because the standard output buffer is flushed by the newline. But when we redirect standard output to a file, we get two copies of the printf line. In this second case, theprintf before the fork is called once, but the line remains in the buffer when fork is called. This buffer is then copied into the child when the parent's data space is copied to the child. Both the parent and the child now have a standard I/O buffer with this line in it. The second printf, right before the exit, just appends its data to the existing buffer. When each process terminates, its copy of the buffer is finally flushed.