"Understanding the Linux Kernel" Study Notes, Chapter 3: Processes

Source: Internet. Editor: 程序博客网. Time: 2024/04/29 19:03

3.1 Processes, Lightweight Processes, and Threads

A process is an instance of a program in execution.

From the kernel's point of view, the purpose of a process is to act as an entity to which system resources (CPU time, memory, etc.) are allocated.

Linux uses lightweight processes to offer better support for multithreaded applications.

A straightforward way to implement multithreaded applications is to associate a lightweight process with each thread.

3.2 Process Descriptor

3.2.1 Process State

As its name implies, the state field of the process descriptor describes what is currently happening to the process. It consists of an array of flags, each of which describes a possible process state. The following are the possible process states: TASK_RUNNING, TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE, TASK_STOPPED, TASK_TRACED.

Two additional states of the process can be stored both in the state field and in the exit_state field of the process descriptor; as the field name suggests, a process reaches one of these two states only when its execution is terminated: EXIT_ZOMBIE, EXIT_DEAD.
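To make the state and exit_state fields concrete, here is a minimal user-space sketch. The numeric values are modeled on a 2.6-era kernel header and should be treated as illustrative; struct task_sketch, state_name() and task_is_terminated() are hypothetical names introduced only for this example.

```c
/* Illustrative values modeled on a 2.6-era <linux/sched.h>; assumed, not quoted. */
#define TASK_RUNNING          0
#define TASK_INTERRUPTIBLE    1
#define TASK_UNINTERRUPTIBLE  2
#define TASK_STOPPED          4
#define TASK_TRACED           8
#define EXIT_ZOMBIE          16
#define EXIT_DEAD            32

/* Hypothetical, stripped-down descriptor holding only the two fields discussed. */
struct task_sketch {
    long state;
    long exit_state;
};

/* Hypothetical helper: map a single state value to a printable name. */
static const char *state_name(long state)
{
    switch (state) {
    case TASK_RUNNING:         return "TASK_RUNNING";
    case TASK_INTERRUPTIBLE:   return "TASK_INTERRUPTIBLE";
    case TASK_UNINTERRUPTIBLE: return "TASK_UNINTERRUPTIBLE";
    case TASK_STOPPED:         return "TASK_STOPPED";
    case TASK_TRACED:          return "TASK_TRACED";
    case EXIT_ZOMBIE:          return "EXIT_ZOMBIE";
    case EXIT_DEAD:            return "EXIT_DEAD";
    default:                   return "unknown";
    }
}

/* A task has terminated once exit_state holds one of the two exit states. */
static int task_is_terminated(const struct task_sketch *t)
{
    return (t->exit_state & (EXIT_ZOMBIE | EXIT_DEAD)) != 0;
}
```

Because each state occupies its own bit, a check such as task_is_terminated() can test for several states with a single mask.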

3.2.2 Identifying a Process

The strict one-to-one correspondence between the process and process descriptor makes the 32-bit address of the task_struct structure a useful means for the kernel to identify processes. These addresses are referred to as process descriptor pointers.

On the other hand, Unix-like operating systems allow users to identify processes by means of a number called the Process ID (or PID), which is stored in the pid field of the process descriptor.

Linux associates a different PID with each process or lightweight process in the system.

The identifier shared by the threads is the PID of the thread group leader, that is, the PID of the first lightweight process in the group; it is stored in the tgid field of the process descriptors.
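The difference between pid and tgid can be observed from user space on Linux. The sketch below assumes a Linux system where the gettid system call is reachable through syscall(SYS_gettid); the helper names lwp_id(), thread_group_id() and is_group_leader() are hypothetical.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Per-lightweight-process identifier (the kernel's pid field). */
static pid_t lwp_id(void)
{
    return (pid_t)syscall(SYS_gettid);
}

/* Identifier shared by the whole thread group (the kernel's tgid field);
 * this is the value getpid() reports. */
static pid_t thread_group_id(void)
{
    return getpid();
}

/* The thread group leader is the lightweight process whose pid equals its tgid. */
static int is_group_leader(void)
{
    return lwp_id() == thread_group_id();
}
```

Called from the initial thread of a program, is_group_leader() is expected to return 1; called from any additional thread of the same program, it should return 0 while thread_group_id() stays the same.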

Process descriptors handling

For each process, Linux packs two different data structures in a single per-process memory area: a small data structure linked to the process descriptor, namely the thread_info structure, and the Kernel Mode process stack.

The kernel uses the alloc_thread_info and free_thread_info macros to allocate and release the memory area storing a thread_info structure and a kernel stack.

Identifying the current process

The close association between the thread_info structure and the Kernel Mode stack just described offers a key benefit in terms of efficiency: the kernel can easily obtain the address of the thread_info structure of the process currently running on a CPU from the value of the esp register.
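The lookup works because the shared memory area is aligned to its own size, so masking the low bits of any stack address yields the base of the area, where thread_info lives. The following user-space sketch assumes an 8 KB area (THREAD_SIZE is an assumption here) and uses a fake stack pointer; thread_info_sketch, thread_info_from_sp() and demo_lookup() are hypothetical names.

```c
#include <stdint.h>
#include <stdlib.h>

#define THREAD_SIZE 8192u  /* assumed: two 4 KB page frames per process */

/* Hypothetical stand-in for thread_info. */
struct thread_info_sketch {
    void *task;  /* would point to the process descriptor */
};

/* Clear the low bits of an address inside the area to get the area's base. */
static struct thread_info_sketch *thread_info_from_sp(uintptr_t sp)
{
    return (struct thread_info_sketch *)(sp & ~((uintptr_t)THREAD_SIZE - 1));
}

/* Allocate an aligned area, pretend the stack pointer sits near its top,
 * and check that masking recovers the base. Returns 1 on success,
 * 0 on mismatch, -1 if allocation fails. */
static int demo_lookup(void)
{
    unsigned char *area = aligned_alloc(THREAD_SIZE, THREAD_SIZE);
    if (area == NULL)
        return -1;
    uintptr_t fake_sp = (uintptr_t)(area + THREAD_SIZE - 64);
    int ok = thread_info_from_sp(fake_sp) == (struct thread_info_sketch *)area;
    free(area);
    return ok;
}
```

The design choice is that no per-CPU "current process" variable has to be read from memory: one AND instruction on the stack pointer is enough.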

Doubly linked lists

The Linux kernel defines the list_head data structure, whose only fields next and prev represent the forward and back pointers of a generic doubly linked list element, respectively. It is important to note, however, that the pointers in a list_head field store the addresses of other list_head fields rather than the addresses of the whole data structures in which the list_head structure is included.
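Since the links point at embedded list_head fields, getting back to the enclosing structure requires subtracting the field's offset. Below is a minimal user-space sketch of that idea; it imitates the kernel's list_entry macro using offsetof, and struct item, init_list_head() and sum_items() are hypothetical names for the example.

```c
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Recover the enclosing structure from a pointer to its embedded list_head. */
#define list_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* An empty list is a head that points to itself in both directions. */
static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry right after head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Hypothetical payload structure with an embedded list link. */
struct item {
    int value;
    struct list_head node;
};

/* Walk the list and add up the values of the enclosing items. */
static int sum_items(struct list_head *head)
{
    int sum = 0;
    for (struct list_head *p = head->next; p != head; p = p->next)
        sum += list_entry(p, struct item, node)->value;
    return sum;
}
```

The same list_head type and the same helper functions can then serve every structure that embeds a link, which is why the kernel does not need one list implementation per data type.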

The process list

Each task_struct structure includes a tasks field of type list_head whose prev and next fields point, respectively, to the previous and to the next task_struct element.

The list of TASK_RUNNING processes

The trick used to achieve the scheduler speedup consists of splitting the runqueue into many lists of runnable processes, one list per process priority. Each task_struct descriptor includes a run_list field of type list_head.
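A rough sketch of the one-list-per-priority layout follows. It assumes 140 priority levels with 0 as the highest, and it finds the next task with a plain linear scan for simplicity; the real scheduler is understood to avoid that scan with a bitmap of non-empty lists. All names ending in _sketch, plus prio_array_init() and pick_next(), are hypothetical.

```c
#include <stddef.h>

#define MAX_PRIO 140  /* assumed number of priority levels */

struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical, trimmed descriptor. */
struct task_sketch {
    int prio;                   /* 0 is the highest priority */
    struct list_head run_list;  /* links the task into the list for its priority */
};

struct prio_array_sketch {
    struct list_head queue[MAX_PRIO];  /* one list head per priority */
};

static void prio_array_init(struct prio_array_sketch *a)
{
    for (int i = 0; i < MAX_PRIO; i++)
        a->queue[i].next = a->queue[i].prev = &a->queue[i];
}

/* Append the task to the tail of the list for its priority. */
static void enqueue_task(struct task_sketch *t, struct prio_array_sketch *a)
{
    struct list_head *head = &a->queue[t->prio];
    t->run_list.prev = head->prev;
    t->run_list.next = head;
    head->prev->next = &t->run_list;
    head->prev = &t->run_list;
}

/* Return the first task on the highest-priority non-empty list, or NULL. */
static struct task_sketch *pick_next(struct prio_array_sketch *a)
{
    for (int i = 0; i < MAX_PRIO; i++) {
        struct list_head *head = &a->queue[i];
        if (head->next != head)
            return (struct task_sketch *)
                ((char *)head->next - offsetof(struct task_sketch, run_list));
    }
    return NULL;
}
```

With this layout, choosing the next process never requires walking a single long list of all runnable processes.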

3.2.3 Relationships Among Processes

Processes created by a program have a parent/child relationship. When a process creates multiple children, these children have sibling relationships.

Furthermore, there exist other relationships among processes: a process can be a leader of a process group or of a login session, it can be a leader of a thread group, and it can also trace the execution of other processes.
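The parent/child relationship is visible from user space through the standard POSIX calls fork(), getppid() and waitpid(). The sketch below is only a small demonstration under the assumption that the parent stays alive until the child has run; child_sees_parent() is a hypothetical helper name.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child and let it verify that its parent is the calling process.
 * Returns 1 if the child confirmed it, 0 if it did not, -1 on error. */
static int child_sees_parent(void)
{
    pid_t parent = getpid();
    pid_t child = fork();

    if (child < 0)
        return -1;

    if (child == 0) {
        /* Child: report the comparison through the exit status. */
        _exit(getppid() == parent ? 0 : 1);
    }

    /* Parent: wait for the child so it does not linger as a zombie. */
    int status;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Calling fork() several times from the same parent produces children that are siblings of one another.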

3.2.4 How Processes Are Organized

Processes in a TASK_STOPPED, EXIT_ZOMBIE, or EXIT_DEAD state are not linked in specific lists.

Processes in a TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE state are subdivided into many classes, each of which corresponds to a specific event.

Wait queues

Wait queues implement conditional waits on events: a process wishing to wait for a specific event places itself in the proper wait queue and relinquishes control. Therefore, a wait queue represents a set of sleeping processes, which are woken up by the kernel when some condition becomes true.

Wait queues are implemented as doubly linked lists whose elements include pointers to process descriptors.
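As a rough user-space model of this structure, the sketch below keeps a doubly linked list of entries that each point to a task, marks a task as sleeping when it is queued, and makes every queued task runnable on wake-up. It deliberately ignores locking and scheduling, and every name in it (wait_entry, wait_queue_sketch, wq_init(), wq_sleep(), wq_wake_all()) is hypothetical.

```c
#include <stddef.h>

#define TASK_RUNNING        0
#define TASK_INTERRUPTIBLE  1

/* Hypothetical, trimmed descriptor. */
struct task_sketch {
    long state;
};

/* One element of the wait queue: a link plus a pointer to the sleeping task. */
struct wait_entry {
    struct task_sketch *task;
    struct wait_entry *next, *prev;
};

struct wait_queue_sketch {
    struct wait_entry head;  /* sentinel; its task pointer is unused */
};

static void wq_init(struct wait_queue_sketch *q)
{
    q->head.task = NULL;
    q->head.next = q->head.prev = &q->head;
}

/* Mark the task as sleeping and link its entry at the tail of the queue. */
static void wq_sleep(struct wait_queue_sketch *q, struct wait_entry *e,
                     struct task_sketch *t)
{
    t->state = TASK_INTERRUPTIBLE;
    e->task = t;
    e->prev = q->head.prev;
    e->next = &q->head;
    q->head.prev->next = e;
    q->head.prev = e;
}

/* Wake every sleeper: unlink its entry and make the task runnable.
 * Returns the number of tasks woken. */
static int wq_wake_all(struct wait_queue_sketch *q)
{
    int woken = 0;
    while (q->head.next != &q->head) {
        struct wait_entry *e = q->head.next;
        q->head.next = e->next;
        e->next->prev = &q->head;
        e->task->state = TASK_RUNNING;
        woken++;
    }
    return woken;
}
```

In this model the event itself is implicit: whoever detects that the condition has become true calls wq_wake_all() on the queue associated with that event.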

Handling wait queues

3.2.5 Process Resource Limits

Each process has an associated set of resource limits, which specify the amount of system resources it can use. These limits keep a user from overwhelming the system (its CPU, disk space, and so on).

The resource limits for the current process are stored in the current->signal->rlim field, that is, in a field of the process's signal descriptor.
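From user space, these limits can be read with the POSIX getrlimit() call, which reports a soft (current) and a hard (maximum) value for each resource. The sketch below queries the limit on open files; open_file_limits() is a hypothetical wrapper name.

```c
#include <sys/resource.h>

/* Read the soft and hard limits on the number of open files.
 * Returns 0 and fills *soft and *hard on success, -1 on failure. */
static int open_file_limits(rlim_t *soft, rlim_t *hard)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    *soft = rl.rlim_cur;  /* limit currently enforced */
    *hard = rl.rlim_max;  /* ceiling up to which the soft limit may be raised */
    return 0;
}
```

The same call works for other resources, for example RLIMIT_CPU or RLIMIT_STACK, by changing the first argument.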

3.3 Process Switch

3.3.1 Hardware Context

The set of data that must be loaded into the registers before the process resumes its execution on the CPU is called the hardware context. The hardware context is a subset of the process execution context, which includes all information needed for the process execution. In Linux, a part of the hardware context of a process is stored in the process descriptor, while the remaining part is saved in the Kernel Mode stack.

Linux 2.6 uses software to perform a process switch for the following reasons:

Step-by-step switching performed through a sequence of mov instructions allows better control over the validity of the data being loaded.
The amount of time required by the old approach and the new approach is about the same.

Process switching occurs only in Kernel Mode. The contents of all registers used by a process in User Mode have already been saved on the Kernel Mode stack before performing process switching.

3.3.2 Task State Segment

The 80x86 architecture includes a specific segment type called the Task State Segment (TSS), to store hardware contexts. Although Linux doesn't use hardware context switches, it is nonetheless forced to set up a TSS for each distinct CPU in the system. This is done for two main reasons:

When an 80x86 CPU switches from User Mode to Kernel Mode, it fetches the address of the Kernel Mode stack from the TSS.
When a User Mode process attempts to access an I/O port by means of an in or out instruction, the CPU may need to access an I/O Permission Bitmap stored in the TSS to verify whether the process is allowed to address the port.

At each process switch, the kernel updates some fields of the TSS so that the corresponding CPU's control unit may safely retrieve the information it needs.
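The most important of those updates is the Kernel Mode stack pointer. The following sketch models one TSS per CPU and refreshes its esp0 slot at each switch. The structures are heavily trimmed stand-ins rather than the hardware layout, NR_CPUS is an arbitrary value, and the names init_tss and load_esp0 are modeled on the 2.6 kernel from memory, so treat them as assumptions.

```c
#include <stdint.h>

#define NR_CPUS 4  /* hypothetical CPU count for the sketch */

/* Heavily trimmed stand-in for the hardware TSS. */
struct tss_sketch {
    uint32_t esp0;  /* Kernel Mode stack pointer loaded on a User-to-Kernel transition */
};

/* Trimmed stand-in for the per-process saved hardware context. */
struct thread_sketch {
    uint32_t esp0;  /* top of this process's Kernel Mode stack */
};

/* One TSS per CPU, not one per process. */
static struct tss_sketch init_tss[NR_CPUS];

/* At a process switch, publish the incoming process's kernel stack
 * in the TSS of the CPU that is performing the switch. */
static void load_esp0(int cpu, const struct thread_sketch *next)
{
    init_tss[cpu].esp0 = next->esp0;
}
```

Because only a few fields are rewritten per switch, the cost stays small even though the TSS is shared by every process that runs on that CPU.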

Each TSS has its own 8-byte Task State Segment Descriptor (TSSD).

The TSSDs created by Linux are stored in the Global Descriptor Table (GDT), whose base address is stored in the gdtr register of each CPU.

The thread field

Each process descriptor includes a field called thread of type thread_struct, in which the kernel saves the hardware context whenever the process is being switched out. This data structure includes fields for most of the CPU registers, except the general-purpose registers such as eax, ebx, etc., which are stored in the Kernel Mode stack.

3.3.3 Performing the Process Switch

Every process switch consists of two steps:

Switching the Page Global Directory to install a new address space.
Switching the Kernel Mode stack and the hardware context, which provides all the information needed by the kernel to execute the new process, including the CPU registers.
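The two steps can be modeled in plain C by treating the CPU registers as fields of a structure. This is only a simulation: a real switch manipulates cr3, esp and eip with assembly, and all types and function names below (cpu_sketch, switch_mm_sketch(), switch_to_sketch(), context_switch_sketch()) are hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the memory descriptor, saved context and task. */
struct mm_sketch {
    uintptr_t pgd;  /* address of the Page Global Directory */
};

struct thread_sketch {
    uintptr_t esp;  /* saved Kernel Mode stack pointer */
    uintptr_t eip;  /* saved instruction pointer */
};

struct task_sketch {
    struct mm_sketch *mm;
    struct thread_sketch thread;
};

/* Simulated CPU state. */
struct cpu_sketch {
    uintptr_t cr3, esp, eip;
    struct task_sketch *current;
};

/* Step 1: install the address space of the next process. */
static void switch_mm_sketch(struct cpu_sketch *cpu, struct task_sketch *next)
{
    cpu->cr3 = next->mm->pgd;
}

/* Step 2: save the outgoing hardware context, then load the incoming one. */
static void switch_to_sketch(struct cpu_sketch *cpu, struct task_sketch *prev,
                             struct task_sketch *next)
{
    prev->thread.esp = cpu->esp;
    prev->thread.eip = cpu->eip;
    cpu->esp = next->thread.esp;
    cpu->eip = next->thread.eip;
    cpu->current = next;
}

/* Perform both steps in order for the process currently on the CPU. */
static void context_switch_sketch(struct cpu_sketch *cpu, struct task_sketch *next)
{
    struct task_sketch *prev = cpu->current;
    switch_mm_sketch(cpu, next);
    switch_to_sketch(cpu, prev, next);
}
```

After the call, the outgoing task's thread field holds exactly what is needed to resume it later, which mirrors the role of the thread field described above.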

3.3.4 Saving and Loading the FPU, MMX, and XMM Registers

3.4 Creating Processes

3.4.1 The clone(), fork(), and vfork() System Calls

3.4.2 Kernel Threads

3.5 Destroying Processes

3.5.1 Process Termination

3.5.2 Process Removal

0 0