Understanding the Linux Kernel: Memory Addressing (Chapter 2)

Source: Internet. Published by: java quartz. Editor: 程序博客网. Time: 2024/06/11 16:16

Chapter overview:
As in the rest of the book, this chapter details how 80x86 microprocessors address memory chips and how Linux uses the available addressing circuits. The hope is that, once you learn the implementation details on Linux's most popular platform, you will better understand both the general theory of paging and how to research its implementation on other platforms.

This is the first of three chapters on memory management: Chapter 8 discusses how the kernel allocates main memory to itself, while Chapter 9 considers how linear addresses are assigned to processes.

2.1. Memory Addresses

When dealing with 80x86 microprocessors, we have to distinguish three kinds of addresses:

Logical address
Included in machine language instructions to specify the address of an operand or of an instruction. This type of address embodies the well-known 80x86 segmented architecture that forces MS-DOS and Windows programmers to divide their programs into segments. Each logical address consists of a segment and an offset (or displacement) that denotes the distance from the start of the segment to the actual address.
Linear address (also known as virtual address)
A single 32-bit unsigned integer that can address up to 4 GB, that is, up to 4,294,967,296 memory cells. Linear addresses are usually written in hexadecimal notation; their values range from 0x00000000 to 0xffffffff.
Physical address
Used to address memory cells in memory chips. Physical addresses correspond to the electrical signals sent along the address pins of the microprocessor to the memory bus. They are represented as 32-bit or 36-bit unsigned integers.

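As Figure 2-1 shows, the Memory Management Unit (MMU) translates a logical address into a physical address by means of two hardware circuits: a segmentation unit converts the logical address into a linear address, and a paging unit then converts the linear address into a physical address.
Figure 2-1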

In multiprocessor systems, all CPUs share the same memory, which means that the RAM chips may be accessed concurrently by independent CPUs. Because read and write operations on a RAM chip must be performed serially, a hardware circuit called a memory arbiter is inserted between the bus and the RAM chips. Its role is to grant access to a CPU if the chip is free and to delay it if the chip is busy servicing a request by another processor. Even uniprocessor systems use memory arbiters, because they include specialized processors called DMA controllers that operate concurrently with the CPU (see the section "Direct Memory Access (DMA)" in Chapter 13). In the case of multiprocessor systems, the structure of the arbiter is more complex because it has more input ports.

The dual Pentium, for instance, maintains a two-port arbiter at each chip entrance and requires that the two CPUs exchange synchronization messages before attempting to use the common bus.

From the programming point of view, the arbiter is hidden because it is managed by hardware circuits.

2.2. Segmentation in Hardware

This section covers segmentation in hardware; the notes for it are kept in a separate post.
Link: http://blog.csdn.net/feather_wch/article/details/50704862

2.3. Segmentation in Linux

Segmentation has been included in 80x86 microprocessors to encourage programmers to split their applications into logically related entities, such as subroutines or global and local data areas. However, Linux uses segmentation in a very limited way. In fact, segmentation and paging are somewhat redundant, because both can be used to separate the physical address spaces of processes: segmentation can assign a different linear address space to each process, while paging can map the same linear address space into different physical address spaces. Linux prefers paging to segmentation for the following reasons:

Memory management is simpler when all processes use the same segment register values, that is, when they share the same set of linear addresses.
One of the design objectives of Linux is portability to a wide range of architectures; RISC architectures in particular have limited support for segmentation.

The 2.6 version of Linux uses segmentation only when required by the 80x86 architecture.

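All Linux processes running in User Mode use the same pair of segments to address instructions and data. These are called the user code segment and the user data segment. Similarly, all Linux processes running in Kernel Mode use the same pair of segments: the kernel code segment and the kernel data segment.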

The following table shows the values of the Segment Descriptor fields for these four crucial segments:

Segment      Base        G  Limit    S  Type  DPL  D/B  P
user code    0x00000000  1  0xfffff  1  10    3    1    1
user data    0x00000000  1  0xfffff  1  2     3    1    1
kernel code  0x00000000  1  0xfffff  1  10    0    1    1
kernel data  0x00000000  1  0xfffff  1  2     0    1    1

The corresponding Segment Selectors are defined by the macros __USER_CS, __USER_DS, __KERNEL_CS, and __KERNEL_DS. To address the kernel code segment, for instance, the kernel just loads the value yielded by the __KERNEL_CS macro into the cs segmentation register.

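Because every one of these Linux segments starts at 0x00000000, logical addresses coincide with linear addresses under Linux: the offset of a logical address always equals the value of the corresponding linear address.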

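As stated earlier, the Current Privilege Level (CPL) of the CPU indicates whether the processor is in User or Kernel Mode and is specified by the RPL field of the Segment Selector stored in the cs register. Whenever the CPL changes, some segmentation registers must be updated accordingly. For instance, when the CPL is 3 (User Mode), the ds register must contain the Segment Selector of the user data segment; when the CPL is 0, the ds register must contain the Segment Selector of the kernel data segment.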

A similar situation occurs for the ss register. It must refer to a User Mode stack inside the user data segment when the CPL is 3, and it must refer to a Kernel Mode stack inside the kernel data segment when the CPL is 0. When switching from User Mode to Kernel Mode, Linux always makes sure that the ss register contains the Segment Selector of the kernel data segment.

2.3.1. The Linux GDT

A uniprocessor system has only one GDT, while a multiprocessor system has one GDT for every CPU. All GDTs are stored in the cpu_gdt_table array, while the addresses and sizes of the GDTs (used when initializing the gdtr registers) are stored in the cpu_gdt_descr array. If you look in the Source Code Index, you can see that these symbols are defined in the file arch/i386/kernel/head.S. Every macro, function, and other symbol in this book is listed in the Source Code Index at the back of the book, so you can quickly find it in the source code.

The layout of the GDTs is shown schematically in Figure 2-6.
Figure 2-6
Each GDT includes 18 segment descriptors and 14 null, unused, or reserved entries. Unused entries are inserted on purpose so that Segment Descriptors usually accessed together are kept in the same 32-byte line of the hardware cache (see the section “Hardware Cache” later in this chapter).

The 18 segment descriptors included in each GDT point to the following segments:

Four user and kernel code and data segments
A Task State Segment (TSS), different for each processor in the system.
- The linear address space corresponding to a TSS is a small subset of the linear address space corresponding to the kernel data segment. The TSSs are sequentially stored in the init_tss array; in particular, the Base field of the TSS descriptor for the nth CPU points to the nth component of the init_tss array. The G (granularity) flag is cleared, while the Limit field is set to 0xeb, because the TSS segment is 236 bytes long. The Type field is set to 9 or 11 (available 32-bit TSS), and the DPL is set to 0, because only Kernel Mode code may access a TSS. You will find details on how Linux uses TSSs in the section “Task State Segment” in Chapter 3.
A segment including the default Local Descriptor Table (LDT), usually shared by all processes
Three Thread-Local Storage (TLS) segments: this is a mechanism that allows multithreaded applications to make use of up to three segments containing data local to each thread. The set_thread_area() and get_thread_area() system calls, respectively, set up and read back a TLS segment descriptor for the executing process.
Three segments related to Advanced Power Management (APM): the BIOS code makes use of segments, so when the Linux APM driver invokes BIOS functions to get or set the status of APM devices, it may use custom code and data segments.
Five segments related to Plug and Play (PnP) BIOS services. As in the previous case, the BIOS code makes use of segments, so when the Linux PnP driver invokes BIOS functions to detect the resources used by PnP devices, it may use custom code and data segments.
A special TSS segment used by the kernel to handle “Double fault” exceptions (see “Exceptions” in Chapter 4).

As stated earlier, there is a copy of the GDT for each processor in the system. All copies of the GDT store identical entries, except for a few cases.

First, each processor has its own TSS segment, thus the corresponding GDT’s entries differ.
Moreover, a few entries in the GDT may depend on the process that the CPU is executing (LDT and TLS Segment Descriptors).
Finally, in some cases a processor may temporarily modify an entry in its copy of the GDT; this happens, for instance, when invoking an APM's BIOS procedure.

2.3.2. The Linux LDTs

Most Linux User Mode applications do not make use of a Local Descriptor Table, thus the kernel defines a default LDT to be shared by most processes. The default LDT is stored in the default_ldt array.

It includes five entries, but only two of them are effectively used by the kernel: a call gate for iBCS executables, and a call gate for Solaris/x86 executables (see the section “Execution Domains” in Chapter 20).

0 0