February 25th, Sunday (一月 二十五日 日曜日)

  What is the memory layout of a Linux program?  When a program is loaded into memory, each .section is loaded into its own
region of memory.  All of the code and data declared in a given section are brought together, even if they were separated in the
source code.

  On 32-bit x86 Linux, the actual instructions (the .text section) are loaded at address 0x08048000.  The .data section is loaded
immediately after that, followed by the .bss section.

  The last byte that can be addressed on Linux is location 0xbfffffff.  Linux starts the stack here and grows it downward
toward the other sections.  Between them is a huge gap.  The initial layout of the stack is as follows: at the bottom of the
stack (the highest address) there is a word of memory that is zero.  After that comes the null-terminated name of the program,
in ASCII characters.  After the program name come the program's environment variables.  Then come the program's command-line
arguments.

  A program's data region starts at the bottom of memory and goes up.  The stack starts at the top of memory and moves downward
with each push.  The middle part between the stack and the program's data sections is inaccessible memory: you are not allowed
to access it until you tell the kernel that you need it.  If you try, you will get an error, a "segmentation fault".  The same will
happen if you try to access data before the beginning of the program, 0x08048000.  The last memory address accessible to the program
is called the system break (also called the current break or just the break).

0xbfffffff
+--------------------------+
|  Environment Variables   |
+--------------------------+
|  ...                     |
+--------------------------+
|  Arg #2                  |
+--------------------------+
|  Arg #1                  |
+--------------------------+
|  Program name            |
+--------------------------+
|  # of arguments          | <- %esp
+--------------------------+
|                          |
|  Unmapped Memory         |
|                          |
+--------------------------+ <- Break
|  Program Code            |
|  and Data                |
+--------------------------+
0x08048000
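
  To make the picture concrete, here is a minimal Python sketch that sorts a 32-bit address into the regions of the diagram.  The
two boundary constants come from the text; the function name classify and the sample break and %esp values are made up for
illustration.  It also ignores the fact that Linux grows the stack automatically, so treat it as a rough model only.

```python
# Rough model of the layout diagram; not a description of real kernel behavior.
TEXT_START = 0x08048000   # where the program is loaded (from the text)
STACK_TOP = 0xBFFFFFFF    # last addressable byte (from the text)


def classify(addr, brk, esp):
    """Return the region of the diagram that addr falls into.

    brk and esp are hypothetical snapshots of the break and the stack pointer.
    """
    if addr < TEXT_START:
        return "below program: segmentation fault"
    if addr <= brk:
        return "program code and data"
    if addr < esp:
        return "unmapped gap: segmentation fault"
    if addr <= STACK_TOP:
        return "stack"
    return "above stack: not addressable"


if __name__ == "__main__":
    brk = 0x0804A000   # made-up break
    esp = 0xBFFFF000   # made-up stack pointer
    for addr in (0x00001000, 0x08048100, 0x40000000, 0xBFFFF800):
        print(hex(addr), "->", classify(addr, brk, esp))
```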

  Each program gets its own sandbox to play in.  Every program running on your computer thinks that it was loaded at memory address
0x08048000, and that its stack starts at 0xbfffffff.  When Linux loads a program, it finds a section of unused memory, and then tells
the processor to use that section of memory as the address 0x08048000 for this program.  The address that a program believes it uses
is called the virtual address, while the actual address on the chips that it refers to is called the physical address.  The process
of assigning virtual addresses to physical addresses is called "mapping".
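
  A small Python sketch of the idea: two programs use the same virtual address, yet end up at different physical addresses.  The
process names and physical base addresses below are invented, and real mappings are not one contiguous block per program, so this
only illustrates the virtual/physical distinction.

```python
VIRT_BASE = 0x08048000  # virtual load address every program sees (from the text)

# Hypothetical physical locations the kernel might have picked for each program.
physical_base = {"editor": 0x00400000, "shell": 0x01A00000}


def translate(process, vaddr):
    """Translate a virtual address of `process` into its physical address."""
    return physical_base[process] + (vaddr - VIRT_BASE)


if __name__ == "__main__":
    for name in physical_base:
        print(name, hex(VIRT_BASE), "->", hex(translate(name, VIRT_BASE)))
```

  Both programs ask for 0x08048000, but translate returns a different physical address for each of them.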

  Why is the memory between the .bss and the stack inaccessible?  The reason is that this region of virtual memory addresses hasn't
been mapped onto physical memory addresses.  The mapping process takes up considerable time and space, so if every possible virtual
address of every possible program were mapped, you would not have enough physical memory to even run one program.  So, the break is
the beginning of the area that contains unmapped memory.  With the stack, however, Linux will automatically map in memory that is
accessed from stack pushes.

  Virtual memory can be mapped to more than just physical memory; it can be mapped to disk as well.  Swap partitions allow
Linux's virtual memory system to map memory to disk blocks in addition to physical RAM.

  Here is an overview of the way memory accesses are handled under Linux:

  . The program tries to load memory from a virtual address.
  . The processor, using tables supplied by Linux, transforms the virtual memory address into a physical memory address on the fly.
  . If the processor does not have a physical address listed for the memory address, it sends a request to Linux to load it.
  . Linux looks at the address.  If it is mapped to a disk location, it continues on to the next step.  Otherwise, it terminates the
    program with a segmentation fault error.
  . If there is not enough room to load the memory from disk, Linux will move another part of the program or another program onto disk
    to make room.
  . Linux then moves the data into a free physical memory address.
  . Linux updates the processor's virtual-to-physical memory mapping tables to reflect the changes.
  . Linux restores control to the program, causing it to re-issue the instruction which caused this process to happen.
  . The processor can now handle the instruction using the newly-loaded memory and translation tables.
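
  The steps above can be played through with a toy model in Python.  Everything here is an assumption made for illustration: the
class ToyMemory, its fields, and the "evict the oldest page" rule are not how the kernel is implemented; they only follow the order
of the bullets.

```python
from collections import OrderedDict


class SegmentationFault(Exception):
    """Raised when a page has neither a physical frame nor a disk location."""


class ToyMemory:
    """Toy model of the bullets above; memory is counted in whole pages."""

    def __init__(self, num_frames, valid_pages):
        self.free_frames = list(range(num_frames))
        self.page_table = OrderedDict()      # page -> frame, oldest first
        self.disk_backed = set(valid_pages)  # pages that have a disk location
        self.faults = 0

    def access(self, page):
        # Bullets 1-2: the "processor" looks the page up in its table.
        if page in self.page_table:
            return self.page_table[page]
        # Bullet 3: no physical address listed, so the "kernel" is asked.
        self.faults += 1
        # Bullet 4: a page without a disk location is an invalid access.
        if page not in self.disk_backed:
            raise SegmentationFault(f"page {page} is not mapped")
        # Bullet 5: no room left, so the oldest resident page goes to disk.
        if not self.free_frames:
            _victim, frame = self.page_table.popitem(last=False)
            self.free_frames.append(frame)
        # Bullets 6-7: load into a free frame and update the table.
        self.page_table[page] = self.free_frames.pop()
        # Bullets 8-9: re-issue the access, which now succeeds.
        return self.access(page)


if __name__ == "__main__":
    mem = ToyMemory(num_frames=2, valid_pages={0, 1, 2})
    for page in (0, 1, 0, 2):
        print("page", page, "-> frame", mem.access(page))
    print("faults:", mem.faults)
```

  With only two frames, touching a third page pushes the oldest one back to disk, and touching a page that was never valid raises
SegmentationFault.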

  Now, in order to make the process more efficient, memory is separated out into groups called "pages".  When running Linux on x86
processors, a page is 4096 bytes of memory.  All of the memory mappings are done a page at a time.  Physical memory assignment, swapping,
mapping, etc.  are all done to memory pages instead of individual memory addresses. 
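
  The page arithmetic is easy to check in Python.  The 4096-byte page size comes from the text; the helper names split_address and
pages_needed are made up for this sketch.

```python
PAGE_SIZE = 4096  # bytes per page on x86 Linux (from the text)


def split_address(addr):
    """Return (page number, offset inside the page) for an address."""
    return divmod(addr, PAGE_SIZE)


def pages_needed(num_bytes):
    """Return how many whole pages are needed to hold num_bytes."""
    return (num_bytes + PAGE_SIZE - 1) // PAGE_SIZE


if __name__ == "__main__":
    page, offset = split_address(0x08048123)
    print(hex(page), hex(offset))
    print(pages_needed(4097))
```

  For example, address 0x08048123 lies in page 0x8048 at offset 0x123, and a request for 4097 bytes occupies two pages, since
mapping is done a whole page at a time.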
