PE

Source: Internet  Published: java全文检索  Editor: 程序博客网  Time: 2024/04/26 21:41

Tutorial 1: Overview of PE file format

PE stands for Portable Executable. It's the native file format of Win32. Its specification is derived in part from the Unix COFF (Common Object File Format). "Portable executable" means that the file format is universal across Win32 platforms: the PE loader on every Win32 platform recognizes and uses this file format, even when Windows is running on CPU platforms other than Intel. It does not mean that your PE executables can be ported to other CPU platforms without change. Every Win32 executable (except VxDs and 16-bit DLLs) uses the PE file format. Even NT's kernel-mode drivers use the PE file format. Studying the PE file format therefore gives you valuable insight into the structure of Windows.

DOS MZ header
DOS stub
PE header
Section table
Section 1
Section 2
Section ...
Section n

The list above shows the general layout of a PE file. All PE files (even 32-bit DLLs) must start with a simple DOS MZ header. We usually aren't much interested in this structure. It is provided for the case when the program is run from DOS, so that DOS can recognize the file as a valid executable and run the DOS stub, which is stored right after the MZ header. The DOS stub is actually a valid EXE that is executed when the operating system doesn't know about the PE file format. It can simply display a string like "This program requires Windows", or it can be a full-blown DOS program, depending on the intent of the programmer. We are also not very interested in the DOS stub: it's usually provided by the assembler/compiler. In most cases, it simply uses int 21h, service 9 to print a string saying "This program cannot run in DOS mode".
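To make the role of the DOS MZ header concrete, here is a minimal Python sketch of the one thing a PE-aware reader usually needs from it: the file offset of the PE header. It assumes the standard layout, in which the 32-bit little-endian field `e_lfanew` sits at offset 0x3C of the MZ header and the PE header begins with the signature `PE\0\0`. The function name `find_pe_header_offset` is hypothetical and chosen only for illustration.

```python
import struct

def find_pe_header_offset(data: bytes) -> int:
    """Return the file offset of the PE header, or raise ValueError."""
    # The DOS MZ header is 0x40 bytes long and starts with the ASCII bytes "MZ".
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a DOS MZ executable")
    # e_lfanew (assumed at offset 0x3C) holds the file offset of the PE header.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # The PE header begins with the 4-byte signature "PE\0\0".
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    return e_lfanew
```

Everything between the end of the MZ header and the returned offset is the DOS stub, which this sketch simply skips over.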

If we view the PE file format as a logical disk, the PE header as the boot sector and the sections as files, we still don't have enough information to find out where the files reside on the disk, i.e. we haven't yet discussed the directory equivalent of the PE file format. Immediately following the PE header is the section table, which is an array of structures. Each structure contains information about one section in the PE file, such as its attributes, its file offset and its virtual offset. If there are 5 sections in the PE file, there will be exactly 5 members in this structure array. We can then view the section table as the root directory of the logical disk. Each member of the array is equivalent to a directory entry in the root directory.
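As an illustration of the "array of structures" idea, the following Python sketch reads a section table into a list of records. It assumes the standard 40-byte section header layout (8-byte name, virtual size, virtual offset, raw size, raw offset, three relocation/line-number fields that are ignored here, then the characteristics). The names `Section` and `read_section_table` are hypothetical, and the caller is assumed to already know the table's file offset and the number of sections.

```python
import struct
from typing import List, NamedTuple

# One section header: 8-byte name, then six 32-bit fields,
# two 16-bit fields and a final 32-bit characteristics field (40 bytes total).
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")

class Section(NamedTuple):
    name: str
    virtual_size: int
    virtual_offset: int
    raw_size: int
    raw_offset: int
    characteristics: int

def read_section_table(data: bytes, table_offset: int, count: int) -> List[Section]:
    """Read `count` consecutive section headers starting at `table_offset`."""
    sections = []
    for i in range(count):
        fields = SECTION_HEADER.unpack_from(data, table_offset + i * SECTION_HEADER.size)
        name, vsize, voffset, rsize, roffset = fields[:5]
        characteristics = fields[9]
        # Section names are padded with NUL bytes up to 8 characters.
        text = name.rstrip(b"\0").decode("ascii", "replace")
        sections.append(Section(text, vsize, voffset, rsize, roffset, characteristics))
    return sections
```

In the disk analogy, each `Section` record plays the part of one directory entry: it says what the "file" is called, where it lives in the PE file, and where it should appear in memory.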

The section table of this file looks like the following:



Section  Virtual Size  Virtual Offset  Raw Size  Raw Offset  Characteristics
.text    0000024C      00001000        00000400  00000400    60000020
.rdata   000001DC      00002000        00000200  00000800    40000040
.data    000000E0      00003000        00000200  00000A00    C0000040
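The Characteristics column is a bit field. The short Python sketch below decodes the three values from the table, assuming the standard PE section flag values (0x20 code, 0x40 initialized data, 0x80 uninitialized data, 0x20000000 execute, 0x40000000 read, 0x80000000 write). Only this handful of flags is covered, and the helper name `describe` is hypothetical.

```python
# A few section flags (a subset; other flag bits exist and are ignored here).
FLAGS = {
    0x00000020: "code",
    0x00000040: "initialized data",
    0x00000080: "uninitialized data",
    0x20000000: "execute",
    0x40000000: "read",
    0x80000000: "write",
}

def describe(characteristics: int) -> list:
    """Return the names of the known flags that are set in `characteristics`."""
    return [name for bit, name in FLAGS.items() if characteristics & bit]

for section, value in [(".text", 0x60000020),
                       (".rdata", 0x40000040),
                       (".data", 0xC0000040)]:
    print(section, describe(value))
# .text ['code', 'execute', 'read']
# .rdata ['initialized data', 'read']
# .data ['initialized data', 'read', 'write']
```

So in this file .text is executable and readable code, .rdata is read-only data, and .data is data that can be both read and written.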

That's all about the physical layout of the PE file format. I'll summarize the major steps in loading a PE file into memory below:

  1. When the PE file is run, the PE loader examines the DOS MZ header for the offset of the PE header. If found, it skips to the PE header.
  2. The PE loader checks if the PE header is valid. If so, it goes to the end of the PE header.
  3. Immediately following the PE header is the section table. The PE loader reads the information about the sections and maps those sections into memory using file mapping. It also gives each section the attributes specified in the section table.
  4. After the PE file is mapped into memory, the PE loader concerns itself with the logical parts of the PE file, such as the import table.
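Steps 1 to 3 above can be sketched in Python as follows. This is a simplified illustration, not how the Windows loader is actually implemented: it copies bytes into a `bytearray` instead of using file mapping, ignores section attributes and alignment, and stops before step 4 (imports and other logical parts). It assumes the standard header layout: `e_lfanew` at offset 0x3C, a 20-byte file header after the `PE\0\0` signature with the section count at +6 and the optional-header size at +20, and 40-byte section headers. The name `map_pe_image` is hypothetical.

```python
import struct

def map_pe_image(data: bytes) -> bytearray:
    """Lay out the sections of a PE file at their virtual offsets."""
    # Step 1: use the DOS MZ header to find the PE header.
    if data[:2] != b"MZ":
        raise ValueError("not a DOS MZ executable")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)

    # Step 2: check the PE header, then go to its end.
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("invalid PE header")
    (num_sections,) = struct.unpack_from("<H", data, pe_offset + 6)
    (optional_size,) = struct.unpack_from("<H", data, pe_offset + 20)
    table_offset = pe_offset + 24 + optional_size

    # Step 3: read the section table.
    sections = []
    for i in range(num_sections):
        # Skip the 8-byte name; read the four size/offset fields.
        vsize, voffset, rsize, roffset = struct.unpack_from(
            "<IIII", data, table_offset + 40 * i + 8)
        size = vsize or rsize  # fall back to raw size if virtual size is 0
        sections.append((voffset, size, roffset, rsize))

    # "Map" each section: copy its raw bytes to its virtual offset.
    image = bytearray(max((v + s for v, s, _, _ in sections), default=0))
    for voffset, size, roffset, rsize in sections:
        chunk = data[roffset:roffset + min(size, rsize)]
        image[voffset:voffset + len(chunk)] = chunk
    return image
```

Any part of a section beyond its raw data stays zero-filled in the returned image, which mirrors the case where a section's virtual size is larger than its raw size.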

