How Computers Boot Up -- 计算机是如何启动的


The previous post described motherboards and the memory map in Intel computers to set the scene for the initial phases of boot. Booting is an involved, hacky, multi-stage affair – fun stuff. Here’s an outline of the process:

前篇文章我们阐述的是Intel PC机主板布局以及内存映射知识,为本篇计算机引导过程奠定了基础。 计算机的引导过程一般比较复杂,软件上使用了很多技巧,并且通常是通过多个阶段协调工作得以完成,总之是个很有趣的过程。 这里有计算机引导过程的简图:



An outline of the boot sequence


Things start rolling when you press the power button on the computer (no! do tell!). Once the motherboard is powered up it initializes its own firmware – the chipset and other tidbits – and tries to get the CPU running. If things fail at this point (e.g., the CPU is busted or missing) then you will likely have a system that looks completely dead except for rotating fans. A few motherboards manage to emit beeps for an absent or faulty CPU, but the zombie-with-fans state is the most common scenario in my experience. Sometimes USB or other devices can cause this to happen: unplugging all non-essential devices is a possible cure for a system that was working and suddenly appears dead like this. You can then single out the culprit device by elimination.

当我们按下计算机的电源按钮,计算机就开始运转了(当然,现在别按^_^)。 一旦主板上电后,它就初始化自身的固件,初始化芯片组和其他组件,并且尝试启动CPU。 如果在这个阶段失败了(也就是说没有检测到CPU或者CPU损坏了),那么计算机除了风扇在转动外,整个系统是完全没有响应的。 一些主板会因为没有检测到CPU或是发现CPU有故障时发出蜂鸣声,以示警告,但是以我的经验,大部分计算机都成僵死状态,并无警报声。 有的时候,USB设备或者其它相关设备也会引起计算机的这种僵死状态:当计算机工作得好好的,突然呈现这种僵死状态时,可以尝试拔掉所有的非必要设备,也许可以解决问题。 你也可以一一拔掉这些设备,从而找到引起计算机故障的设备。


If all is well the CPU starts running. In a multi-processor or multi-core system one CPU is dynamically chosen to be the bootstrap processor (BSP) that runs all of the BIOS and kernel initialization code. The remaining processors, called application processors (AP) at this point, remain halted until later on when they are explicitly activated by the kernel. Intel CPUs have been evolving over the years but they’re fully backwards compatible, so modern CPUs can behave like the original 1978 Intel 8086, which is exactly what they do after power up. In this primitive power up state the processor is in real mode with memory paging disabled. This is like ancient MS-DOS where only 1 MB of memory can be addressed and any code can write to any place in memory – there’s no notion of protection or privilege.

如果一切正常,CPU就开始运行了。在一个多处理器或者多核系统中,会动态地选择一个CPU作为自引导处理器(BSP)去运行BIOS代码以及内核初始化代码。 其它的CPU,此时被称为应用处理器(AP),依然处于停止状态,直到之后内核显式地激活它们。 Intel CPU尽管已经发展多年了,但是它们完全是向后兼容的,所以现代的CPU在机器上电之后的行为依然可以像1978年的8086处理器一样。 在上电启动后,CPU处于实模式,并且分页功能是禁止的。 这有点像曾经的MS-DOS系统,只能访问1M的物理内存,并且程序可以读写内存的任何地方 -- 可见那时根本没有保护和特权级的概念。
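
The 1 MB limit follows from how real mode forms addresses: a 16-bit segment value is shifted left by four bits and added to a 16-bit offset. Here is a minimal Python sketch of that arithmetic. The function name is made up for illustration, and the 20-bit wrap models the original 8086, which has only 20 address lines.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Physical address the original 8086 forms from a segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset are 16-bit values")
    # Shift the segment left by 4 bits (multiply by 16), add the offset,
    # and keep only the 20 address bits the 8086 can drive.
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(real_mode_address(0x1234, 0x0010)))  # 0x12350
print(hex(real_mode_address(0x1235, 0x0000)))  # 0x12350, same byte via a different pair
print(hex(real_mode_address(0xFFFF, 0x000F)))  # 0xfffff, last byte of the first megabyte
```

Many different segment:offset pairs name the same byte, and no pair reaches past the first megabyte on the original chip.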


Most registers in the CPU have well-defined values after power up, including the instruction pointer (EIP) which holds the memory address for the instruction being executed by the CPU. Intel CPUs use a hack whereby even though only 1MB of memory can be addressed at power up, a hidden base address (an offset, essentially) is applied to EIP so that the first instruction executed is at address 0xFFFFFFF0 (16 bytes short of the end of 4 gigs of memory and well above one megabyte). This magical address is called the reset vector and is standard for modern Intel CPUs.

在计算机上电之后,CPU内部的寄存器都被初始化为相应的值,包括指令指针寄存器(EIP),CPU就是依据该寄存器存储的地址值来运行指令的。 上电后,尽管CPU处在实模式,只能访问1M的物理内存,但是执行的第一条指令地址却在0xFFFFFFF0处(离4G内存末端仅16字节,远超过1M内存范围),这是因为CS段寄存器的段描述符缓冲部分中的基地址在上电后或者重启后的初始值是0xFFFF0000,EIP被初始化为0xFFF0,两个值相加就得到了0xFFFFFFF0这个地址。 这个特殊的地址被称作复位向量,并且已经成为现代CPU的标准。
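
The arithmetic behind the reset vector fits in a few lines of Python. The two power-up values used here (a hidden code-segment base of 0xFFFF0000 and an instruction pointer of 0xFFF0) are the ones given in the translated paragraph above; the constant names are invented for this example.

```python
CS_HIDDEN_BASE = 0xFFFF0000  # hidden base address of the code segment at power up
IP_AT_RESET = 0xFFF0         # instruction pointer at power up
FOUR_GIB = 1 << 32
ONE_MIB = 1 << 20

# The first instruction is fetched from base + instruction pointer.
reset_vector = CS_HIDDEN_BASE + IP_AT_RESET

print(hex(reset_vector))        # 0xfffffff0
print(FOUR_GIB - reset_vector)  # 16, i.e. 16 bytes short of the 4 GB mark
print(reset_vector > ONE_MIB)   # True, well above the 1 MB real-mode range
```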


The motherboard ensures that the instruction at the reset vector is a jump to the memory location mapped to the BIOS entry point. This jump implicitly clears the hidden base address present at power up. All of these memory locations have the right contents needed by the CPU thanks to the memory map kept by the chipset. They are all mapped to flash memory containing the BIOS since at this point the RAM modules have random crap in them. An example of the relevant memory regions is shown below:

主板可以确保复位向量中保存的是一个跳转指令,该指令跳转到BIOS执行入口点所在的内存映射地址。 在该跳转指令执行的同时也会隐式地清除CS段寄存器中隐藏的基地址(这个基地址是在上电阶段初始化的)。 当然,也多亏了芯片组中的内存映射表,才使得这些内存地址处保存着CPU期望的内容。 这些地址都被映射到BIOS所在的flash内存中,这是因为在这个阶段,真实的RAM模块中都是些随机垃圾值。 下图展现的是相关的内存区域:

Important memory regions during boot


The CPU then starts executing BIOS code, which initializes some of the hardware in the machine. Afterwards the BIOS kicks off the Power-on Self Test (POST) which tests various components in the computer. Lack of a working video card fails the POST and causes the BIOS to halt and emit beeps to let you know what’s wrong, since messages on the screen aren’t an option. A working video card takes us to a stage where the computer looks alive: manufacturer logos are printed, memory starts to be tested, angels blare their horns. Other POST failures, like a missing keyboard, lead to halts with an error message on the screen. The POST involves a mixture of testing and initialization, including sorting out all the resources – interrupts, memory ranges, I/O ports – for PCI devices. Modern BIOSes that follow the Advanced Configuration and Power Interface build a number of data tables that describe the devices in the computer; these tables are later used by the kernel.

紧接着,CPU开始执行BIOS代码,初始化机器中的一些硬件。 之后BIOS开始执行开机自检程序,用来检测计算机中的各部分组件。 如果检测到没有显卡的话,那么BIOS指令就会停止运行,并且发出蜂鸣声告诉我们出错了,对没有显卡这样的错误是无法容忍的,因为在显示器上显示信息是必须的。因为如果显卡存在并且可以正常工作,我们就可以轻易地根据显示在显示器上的信息知道计算机的活动:打印生产厂商的商标图案,显示正在检测内存等等信息。 其它的检测错误,譬如没有检测到键盘,也会导致计算机停止运行,并且在显示器上打印相应的错误信息。 开机自检是对计算机复杂的检测和初始化的过程,其中也包括为各种PCI设备设置系统资源 -- 设置中断号,分配内存区域,以及设置IO端口号。 现如今的BIOS都遵循高级配置与电源接口协议(ACPI)创建描述设备的数据表格,这些表格数据会被之后启动的内核使用。


After the POST the BIOS wants to boot up an operating system, which must be found somewhere: hard drives, CD-ROM drives, floppy disks, etc. The actual order in which the BIOS seeks a boot device is user configurable. If there is no suitable boot device the BIOS halts with a complaint like “Non-System Disk or Disk Error.” A dead hard drive might present with this symptom. Hopefully this doesn’t happen and the BIOS finds a working disk allowing the boot to proceed.

自检完并且一切正常之后,BIOS就可以引导一个操作系统了,当然该操作系统应该存在于某个存储介质上:硬盘,CD光盘,软盘等等。 用户是可以设置BIOS寻找引导设备的顺序的。 当没有检测到可用的引导设备时,BIOS就会停止运行,并且向用户抱怨“没有系统引导设备或者引导设备损坏”。 譬如,当硬盘出现故障时,就会导致此类错误。 当一切顺利,BIOS会找到相应的引导设备,继续运行。
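
The search for a boot device can be sketched as a loop over the configured order. This is only an illustration of the behavior described above, not real firmware logic: `find_boot_sector` and the `read_first_sector` callback are hypothetical names, and "suitable" is reduced to "the callback returned a sector".

```python
from typing import Callable, Optional


def find_boot_sector(
    boot_order: list[str],
    read_first_sector: Callable[[str], Optional[bytes]],
) -> tuple[str, bytes]:
    """Return the first device in the configured order that yields a boot sector."""
    for device in boot_order:
        sector = read_first_sector(device)  # None stands for "nothing bootable here"
        if sector is not None:
            return device, sector
    # No suitable device at all: the BIOS halts with a complaint.
    raise RuntimeError("Non-System Disk or Disk Error")


# Made-up machine: empty floppy and CD-ROM drives, one working hard disk.
fake_media = {"floppy": None, "cdrom": None, "hdd0": bytes(512)}
device, sector = find_boot_sector(["floppy", "cdrom", "hdd0"], fake_media.get)
print(device, len(sector))  # hdd0 512
```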


The BIOS now reads the first 512-byte sector (sector zero) of the hard disk. This is called the Master Boot Record and it normally contains two vital components: a tiny OS-specific bootstrapping program at the start of the MBR followed by a partition table for the disk. The BIOS however does not care about any of this: it simply loads the contents of the MBR into memory location 0x7c00 and jumps to that location to start executing whatever code is in the MBR.

现在BIOS读取硬盘的第一扇区,大小512字节。 此扇区也被称为主引导扇区,一般由两个关键的部分组成:开始是一个操作系统的自举程序,紧接着该程序是该硬盘的分区表。 BIOS是不管主引导扇区里是什么数据的,它仅仅要做的是加载主引导扇区的数据到内存的0x7c00地址处,接着跳转到该地址运行引导扇区上的指令。
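
As a rough sketch, the following Python models the two things described above: copying the 512-byte sector to 0x7c00 and the layout inside the sector. The 446/64/2 split (bootstrap code, partition table, two-byte signature) is the conventional MBR layout. The function names and the simulated `memory` array are assumptions made for the example.

```python
SECTOR_SIZE = 512
BOOT_CODE_SIZE = 446           # bootstrap program at the start of the MBR
PARTITION_TABLE_SIZE = 64      # partition table that follows it
BOOT_SIGNATURE = b"\x55\xaa"   # the two bytes that conventionally end an MBR
LOAD_ADDRESS = 0x7C00


def split_mbr(sector: bytes) -> tuple[bytes, bytes, bytes]:
    """Split a raw sector into (boot code, partition table, signature)."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly one 512-byte sector")
    boot_code = sector[:BOOT_CODE_SIZE]
    table = sector[BOOT_CODE_SIZE:BOOT_CODE_SIZE + PARTITION_TABLE_SIZE]
    signature = sector[-2:]
    return boot_code, table, signature


def load_mbr(memory: bytearray, sector: bytes) -> int:
    """Copy the sector into simulated memory at 0x7c00 and return the jump target."""
    memory[LOAD_ADDRESS:LOAD_ADDRESS + len(sector)] = sector
    return LOAD_ADDRESS


# A blank sector that carries only the signature, loaded into 1 MB of fake RAM.
sector = bytes(510) + BOOT_SIGNATURE
memory = bytearray(1 << 20)
entry = load_mbr(memory, sector)
boot_code, table, signature = split_mbr(sector)
print(hex(entry), len(boot_code), len(table), signature == BOOT_SIGNATURE)
# 0x7c00 446 64 True
```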


Master Boot Record

(译者注: 其实扇区是从1开始编号的)


The specific code in the MBR could be a Windows MBR loader, code from Linux loaders such as LILO or GRUB, or even a virus. In contrast the partition table is standardized: it is a 64-byte area with four 16-byte entries describing how the disk has been divided up (so you can run multiple operating systems or have separate volumes in the same disk). Traditionally Microsoft MBR code takes a look at the partition table, finds the (only) partition marked as active, loads the boot sector for that partition, and runs that code. The boot sector is the first sector of a partition, as opposed to the first sector for the whole disk. If something is wrong with the partition table you would get messages like “Invalid Partition Table” or “Missing Operating System.” Such a message does not come from the BIOS but rather from the MBR code loaded from disk. Thus the specific message depends on the MBR flavor.

MBR中的代码可以是Windows的引导装载程序,也可以是linux的引导加载程序(譬如我们熟知的LILO或者GRUB),甚至也可能是个病毒程序。 相反,分区表的内容却是标准不变的:64字节被均分为4项,用来记录硬盘的分区情况(因此一个硬盘拥有多个卷标,并且在一个硬盘上可以安装多个操作系统)。 传统Windows的MBR代码会读取分区表信息,找到系统中唯一的激活的主分区,加载该分区的引导扇区代码,并执行其中的代码。 引导扇区是一个分区的第一块扇区,而不一定就是硬盘的第一块扇区。 如果系统检测到分区表发生错误,你就会得到诸如“无效的分区表”或者“丢失操作系统”类似的警告信息。 注意此类警告信息并不是BIOS打印的,而是来自MBR中的代码。 因此这些信息完全依赖与MBR中的代码内容。
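
Here is a hedged sketch of what "look at the partition table and find the active partition" can look like in Python. The 16-byte entry layout is the standard one: a status byte (0x80 marks the active partition), CHS start, type, CHS end, then the first sector and the sector count as little-endian 32-bit values. The class and function names are invented, and the mapping of failure cases to messages is an assumption for illustration, not a copy of any real MBR code.

```python
import struct
from typing import NamedTuple


class PartitionEntry(NamedTuple):
    active: bool
    partition_type: int
    first_lba: int
    sector_count: int


ENTRY_FORMAT = "<B3sB3sII"  # 1 + 3 + 1 + 3 + 4 + 4 = 16 bytes per entry


def parse_partition_table(table: bytes) -> list[PartitionEntry]:
    """Decode the four 16-byte entries of a 64-byte partition table."""
    if len(table) != 64:
        raise ValueError("the partition table is 64 bytes")
    entries = []
    for i in range(4):
        raw = table[i * 16:(i + 1) * 16]
        status, _chs_start, ptype, _chs_end, first_lba, count = struct.unpack(ENTRY_FORMAT, raw)
        entries.append(PartitionEntry(status == 0x80, ptype, first_lba, count))
    return entries


def find_active(entries: list[PartitionEntry]) -> PartitionEntry:
    """Return the single active partition, as the traditional approach expects."""
    active = [e for e in entries if e.active]
    if not active:
        raise LookupError("no active partition")  # assumed wording
    if len(active) > 1:
        raise ValueError("Invalid Partition Table")
    return active[0]


# Made-up disk: one inactive partition, one active partition, two empty slots.
table = (
    struct.pack(ENTRY_FORMAT, 0x00, bytes(3), 0x83, bytes(3), 2048, 204800)
    + struct.pack(ENTRY_FORMAT, 0x80, bytes(3), 0x07, bytes(3), 206848, 409600)
    + bytes(32)
)
boot_partition = find_active(parse_partition_table(table))
print(boot_partition.first_lba)  # 206848, where that partition's boot sector lives
```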


Boot loading has gotten more sophisticated and flexible over time. The Linux boot loaders LILO and GRUB can handle a wide variety of operating systems, file systems, and boot configurations. Their MBR code does not necessarily follow the “boot the active partition” approach described above. But functionally the process goes like this:

随着时间的推移,引导装载过程已经变得越来越复杂了,并且也越来越灵活了。 Linux的引导装载程序LILO和GRUB已经可以引导加载很多不同的操作系统,识别各式各样的文件系统,并且是可以配置的。它们并不需要像上面描述的那样,从激活分区中加载引导扇区代码,它们的工作流程大致如下:


  • The MBR itself contains the first stage of the boot loader. GRUB calls this stage 1.
  • MBR中包含的是引导加载程序的第一阶段代码。 GRUB称此为阶段1。

  • Due to its tiny size, the code in the MBR does just enough to load another sector from disk that contains additional bootstrap code. This sector might be the boot sector for a partition, but could also be a sector that was hard-coded into the MBR code when the MBR was installed.
  • 由于MBR大小的限制,第一阶段的代码的工作仅仅是加载另一个扇区上的其余引导代码。 这个扇区可能是一个分区的引导扇区,也可能是生成MBR代码时硬编码在其中的一个扇区编号。

  • The MBR code plus the code loaded in the second step then read a file containing the second stage of the boot loader. In GRUB this is GRUB Stage 2, and in Windows Server this is c:\NTLDR. If the second step fails in Windows you’d get a message like “NTLDR is missing”. The stage 2 code then reads a boot configuration file (e.g., grub.conf in GRUB, boot.ini in Windows). It then presents boot choices to the user or simply goes ahead in a single-boot system.
  • MBR中的代码配合第2步加载的代码去读取一个文件,这个文件中包含的就是第二阶段的代码。在GRUB中称之为阶段2,在Windows Server中即是文件c:\NTLDR。 如果第2步在Windows中失败的话,你会得到诸如“NTLDR is missing”的错误提示。 第二阶段的代码进一步会去读取系统中的一个配置文件(GRUB读取的是grub.conf文件,在Windows中是文件boot.ini)。 之后要么给用户显示一些引导选项(多个操作系统),要么直接启动系统(单操作系统)。

  • At this point the boot loader code needs to fire up a kernel. It must know enough about file systems to read the kernel from the boot partition. In Linux this means reading a file like “vmlinuz-2.6.22-14-server” containing the kernel, loading the file into memory and jumping to the kernel bootstrap code. In Windows Server 2003 some of the kernel start-up code is separate from the kernel image itself and is actually embedded into NTLDR. After performing several initializations, NTLDR loads the kernel image from file c:\Windows\System32\ntoskrnl.exe and, just as GRUB does, jumps to the kernel entry point.
  • 到了这个阶段,引导加载程序就需要启动内核了。引导加载程序必须能够识别文件系统,才能从引导分区上读取内核文件。 在linux系统中这个包含内核代码的文件就是类似“vmlinuz-2.6.22-14-server”的文件,引导加载程序要做的就是将该文件加载进内存,然后跳转去执行内核的引导代码。 在Windows Server 2003中,一些内核启动代码是与内核映像分开的,实际上集成在NTLDR中。经过一系列的初始化工作后,NTLDR将c:\Windows\System32\ntoskrnl.exe内核文件加载进内存,然后就跟GRUB做的一样,跳转到内核入口点。

There’s a complication worth mentioning (aka, I told you this thing is hacky). The image for a current Linux kernel, even compressed, does not fit into the 640K of RAM available in real mode. My vanilla Ubuntu kernel is 1.7 MB compressed. Yet the boot loader must run in real mode in order to call the BIOS routines for reading from the disk, since the kernel is clearly not available at that point. The solution is the venerable unreal mode. This is not a true processor mode (I wish the engineers at Intel were allowed to have fun like that), but rather a technique where a program switches back and forth between real mode and protected mode in order to access memory above 1MB while still using the BIOS. If you read GRUB source code, you’ll see these transitions all over the place (look under stage2/ for calls to real_to_prot and prot_to_real). At the end of this sticky process the loader has stuffed the kernel in memory, by hook or by crook, but it leaves the processor in real mode when it’s done.

最后值得一提的是,现在的linux内核即便经过了压缩处理,大小也会超过640K。 我的vanilla Ubuntu经过压缩后的内核大小都足足有1.7M。 然而如今的引导加载程序必须运行在实模式下,这是因为它必须借助BIOS程序来读取磁盘,所以此时内核代码是完全没法用的。 解决的办法是通过利用“unreal mode”的特性。它并非一个真正的处理器运行模式(希望Intel的工程师允许我这么说,自娱自乐呵),不过却允许程序在实模式和保护模式之间来回切换,这样就可以访问超过1M的物理内存,并且依然可以使用BIOS程序。 如果你阅读GRUB的源代码,你就会发现这种切换到处都是(看看stage2/目录下的程序,对real_to_prot和prot_to_real函数的调用)。 经过这个复杂的过程,引导加载程序终于将内核全部加载进内存,最后CPU仍处在实模式运行模式。
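
The back-and-forth described above can be modeled as a chunked copy: read a piece of the kernel through the BIOS into a buffer below 1 MB while in real mode, switch to protected mode to copy it above 1 MB, then switch back for the next piece. The Python below is a toy simulation under stated assumptions, not GRUB's code. The buffer address, chunk size, and load address are made-up values, and the mode switches are only recorded in a log rather than performed.

```python
ONE_MIB = 1 << 20
LOW_BUFFER = 0x10000   # hypothetical scratch buffer below 1 MB
CHUNK = 32 * 1024      # hypothetical amount read per BIOS call
KERNEL_DEST = ONE_MIB  # hypothetical load address above 1 MB


def load_kernel(memory: bytearray, image: bytes) -> list[str]:
    """Copy image above 1 MB in chunks, logging the processor mode of each step."""
    mode_log = []
    for start in range(0, len(image), CHUNK):
        chunk = image[start:start + CHUNK]
        # Real mode: BIOS disk routines work, but only low memory is reachable.
        mode_log.append("real")
        memory[LOW_BUFFER:LOW_BUFFER + len(chunk)] = chunk
        # Protected mode: no BIOS calls, but memory above 1 MB is reachable.
        mode_log.append("protected")
        dest = KERNEL_DEST + start
        memory[dest:dest + len(chunk)] = memory[LOW_BUFFER:LOW_BUFFER + len(chunk)]
    mode_log.append("real")  # the loader leaves the processor in real mode
    return mode_log


# A stand-in "kernel" of about 1.7 MB, far larger than 640K.
image = (bytes(range(251)) * 7000)[:1_700_000]
memory = bytearray(4 * ONE_MIB)
log = load_kernel(memory, image)
print(bytes(memory[KERNEL_DEST:KERNEL_DEST + len(image)]) == image)  # True
print(log[:4], log[-1])  # ['real', 'protected', 'real', 'protected'] real
```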


We’re now at the jump from “Boot Loader” to “Early Kernel Initialization” as shown in the first diagram. That’s when things heat up as the kernel starts to unfold and set things in motion. The next post will be a guided tour through the Linux Kernel initialization with links to sources at the Linux Cross Reference. I can’t do the same for Windows ;) but I’ll point out the highlights.

从引导加载程序跳转到内核中,内核就会进行一系列初始化操作。 下篇文章我将结合Linux Cross Reference探讨下内核的初始化过程。 对于windows的初始化过程,我会把要点指出来。


参考链接:http://blog.csdn.net/drshenlei/article/details/4250306
