Writing an Operating System Step by Step (1): Boot


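0. Introduction
I have wanted to write an operating system for a long time and have consulted many books, chiefly 《自己动手写操作系统》 by 于渊 (Yu Yuan) and 《30天自制操作系统》 by 川合秀实 (Hidemi Kawai). My overall impression is that Kawai's book concentrates on drawing and optimizing the user interface. It covers memory allocation, timers and so on, but says little about the low-level parts of the system, so the reader learns what works without learning why. Yu Yuan's book is far more detailed and covers the GDT, LDT, IDT and more, but as the finished product shows, everything is demonstrated in 80*25 text mode. I want to draw on both books and keep the simplest possible study notes. Task switching is a good example: Yu Yuan spends many pages introducing and using it, which left me lost, whereas Kawai simply switches tasks through Intel's TSS structure. That gives up portability, but it lets the reader understand the system quickly instead of digging into every feature. What follows is a simple dissection of an operating system based on these two books plus other online and printed material.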
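1. Boot
For the boot stage, see 《Writing a Simple Operating System — from Scratch》 (Nick Blundell, available online). It was never finished (:-D), but its analysis of the boot stage is quite detailed, and it is the best boot material I have come across. Of course, once you understand booting and a little about compiling with GCC, Yu Yuan's boot code can also be adapted to start up and run a kernel (referred to below as Orange's Boot). Kawai's version relies on a compiler he wrote himself, so his boot code is hardly portable.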
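Before writing boot code, you need to know how a computer starts. When the CPU powers on and resets, it executes code in the BIOS ROM, which typically initializes the hardware and probes the hard disks, floppy disks, CDs and other external storage for a device that contains boot code. The BIOS copies that boot code to a fixed location in memory (0x7C00, which is why the code below begins with org 0x7c00), points CS:IP at it, and starts executing it. The boot drive number is passed in DL. The boot code below sets up the remaining segment registers and the stack itself.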
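Writing boot code therefore means producing a storage device that contains boot code. Storage devices are divided into sectors, and a device is bootable if its first sector (512 bytes) ends with the two-byte signature 0xAA55, stored as the bytes 0x55, 0xAA at offsets 510 and 511. That first sector is then the boot code.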
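The signature rule can be checked on the host before an image ever reaches an emulator. The following is a minimal sketch in Python, used here purely as a host-side illustration; the function names `is_bootable` and `make_boot_sector` are my own and do not come from either book.

```python
SECTOR_SIZE = 512  # size of one boot sector in bytes


def is_bootable(sector: bytes) -> bool:
    """Return True if a 512-byte sector ends with the bytes 0x55, 0xAA."""
    return (
        len(sector) == SECTOR_SIZE
        and sector[510] == 0x55
        and sector[511] == 0xAA
    )


def make_boot_sector(code: bytes) -> bytes:
    """Zero-pad code to 510 bytes and append the word 0xAA55 (little-endian).

    This mirrors what the two closing lines of a boot sector source do:
    padding up to byte 510, then emitting the signature word.
    """
    if len(code) > 510:
        raise ValueError("boot code must fit in 510 bytes")
    padding = bytes(510 - len(code))
    return code + padding + (0xAA55).to_bytes(2, "little")
```

For example, `make_boot_sector(b"\xeb\xfe")` wraps the two-byte machine code for `jmp $` into a sector that `is_bootable` accepts.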

Following Nick Blundell's code, we can write a basic boot sector (partial file):

[org 0x7c00]
[bits 16]
KERNEL_OFFSET    equ    0x9000    ; this is the memory offset to which we will load our kernel

    mov    ax, cs                  ; assumes the BIOS jumped to 0x0000:0x7c00, so cs = 0
    mov    ds, ax
    mov    ss, ax
    mov    es, ax
    mov    fs, ax
    mov    gs, ax

    mov    [BOOT_DRIVE], dl        ; BIOS stores our boot drive in dl, so it's best to
                                   ; remember this for later
    mov    bp, 0x9000              ; set the stack
    mov    sp, bp

    mov    si, MSG_REAL_MODE       ; announce that we are starting
    call   print_string            ; booting from 16-bit real mode

    ;call  vga_start               ; start VGA mode
    call   load_kernel             ; load our kernel

    call   switch_to_pm            ; note that we never return from here

    jmp    $

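The boot sequence is easy to read off this code: set the segment registers, load the kernel to the chosen address, set up the GDT, switch to protected mode, and finally jump to the kernel from protected mode. A more sophisticated sequence would be: set the segment registers, find the loader on disk, load and run it, have the loader find and load the kernel, set up the GDT, switch to protected mode, and jump to the kernel from there. That is the sequence Yu Yuan's boot code uses. You can get there either by adding code to Nick Blundell's version above or by porting Yu Yuan's boot code directly. For anyone who just wants to understand the system, though, Nick Blundell's simple boot code should be enough.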

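As I understand it, the GDT is mainly a change in how addresses are formed. In real mode every addressing form ends up as ds:offset or cs:offset, and the same is true in protected mode. The only difference is what ds or cs means. In real mode the register gives the base directly: the address is ds * 16 + offset (or cs * 16 + offset). In protected mode ds or cs selects an entry in the GDT and that entry defines the base, so the address is GDTs[ds].base + offset (more precisely, the upper 13 bits of the selector form the index into the table). There is one extra level of translation, but the addressing itself is unchanged, and Intel could hardly have made a more drastic change for protected mode anyway (:-D).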
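The two translations can be put side by side. This is only an illustrative sketch in Python: the table is reduced to a plain list of base addresses (a hypothetical simplification, since real descriptors also carry limits and access flags), and the function names are my own.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Real mode: the segment register is the base, shifted left by 4 bits."""
    return (segment << 4) + offset


def protected_mode_address(gdt_bases: list, selector: int, offset: int) -> int:
    """Protected mode: the selector picks a descriptor, which supplies the base.

    gdt_bases is a simplified stand-in for the GDT holding only each
    descriptor's base. Bits 3..15 of the selector are the table index;
    bits 0..2 (table indicator and requested privilege level) are ignored.
    """
    index = selector >> 3
    if index == 0:
        raise ValueError("selector refers to the null descriptor")
    return (gdt_bases[index] + offset) & 0xFFFFFFFF


# A flat layout like the boot-stage GDT in this article: null, code, data,
# all with base 0.
FLAT_GDT = [0, 0, 0]
```

With the flat layout, `protected_mode_address(FLAT_GDT, 0x10, 0xB8000)` and `real_mode_address(0xB800, 0)` both name the same byte of video memory, which is the sense in which only the meaning of the segment register has changed.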
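As for switch_to_pm, I was uneasy about it at first: if it is not set up carefully, will the system still run properly? But after reading the kernel code of both versions I found that each sets up the GDT again early in the kernel and generally never changes it afterwards, so the worry was unnecessary. switch_to_pm only needs two descriptors in the GDT (besides the mandatory null descriptor), one for a code segment and one for a data segment, and at the boot stage they only have to make the kernel code loaded into memory reachable. Code and data segments carry different access, execute and write permissions, which is why they are defined separately. Otherwise any address in a code segment could be written to and the code modified without anyone noticing. Segments may overlap, however, so the same memory can be both a code segment and a data segment, and it is then no surprise that code can be modified through the data segment. The boot stage therefore need not worry about this. In the kernel, though, it is better to place the data segment's base outside the kernel so that kernel code cannot be overwritten: typically the data segment starts above the first 4 MB (at 0x400000 or higher), while the code segment starts at 0x0, which is also where the kernel is loaded.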
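Loading the kernel at 0x0 has one big advantage: it makes debugging easy. When you disassemble the kernel, for example, the addresses in the listing are the kernel's addresses in memory. This layout can only be achieved with a loader, though, because loading the kernel relies on BIOS interrupt service routines (ISRs, invoked with int) whose vector table and data live at low addresses. Overwriting them with the kernel would stop the load partway. Orange's Boot and the Linux boot code show the approach: boot loads the loader at 0x1000 and leaves 4 MB free for the kernel. The loader first loads the kernel into high memory (above 1 MB) so that it cannot overwrite the ISRs, and once no more int calls are needed it copies the kernel from high memory down to 0x0 with movsb.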
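If you use Kawai's kernel, uncomment the call to vga_start so that the machine switches to VGA graphics mode. vga_start also saves the video mode information for the kernel to use later. Yu Yuan's version additionally defines a video segment, a descriptor whose base is 0xB8000. I do not think this is necessary: using the data segment with base 0x0 and an offset of 0xB8000 works just the same. That version also uses the DA_ and SDA_ prefixes to define four descriptors in two classes: system data and code segments, and user data and code segments. According to the book, they exist to test the exceptions, call gates and privilege-level transitions introduced later: user code calling into the system code segment can raise its privilege level, while a user data access to the system data segment raises an exception. All of this can be ignored here. Forget the tedious DA_ and SDA_ prefixes and just remember the two descriptors, one for data and one for code.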
The complete boot code is as follows:

boot.asm

; boot.asm
; author stophin
;
; a boot sector that boots a C kernel in 32-bit protected mode
[org 0x7c00]
[bits 16]
KERNEL_OFFSET    equ    0x9000    ; this is the memory offset to which we will load our kernel

    mov    ax, cs                  ; assumes the BIOS jumped to 0x0000:0x7c00, so cs = 0
    mov    ds, ax
    mov    ss, ax
    mov    es, ax
    mov    fs, ax
    mov    gs, ax

    mov    [BOOT_DRIVE], dl        ; BIOS stores our boot drive in dl, so it's best to
                                   ; remember this for later
    mov    bp, 0x9000              ; set the stack
    mov    sp, bp

    mov    si, MSG_REAL_MODE       ; announce that we are starting
    call   print_string            ; booting from 16-bit real mode

    ;call  vga_start               ; start VGA mode
    call   load_kernel             ; load our kernel

    call   switch_to_pm            ; note that we never return from here

    jmp    $

; include our useful, hard-earned routines
%include "print_string.asm"
%include "disk_load.asm"
%include "gdt.asm"
%include "print_string_pm.asm"
%include "switch_to_pm.asm"
%include "vga_start.asm"

[bits 16]
; load kernel
load_kernel:
    mov    si, MSG_LOAD_KERNEL     ; print a message to say we are loading the kernel
    call   print_string

    mov    bx, KERNEL_OFFSET       ; set up parameters for our disk_load routine, so
    mov    dh, 56                  ; that we load the first n sectors (excluding
    mov    dl, [BOOT_DRIVE]        ; the boot sector) from the boot disk (i.e. our
    call   disk_load               ; kernel code) to address KERNEL_OFFSET
    ret

[bits 32]
; this is where we arrive after switching to and initialising protected mode
BEGIN_PM:
    mov    ebx, MSG_PROTECT_MODE
    call   print_string_pm         ; use our 32-bit print routine

    call   KERNEL_OFFSET           ; now jump to the address of our loaded
                                   ; kernel code, assume the brace position,
                                   ; and cross your fingers, here we go!

    jmp    $                       ; hang

; global variables
BOOT_DRIVE          db    0
MSG_LOAD_KERNEL     db    "Loading kernel into memory", 0
MSG_REAL_MODE       db    "Started in 16-bit Real Mode", 0
MSG_PROTECT_MODE    db    "Successfully landed in 32-bit Protected Mode", 0

; bootsector padding
times    510 - ($ - $$)    db    0
dw    0xaa55

print_string.asm

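; print_string.asm
; author stophin
;
[bits 16]
; print_string(SI): print the null-terminated string at DS:SI
; using the BIOS teletype service (int 0x10, ah = 0x0e)
print_string:
    pusha                          ; preserve all registers, including dl (boot drive)
    cld                            ; make lodsb move forward through the string
    mov    ah, 0x0e                ; BIOS teletype output
    mov    bh, 0                   ; display page 0
print_string_loop:
    lodsb                          ; al = [ds:si], then si = si + 1
    cmp    al, 0                   ; end of string?
    je     print_string_done
    int    0x10                    ; print the character in al
    jmp    print_string_loop
print_string_done:
    popa
    ret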

print_string_pm.asm

; print_string_pm.asm
; author stophin
;
[bits 32]
; define some constants
VIDEO_MEMORY      equ    0xb8000
WHITE_ON_BLACK    equ    0x0f

; prints a null-terminated string pointed to by EBX
print_string_pm:
    pusha
    mov    edx, VIDEO_MEMORY       ; set edx to the start of video memory

print_string_pm_loop:
    mov    al, [ebx]               ; store the char at EBX in AL
    mov    ah, WHITE_ON_BLACK      ; store the attributes in AH

    cmp    al, 0                   ; if (al == 0), at end of string, so
    je     print_string_pm_done    ; jump to done

    mov    [edx], ax               ; store char and attributes at current
                                   ; character cell
    add    ebx, 1                  ; increment EBX to the next char in string
    add    edx, 2                  ; move to next character cell in video memory

    jmp    print_string_pm_loop    ; loop around to print the next char

print_string_pm_done:
    popa
    ret                            ; return from the function

disk_load.asm

; disk_load.asm
; author stophin
;
[bits 16]
; load dh sectors to ES:BX from drive dl
disk_load:
    push   dx                      ; store dx on stack so later we can recall
                                   ; how many sectors we requested to be read,
                                   ; even if it is altered in the meantime

    mov    ah, 0x02                ; BIOS read sector function
    mov    al, dh                  ; read dh sectors
    mov    ch, 0x00                ; select cylinder 0
    mov    dh, 0x00                ; select head 0
    mov    cl, 0x02                ; start reading from second sector (i.e.
                                   ; after the boot sector)
    int    0x13                    ; BIOS interrupt

    jc     disk_error              ; jump if error (i.e. carry flag set)

    pop    dx                      ; restore dx from the stack
    cmp    dh, al                  ; if al (sectors read) != dh (sectors expected)
    jne    disk_error              ; display error message

    ret

disk_error:
    mov    si, DISK_ERROR_MSG      ; print_string takes its argument in SI
    call   print_string
    jmp    $

; variables
DISK_ERROR_MSG:    db    "Disk read error!", 0
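It helps to work out which part of the disk, and which part of memory, this read actually covers. The sketch below is a host-side illustration in Python, assuming standard 1.44 MB floppy geometry (2 heads, 18 sectors per track). The function names are hypothetical helpers, not part of the boot code.

```python
SECTORS_PER_TRACK = 18  # standard 1.44 MB floppy geometry (assumed)
HEADS = 2
SECTOR_SIZE = 512


def lba_to_chs(lba: int) -> tuple:
    """Convert a zero-based logical sector number to (cylinder, head, sector).

    CHS sector numbers start at 1, so the boot sector is LBA 0 = (0, 0, 1).
    """
    cylinder = lba // (HEADS * SECTORS_PER_TRACK)
    head = (lba // SECTORS_PER_TRACK) % HEADS
    sector = lba % SECTORS_PER_TRACK + 1
    return cylinder, head, sector


def load_end_address(offset: int, sectors: int) -> int:
    """First address past the loaded data when reading `sectors` sectors to `offset`."""
    return offset + sectors * SECTOR_SIZE
```

disk_load starts at cylinder 0, head 0, sector 2, which is LBA 1, and load_kernel asks for 56 sectors, so the last sector read is LBA 56, or `lba_to_chs(56)`. That is well past the first track, so the single int 0x13 call spans several tracks. Emulators tend to accept this, but as far as I know some real BIOSes refuse reads that cross a track boundary, in which case the read would have to be split per track. On the memory side, `load_end_address(0x9000, 56)` is 0x10000, so the kernel area exactly fills the rest of the first 64 KB above KERNEL_OFFSET.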

gdt.asm

; gdt.asm
; author stophin
;
[bits 16]
; global descriptor table
gdt_start:

gdt_null:                ; the mandatory null descriptor
    dd    0x0            ; 'dd' means define double word (i.e. 4 bytes)
    dd    0x0

gdt_code:                ; the code segment descriptor
    ; base 0x0, limit 0xfffff
    ; 1st flags  : (present) 1 (privilege) 00 (descriptor type) 1 -> 1001b
    ; type flags : (code) 1 (conforming) 0 (readable) 1 (accessed) 0 -> 1010b
    ; 2nd flags  : (granularity) 1 (32-bit default) 1 (64-bit seg) 0 (AVL) 0 -> 1100b
    dw    0xffff         ; limit (bits 0 - 15)
    dw    0x0            ; base (bits 0 - 15)
    db    0x0            ; base (bits 16 - 23)
    db    10011010b      ; 1st flags, type flags
    db    11001111b      ; 2nd flags, limit (bits 16 - 19)
    db    0x0            ; base (bits 24 - 31)

gdt_data:                ; the data segment descriptor
    ; same as code segment except for the type flags
    ; type flags : (code) 0 (expand down) 0 (writable) 1 (accessed) 0 -> 0010b
    dw    0xffff         ; limit (bits 0 - 15)
    dw    0x0            ; base (bits 0 - 15)
    db    0x0            ; base (bits 16 - 23)
    db    10010010b      ; 1st flags, type flags
    db    11001111b      ; 2nd flags, limit (bits 16 - 19)
    db    0x0            ; base (bits 24 - 31)

gdt_end:                 ; the reason for putting a label at the end of the
                         ; GDT is so we can have the assembler calculate
                         ; the size of the GDT for the GDT descriptor (below)

; GDT descriptor
gdt_descriptor:
    dw    gdt_end - gdt_start - 1    ; size of our GDT, always less one
    dd    gdt_start                  ; start address of our GDT

; define some handy constants for the GDT segment descriptor offsets, which
; are what segment registers must contain when in protected mode. For example,
; when we set DS = 0x10 in PM, the CPU knows that we mean it to use the
; segment described at offset 0x10 (i.e. 16 bytes) in our GDT, which in our
; case is the DATA segment (0x0 -> NULL; 0x08 -> CODE; 0x10 -> DATA)
CODE_SEG    equ    gdt_code - gdt_start
DATA_SEG    equ    gdt_data - gdt_start
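To see what the dw/db lines above encode, the eight bytes of each descriptor can be unpacked again on the host. The following is an illustrative Python sketch, not part of the boot code. The byte strings are transcribed by hand from gdt_code and gdt_data, and the name `decode_descriptor` and the dictionary keys are my own.

```python
def decode_descriptor(raw: bytes) -> dict:
    """Unpack an 8-byte segment descriptor into its fields."""
    if len(raw) != 8:
        raise ValueError("a segment descriptor is exactly 8 bytes")
    # The limit is split: bits 0-15 in bytes 0-1, bits 16-19 in the low nibble of byte 6.
    limit = raw[0] | (raw[1] << 8) | ((raw[6] & 0x0F) << 16)
    # The base is split: bits 0-23 in bytes 2-4, bits 24-31 in byte 7.
    base = raw[2] | (raw[3] << 8) | (raw[4] << 16) | (raw[7] << 24)
    access = raw[5]      # "1st flags" and "type flags" in the listing
    flags = raw[6] >> 4  # "2nd flags" in the listing
    granularity_4k = bool(flags & 0x8)
    size = (limit + 1) * 4096 if granularity_4k else limit + 1
    return {
        "base": base,
        "limit": limit,
        "present": bool(access & 0x80),
        "dpl": (access >> 5) & 0x3,
        "executable": bool(access & 0x08),   # 1 = code segment, 0 = data segment
        "read_write": bool(access & 0x02),   # readable (code) or writable (data)
        "granularity_4k": granularity_4k,
        "default_32bit": bool(flags & 0x4),
        "size_bytes": size,
    }


# Bytes as the assembler lays them out for gdt_code and gdt_data.
GDT_CODE = bytes([0xFF, 0xFF, 0x00, 0x00, 0x00, 0x9A, 0xCF, 0x00])
GDT_DATA = bytes([0xFF, 0xFF, 0x00, 0x00, 0x00, 0x92, 0xCF, 0x00])
```

Decoding both shows that they are identical flat segments (base 0, covering the full 4 GB with 4 KB granularity) that differ only in the executable bit. This is the overlapping code and data layout described earlier.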

vga_start.asm

; vga_start.asm
; author stophin
;
[bits 16]
; remember VGA info
; note the start address will be used in kernel
VMODE    equ    0x0ff0    ; VGA mode (bits per pixel)
SCRNX    equ    0x0ff2    ; screen X
SCRNY    equ    0x0ff4    ; screen Y
VRAM     equ    0x0ff8    ; video memory address

; vga start
vga_start:
    mov    al, 0x13       ; VGA card, 320*200*8bit
                          ; other:
                          ; 0x03: 16-color character 80 * 25, initial mode
                          ; 0x12: VGA card, 640*480*4bit
                          ; 0x6a: extended VGA card, 800*600*4bit
    mov    ah, 0x00
    int    0x10

    mov    byte [VMODE], 8
    mov    word [SCRNX], 320
    mov    word [SCRNY], 200
    mov    dword [VRAM], 0xa0000

    ret

switch_to_pm.asm

; switch_to_pm.asm
; author stophin
;
[bits 16]
; switch to protected mode
switch_to_pm:
    cli                        ; we must switch off interrupts until we have
                               ; set up the protected mode interrupt vector,
                               ; otherwise interrupts will run riot

    lgdt   [gdt_descriptor]    ; load our global descriptor table, which defines
                               ; the protected mode segments (e.g. for code and data)

    mov    eax, cr0            ; to make the switch to protected mode, we set
    or     eax, 0x1            ; the first bit of CR0, a control register
    mov    cr0, eax

    jmp    CODE_SEG:init_pm    ; make a far jump (i.e. to a new segment) to our 32-bit
                               ; code. This also forces the CPU to flush its cache of
                               ; pre-fetched and real-mode decoded instructions, which
                               ; could cause problems

[bits 32]
; initialise registers and the stack once in PM
init_pm:
    mov    ax, DATA_SEG        ; now in PM, our old segments are meaningless,
    mov    ds, ax              ; so we point our segment registers to the
    mov    ss, ax              ; data selector we defined in our GDT
    mov    es, ax
    mov    fs, ax
    mov    gs, ax

    mov    ebp, 0x090000       ; update our stack position so it is right
    mov    esp, ebp            ; at the top of the free space

    call   BEGIN_PM            ; finally, call some well-known label

Put the code above in the boot folder and write a Makefile:

IMAGE_DIR =
IMAGE = ${IMAGE_DIR}nano

image : boot.bin
	cat $^ > ${IMAGE}.bin
	dd if=${IMAGE}.bin of=${IMAGE}.img bs=1440K count=1 conv=notrunc

%.bin : %.asm
	nasm $< -f bin -o $@ -I boot/

# Blank image
raw :
	dd if=/dev/zero of=${IMAGE}.img bs=1440K count=1

First run make raw to create a blank 1.44 MB floppy img, then run plain make to assemble the asm into a bin and write it into the img.

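nasm assembles boot.bin, a 512-byte bootable sector. boot.bin is first written to IMAGE.bin. Only the boot sector exists so far, but once there is a kernel it can be appended to IMAGE.bin with the same cat command. dd then writes the result into the 1.44 MB floppy image, and after configuring bochs you can run it.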
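The layout that cat and dd produce is simple enough to spell out: the boot sector first, the kernel immediately after it starting at the second sector, and zeros up to the size of the floppy. Here is a minimal Python sketch of that layout, for illustration only. `build_image` is a hypothetical helper, not something the Makefile uses.

```python
FLOPPY_SIZE = 1440 * 1024  # 1.44 MB floppy, matching bs=1440K count=1


def build_image(boot: bytes, kernel: bytes = b"") -> bytes:
    """Lay out boot sector + kernel and zero-pad to the floppy size."""
    if len(boot) != 512 or boot[510:512] != b"\x55\xaa":
        raise ValueError("boot must be a 512-byte sector ending in 0x55, 0xAA")
    payload = boot + kernel  # same effect as concatenating the two files with cat
    if len(payload) > FLOPPY_SIZE:
        raise ValueError("boot sector and kernel do not fit on the floppy")
    return payload + bytes(FLOPPY_SIZE - len(payload))
```

Writing the returned bytes to a file opened in "wb" mode gives an image equivalent to the one the Makefile builds. Because the kernel begins at byte 512, it sits at sector 2 of the disk, which is exactly where disk_load starts reading.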

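Since there is no kernel yet, this will not run successfully. You can replace

call KERNEL_OFFSET

with jmp $ so that execution stops there instead of running on into unknown memory.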
