Tracing and Analyzing the Linux Kernel Boot Process

Source: Internet. Editor: 程序博客网. Time: 2024/05/21 16:06

杨金龙 + Original work; please credit the source when reposting + "Linux Kernel Analysis" MOOC course http://mooc.study.163.com/course/USTC-1000029000

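This is the third week of the Linux Kernel Analysis course. The main topic this week is using the gdb debugger to trace the boot process of the Linux 3.18.6 kernel. For the lab details, see Shiyanlou: https://www.shiyanlou.com/courses/195/labs/725/document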

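Lab Steps

Open a terminal in the Shiyanlou virtual machine and enter the lab environment directory
cd LinuxKernel/
Start the kernel in qemu so that it can be traced with gdb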
qemu -kernel linux-3.18.6/arch/x86/boot/bzImage -initrd rootfs.img -s -S

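-S freezes the CPU at startup; enter "c" to start execution

-s is shorthand for -gdb tcp::1234, so the gdbserver listens on TCP port 1234 by default
Open another terminal window and use gdb to trace and debug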
gdb
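(gdb) file linux-3.18.6/vmlinux

Load the symbol table in gdb before running target remote
(gdb) target remote :1234

This connects gdb to the gdbserver; entering c lets Linux in qemu continue running
(gdb) break start_kernel

Set the breakpoint (before continuing with c); once execution stops at it, the list command shows the source around the breakpoint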

Screenshots of Results

Debugging begins

start_kernel

rest_init

kernel_init

run_init_process

Analysis of the Kernel Boot Process

start_kernel

In the start_kernel code, two pieces deserve particular attention in this analysis:

set_task_stack_end_magic(&init_task);

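init_task is the hand-built PCB (process control block). It is process 0, which eventually becomes the idle process.

struct task_struct init_task = INIT_TASK(init_task);

init_task is a rather special process in Linux: the kernel developers construct it by hand, rather than having another process create it through do_fork. It is simply a task_struct, the same structure used for user processes. A task_struct holds all the basic information about a process, such as its state, the start address of its stack, and its process ID (pid).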
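The difference between a statically built process 0 and processes created at run time can be sketched in user-space C. This is only an illustration, not kernel code: toy_task, TOY_INIT_TASK, toy_init_stack, and toy_fork are hypothetical names, and the real task_struct and do_fork are far more involved. The only detail taken from the kernel is the name "swapper", which is the comm of init_task.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for task_struct; the fields are illustrative only. */
struct toy_task {
    int pid;
    long state;
    void *stack;
    char comm[16];
};

/* Hypothetical analogue of INIT_TASK(): a compile-time initializer. */
#define TOY_INIT_TASK(stack_base) \
    { .pid = 0, .state = 0, .stack = (stack_base), .comm = "swapper" }

static unsigned long toy_init_stack[1024];

/* Process 0 lives in the data section; nothing had to fork it. */
static struct toy_task toy_init_task = TOY_INIT_TASK(toy_init_stack);

static int toy_next_pid = 1;

/* Every later task is created at run time by copying its parent. */
struct toy_task *toy_fork(const struct toy_task *parent, const char *comm)
{
    struct toy_task *child = malloc(sizeof(*child));

    if (child == NULL)
        return NULL;
    *child = *parent;                 /* inherit the parent's fields */
    child->pid = toy_next_pid++;      /* pids are handed out in creation order */
    child->stack = NULL;              /* a real fork would allocate a new stack */
    strncpy(child->comm, comm, sizeof(child->comm) - 1);
    child->comm[sizeof(child->comm) - 1] = '\0';
    return child;
}
```

Calling toy_fork(&toy_init_task, "init") and then toy_fork(&toy_init_task, "kthreadd") yields pids 1 and 2 in that order, which mirrors why rest_init must spawn kernel_init before kthreadd.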

rest_init();

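This performs the remaining initialization; at this point the kernel is up and working. rest_init() creates a kernel thread that runs kernel_init(), which calls run_init_process to produce the first user-space process, process 1; by default its program is looked for in the root directory. Also in rest_init(), cpu_startup_entry calls cpu_idle_loop, which turns init_task into the idle process, process 0.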

rest_init

static noinline void __init_refok rest_init(void)
{
    int pid; /* holds a process ID */

    rcu_scheduler_starting(); /* start the RCU (Read-Copy Update) locking mechanism */
    /*
     * We need to spawn init first so that it obtains pid 1, however
     * the init task will end up wanting to create kthreads, which, if
     * we schedule it before we create kthreadd, will OOPS.
     */
    kernel_thread(kernel_init, NULL, CLONE_FS); /* the init process is created here, but it cannot be scheduled yet */
    numa_default_policy(); /* set the default memory access policy for NUMA (Non-Uniform Memory Access) systems */
    pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES); /* create the kthreadd kernel thread, which manages and schedules other kernel threads */
    rcu_read_lock();
    kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns); /* look up kthreadd's task; success means kthreadd has been created */
    rcu_read_unlock();
    complete(&kthreadd_done); /* notify the kernel_init thread through a completion variable (kthreadd_done) */

    /*
     * The boot idle thread must execute schedule()
     * at least once to get things moving:
     */
    init_idle_bootup_task(current);
    schedule_preempt_disabled();
    /* Call into cpu_idle with preempt disabled */
    cpu_startup_entry(CPUHP_ONLINE);
}

As mentioned above, rest_init performs the final stage of kernel initialization. This analysis focuses on the following pieces of its code:

kernel_thread(kernel_init, NULL, CLONE_FS);

Its source:

pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags)
{
    return do_fork(flags|CLONE_VM|CLONE_UNTRACED, (unsigned long)fn,
        (unsigned long)arg, NULL, NULL);
}

As the code above shows, this effectively forks a new process to run kernel_init. (In older kernels this function was named init; it was renamed kernel_init to distinguish it from the init process.) kernel_init (linux-3.18.6/init/main.c) is where the init process is actually started:

static int __ref kernel_init(void *unused)
{
    int ret;

    kernel_init_freeable();
    /* need to finish all async __init code before freeing the memory */
    async_synchronize_full();
    free_initmem();
    mark_rodata_ro();
    system_state = SYSTEM_RUNNING;
    numa_default_policy();

    flush_delayed_fput();

    if (ramdisk_execute_command) {
        ret = run_init_process(ramdisk_execute_command);
        if (!ret)
            return 0;
        pr_err("Failed to execute %s (error %d)\n",
               ramdisk_execute_command, ret);
    }

    /*
     * We try each of these until one succeeds.
     *
     * The Bourne shell can be used instead of init if we are
     * trying to recover a really broken machine.
     */
    if (execute_command) {
        ret = run_init_process(execute_command);
        if (!ret)
            return 0;
        pr_err("Failed to execute %s (error %d).  Attempting defaults...\n",
               execute_command, ret);
    }
    if (!try_to_run_init_process("/sbin/init") ||
        !try_to_run_init_process("/etc/init") ||
        !try_to_run_init_process("/bin/init") ||
        !try_to_run_init_process("/bin/sh"))
        return 0;

    panic("No working init found.  Try passing init= option to kernel. "
          "See Linux Documentation/init.txt for guidance.");
}

At this point rest_init() has started the well-known init process, that is, process 1.
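The "try each of these until one succeeds" chain at the end of kernel_init can be sketched in ordinary user-space C. This is a simplified illustration under stated assumptions: launch_fn, first_working_init, default_init_paths, and only_shell_exists are hypothetical names, and the launcher is a stub rather than a real exec. Only the four paths and their order come from the kernel_init listing.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical launcher type: returns 0 on success, negative on failure. */
typedef int (*launch_fn)(const char *path);

/*
 * Try each candidate in order and stop at the first one that launches.
 * Returns the path that worked, or NULL if none did (the point at which
 * kernel_init would call panic()).
 */
const char *first_working_init(const char *const *candidates, size_t n,
                               launch_fn launch)
{
    for (size_t i = 0; i < n; i++) {
        if (candidates[i] != NULL && launch(candidates[i]) == 0)
            return candidates[i];
    }
    return NULL;
}

/* The default search order, copied from kernel_init(). */
static const char *const default_init_paths[] = {
    "/sbin/init", "/etc/init", "/bin/init", "/bin/sh",
};

/* Stub launcher for illustration: pretends that only /bin/sh exists. */
int only_shell_exists(const char *path)
{
    return strcmp(path, "/bin/sh") == 0 ? 0 : -2; /* -2 mirrors -ENOENT */
}
```

With the stub launcher, first_working_init(default_init_paths, 4, only_shell_exists) falls through the first three paths and returns "/bin/sh", which is the recovery case described in the kernel comment about the Bourne shell.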

pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES);

The line above forks a new process to run the kthreadd function, whose source is:

int kthreadd(void *unused)
{
    struct task_struct *tsk = current;

    /* Setup a clean context for our children to inherit. */
    set_task_comm(tsk, "kthreadd");
    ignore_signals(tsk);
    set_cpus_allowed_ptr(tsk, cpu_all_mask);
    set_mems_allowed(node_states[N_MEMORY]);

    current->flags |= PF_NOFREEZE;

    for (;;) {
        set_current_state(TASK_INTERRUPTIBLE);
        if (list_empty(&kthread_create_list))
            schedule();
        __set_current_state(TASK_RUNNING);

        spin_lock(&kthread_create_lock);
        while (!list_empty(&kthread_create_list)) {
            struct kthread_create_info *create;

            create = list_entry(kthread_create_list.next,
                                struct kthread_create_info, list);
            list_del_init(&create->list);
            spin_unlock(&kthread_create_lock);

            create_kthread(create);

            spin_lock(&kthread_create_lock);
        }
        spin_unlock(&kthread_create_lock);
    }

    return 0;
}

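According to online sources, there is only one kthreadd kernel thread, and its job is to manage and schedule the other kernel threads. It is created during kernel initialization and loops forever in the kthreadd function, which serves the kthread requests kept on the global kthread_create_list. A caller uses kthread_create to request a new kthread; the request is added to kthread_create_list, and kthread_create then wakes up kthreadd_task. To serve a request, kthreadd calls the older interface kernel_thread to start a kernel thread running the function kthread, which in turn runs the requested thread. Each request is removed from kthread_create_list as it is served, and kthreadd calls schedule() to give up the CPU whenever the list is empty. This thread cannot be stopped. It is process 2.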
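The producer/consumer relationship between kthread_create and kthreadd can be sketched as a single-threaded user-space model in C. All names here (toy_create_info, toy_create_list, toy_request_create, toy_drain, toy_double) are hypothetical, and the sketch deliberately leaves out what makes the real code hard: the spinlock around the list, sleeping in schedule(), the wakeup, and actually starting a new thread. It only shows the queue discipline of the inner while loop.

```c
#include <stddef.h>

/* Toy analogue of struct kthread_create_info: one pending request. */
struct toy_create_info {
    int (*threadfn)(void *data);
    void *data;
    int result;                      /* filled in when the request is served */
    int done;
    struct toy_create_info *next;
};

/* Toy analogue of kthread_create_list, kept as a FIFO. */
struct toy_create_list {
    struct toy_create_info *head;
    struct toy_create_info *tail;
};

/* Producer side, in the role of kthread_create(): queue a request. */
void toy_request_create(struct toy_create_list *list,
                        struct toy_create_info *req)
{
    req->next = NULL;
    req->done = 0;
    if (list->tail != NULL)
        list->tail->next = req;
    else
        list->head = req;
    list->tail = req;
}

/*
 * Consumer side, in the role of one pass through kthreadd's inner loop:
 * unlink each request first, then serve it. Returns how many were served.
 */
int toy_drain(struct toy_create_list *list)
{
    int served = 0;

    while (list->head != NULL) {
        struct toy_create_info *req = list->head;

        list->head = req->next;      /* remove from the list before serving */
        if (list->head == NULL)
            list->tail = NULL;
        req->next = NULL;

        /* kthreadd would start a new thread here; the sketch just calls it. */
        req->result = req->threadfn(req->data);
        req->done = 1;
        served++;
    }
    return served;
}

/* Example thread function for illustration. */
int toy_double(void *data)
{
    return 2 * *(int *)data;
}
```

Queuing two requests with toy_request_create and then calling toy_drain serves them in arrival order and leaves the list empty; a second toy_drain call returns 0, which corresponds to the case where kthreadd finds nothing to do and calls schedule().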

cpu_startup_entry(CPUHP_ONLINE);

rest_init calls this function as its very last step. Its source is:

void cpu_startup_entry(enum cpuhp_state state)
{
    /*
     * This #ifdef needs to die, but it's too late in the cycle to
     * make this generic (arm and sh have never invoked the canary
     * init for the non boot cpus!). Will be fixed in 3.11
     */
#ifdef CONFIG_X86
    /*
     * If we're the non-boot CPU, nothing set the stack canary up
     * for us. The boot CPU already has it initialized but no harm
     * in doing it again. This is a good place for updating it, as
     * we wont ever return from this function (so the invalid
     * canaries already on the stack wont ever trigger).
     */
    boot_init_stack_canary();
#endif
    arch_cpu_idle_prepare();
    cpu_idle_loop();
}

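Here cpu_idle_loop() is in fact an infinite while loop. In other words, after forking process 1 and process 2 and finishing the rest of the startup work, process 0 finally "evolves" into the idle process. This concludes the kernel boot that began in start_kernel(); the system can now accept tasks and work "normally". The boot flow is shown in the figure below: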

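Summary

The core of this lab is understanding how process 0, process 1, and process 2 are created during kernel boot. From reading the code and the reference material, we see that process 0 is init_task, which the kernel developers create by hand. While running rest_init, it calls kernel_thread to create two processes that run kernel_init and kthreadd respectively. The former is process 1, which eventually starts the init process; the latter is process 2, a kernel thread that manages kernel resources. The boot process can ...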

References

Some notes on Linux processes and kernel-level processes - http://blog.csdn.net/zgrjkflmkyc/article/details/49275551

The init_task process and the idle process in the Linux kernel - http://blog.csdn.net/hardy_2009/article/details/7383815

Analyzing the Linux kernel boot process - http://blog.csdn.net/myfather103/article/details/44337461

"Linux Kernel Analysis" (3): Tracing and analyzing the Linux kernel boot process - http://blog.csdn.net/FIELDOFFIER/article/details/44518597

"Linux Kernel Analysis" lab report - https://www.shiyanlou.com/courses/reports/982406

0 0