Notes on Kernel Initialization (ARM platform)

Source: Internet. Editor: 程序博客网. Time: 2024/06/06 01:47

Kernel Initialization

1. Boot initialization

The following table describes some of the boot files generated by a kernel build.

| Component | Function/Description |
|---|---|
| vmlinux | Kernel proper, in ELF format, including symbols, comments, debug info (if compiled with -g) and architecture-generic components. |
| System.map | Text-based kernel symbol table for vmlinux module. |
| arch/arm/boot/Image | Binary kernel module, stripped of symbols, notes, and comments. |
| arch/arm/boot/compressed/head.o | ARM-specific startup code generic to ARM processors. It is this object that is passed control by the bootloader. |
| arch/arm/boot/compressed/piggy.gz | The file Image compressed with gzip. |
| arch/arm/boot/compressed/piggy.o | The file piggy.gz in assembly language format so it can be linked with a subsequent object, misc.o (see the text). |
| arch/arm/boot/compressed/misc.o | Routines used for decompressing the kernel image (piggy.gz), and the source of the familiar boot message: "Uncompressing Linux … Done" on some architectures. |
| arch/arm/boot/compressed/head-xscale.o | Processor initialization specific to the XScale processor family. |
| arch/arm/boot/compressed/big-endian.o | Tiny assembly language routine to switch the XScale processor into big-endian mode. |
| arch/arm/boot/compressed/vmlinux | Composite kernel image. Note this is an unfortunate choice of names, because it duplicates the name for the kernel proper; the two are not the same. This binary image is the result when the kernel proper is linked with the objects in this table. See the text for an explanation. |
| arch/arm/boot/zImage | Final composite kernel image loaded by bootloader. See the following text. |

The following figure illustrates how the kernel image is compressed.

Image is in fact vmlinux.bin. It is generated with the following command:

objcopy -O binary -R .note -R .comment -S vmlinux arch/arm/boot/Image

(the -O option tells objcopy to generate a binary file, the -R option removes the ELF sections named .note and .comment, and the -S option is the flag to strip debugging symbols)

piggy.gz is in fact vmlinux.bin.gz. It is generated with the following command:

gzip -f -9 < Image > piggy.gz
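Because piggy.gz is an ordinary gzip stream, it begins with the gzip header bytes 0x1f 0x8b followed by the compression method byte 0x08 (deflate). As a rough illustration only, the sketch below scans a memory buffer for that signature, which is one way to spot where a gzip payload starts inside a larger image. The function name find_gzip_magic_sketch is made up for this example and is not part of the kernel.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the offset of the first gzip header
 * (bytes 1f 8b 08) in buf, or -1 if none is found.
 */
long find_gzip_magic_sketch(const unsigned char *buf, size_t len)
{
    size_t i;

    /* i + 2 < len keeps all three byte reads inside the buffer. */
    for (i = 0; i + 2 < len; i++) {
        if (buf[i] == 0x1f && buf[i + 1] == 0x8b && buf[i + 2] == 0x08)
            return (long)i;
    }
    return -1;
}
```

A match is only a hint, since the same three bytes can occur by chance in code or data.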

Bootstrap Loader

Many architectures use a bootstrap loader to load the kernel image into memory. The bootstrap loader typically verifies the kernel image (checksum) and then decompresses and relocates it.

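The difference between a bootstrap loader and a bootloader is this: the bootloader controls board-level bring-up and initialization and does not depend on kernel code, whereas the bootstrap loader is the glue between the bootloader and the kernel image. (Note that some architectures do not use a bootstrap loader; the bootloader, for example U-Boot, does this work instead. U-Boot can pass the required parameters to the kernel through a designated address (uboot_info_stack).)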

The following figure shows the flow of control at boot time:

The bootstrap loader prepended to the kernel image has a single primary responsibility: to create the proper environment to decompress and relocate the kernel, and pass control to it.

Control is passed from the bootstrap loader directly to the kernel proper, to a module called head.o for most architectures. It is an unfortunate historical artifact that both the bootstrap loader and the kernel proper contain a module called head.o because it is a source of confusion to the new embedded Linux developer. The head.o module in the bootstrap loader might be more appropriately called kernel_bootstrap_loader_head.o, although I doubt that the kernel developers would accept this patch.

When the bootstrap loader has completed its job, control is passed to the kernel proper's head.o, and from there to start_kernel() in main.c.

Kernel Entry Point: head.o

Its code is in arch/<cpu_arch>/kernel/head.S (on some platforms it may be in a different directory).

head.S roughly does the following:

· Checks for valid processor and architecture

· Creates initial page table entries

· Enables the processor's memory management unit (MMU)

· Establishes limited error detection and reporting

· Jumps to the start of the kernel proper, main.c

Kernel startup: main.c

head.S contains an assembly instruction similar to the following:

b start_kernel

start_kernel() is in init/main.c and is the kernel's first platform-independent C function.

Architecture setup

start_kernel() calls setup_arch(&command_line).

This statement calls an architecture-specific setup routine responsible for performing initialization tasks common across each major architecture.

Among other functions, setup_arch() calls functions that identify the specific CPU and provides a mechanism for calling high-level CPU-specific initialization routines.

One such function, called directly by setup_arch(), is setup_processor(), found in .../arch/arm/kernel/setup.c. This function verifies the CPU ID and revision, calls CPU-specific initialization functions, and displays several lines of information on the console during boot.
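To make "verifies the CPU ID and revision" more concrete, here is a minimal sketch of a mask-and-compare lookup over a table of known processors. It assumes the common ARM main ID register layout, in which the low four bits hold the revision. The structure, the function names, and the second table entry are invented for illustration; the real kernel tables live in the per-processor support code and carry many more fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down processor descriptor. */
struct proc_info_sketch {
    uint32_t cpu_val;    /* expected ID bits after masking */
    uint32_t cpu_mask;   /* which ID bits must match */
    const char *name;
};

static const struct proc_info_sketch proc_table_sketch[] = {
    { 0x41069260u, 0xff0ffff0u, "ARM926EJ-S" },
    { 0x69000000u, 0xff000000u, "made-up core (illustration only)" },
};

/* Return the first table entry matching cpu_id, or NULL if none does. */
const struct proc_info_sketch *lookup_processor_sketch(uint32_t cpu_id)
{
    size_t i;

    for (i = 0; i < sizeof(proc_table_sketch) / sizeof(proc_table_sketch[0]); i++) {
        const struct proc_info_sketch *p = &proc_table_sketch[i];

        if ((cpu_id & p->cpu_mask) == p->cpu_val)
            return p;
    }
    return NULL;
}

/* The revision is the low four bits of the ID value. */
unsigned int cpu_revision_sketch(uint32_t cpu_id)
{
    return cpu_id & 0xfu;
}
```

With an ID of 0x41069265, the lookup returns the first entry and cpu_revision_sketch() yields 5; an ID that matches no entry returns NULL, which corresponds to the "unsupported processor" case.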

2. Kernel command line

__setup macro

/*
 * Set up a list of consoles. Called from init/main.c
 */
static int __init console_setup(char *str)
{
	char name[sizeof(console_cmdline[0].name)];
	char *s, *options;
	int idx;

	/*
	 * Decode str into name, index, options.
	 */

	return 1;
}

__setup("console=", console_setup);

You can think of this macro as a registration function for the kernel command-line console parameter. In effect, it says: when the console= string is encountered on the kernel command line, invoke the function represented by the second __setup macro argument, in this case the console_setup() function.

In this way, the __setup macro associates a handler function with each command-line parameter.

How the __setup macro is implemented

#define __setup_param(str, unique_id, fn, early)			\
	static char __setup_str_##unique_id[] __initdata = str;	\
	static struct obs_kernel_param __setup_##unique_id		\
		__attribute_used__					\
		__attribute__((__section__(".init.setup")))		\
		__attribute__((aligned((sizeof(long)))))		\
		= { __setup_str_##unique_id, fn, early }

#define __setup_null_param(str, unique_id)			\
	__setup_param(str, unique_id, NULL, 0)

#define __setup(str, fn)					\
	__setup_param(str, fn, fn, 0)

struct obs_kernel_param {
	const char *str;
	int (*setup_func)(char *);
	int early;
};

The following paragraph explains what the example above means:

First, the compiler generates an array of characters called __setup_str_console_setup[] initialized to contain the string console=. Next, the compiler generates a structure that contains three members: a pointer to the kernel command line string (the array just declared), the pointer to the setup function itself, and a simple flag. The key to the magic here is the section attribute attached to the structure. This attribute instructs the compiler to emit this structure into a special section within the ELF object module, called .init.setup. During the link stage, all the structures defined using the __setup macro are collected and placed into this .init.setup section, in effect creating an array of these structures.

The most important point here is .init.setup: it is a section defined in vmlinux.lds.S, and it behaves much like a global array.
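The "section as a global array" idea can be tried outside the kernel. The sketch below is a user-space imitation, not kernel code: it assumes GCC or Clang with a GNU-style ELF linker, which defines __start_<name> and __stop_<name> symbols for any section whose name is a valid C identifier. The section name setup_sketch, the macro SETUP_SKETCH, and all function names are invented for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical user-space analogue of struct obs_kernel_param. */
struct param_sketch {
    const char *str;              /* parameter prefix, e.g. "console=" */
    int (*setup_func)(char *);    /* handler for the rest of the token */
};

/* Emit one descriptor per parameter into the ELF section "setup_sketch". */
#define SETUP_SKETCH(prefix, fn)                                  \
    static struct param_sketch setup_sketch_##fn                  \
        __attribute__((used, section("setup_sketch"),             \
                       aligned(sizeof(long)))) = { prefix, fn }

/* Provided by the linker: the bounds of everything placed in the section. */
extern struct param_sketch __start_setup_sketch[];
extern struct param_sketch __stop_setup_sketch[];

char console_name_sketch[32];
int debug_enabled_sketch;

static int console_setup_sketch(char *value)
{
    snprintf(console_name_sketch, sizeof(console_name_sketch), "%s", value);
    return 1;
}
SETUP_SKETCH("console=", console_setup_sketch);

static int debug_setup_sketch(char *value)
{
    (void)value;
    debug_enabled_sketch = 1;
    return 1;
}
SETUP_SKETCH("debug", debug_setup_sketch);

/* Walk the linker-built array; return 1 if a handler accepted the token. */
int dispatch_param_sketch(char *token)
{
    struct param_sketch *p;

    for (p = __start_setup_sketch; p < __stop_setup_sketch; p++) {
        size_t n = strlen(p->str);

        if (strncmp(token, p->str, n) == 0 && p->setup_func(token + n))
            return 1;
    }
    return 0;
}
```

Note that no code ever builds the table explicitly: each SETUP_SKETCH() line contributes one element, and the linker concatenates them. The explicit alignment mirrors the aligned attribute in the kernel macro, which keeps the compiler from padding between entries.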

The following kernel function shows how the kernel uses the data in this section:

extern struct obs_kernel_param __setup_start[], __setup_end[];

static int __init obsolete_checksetup(char *line)
{
	struct obs_kernel_param *p;

	p = __setup_start;
	do {
		int n = strlen(p->str);
		if (!strncmp(line, p->str, n)) {
			if (p->early) {
				/* Already done in parse_early_param? (Needs
				 * exact match on param part) */
				if (line[n] == '\0' || line[n] == '=')
					return 1;
			} else if (!p->setup_func) {
				printk(KERN_WARNING "Parameter %s is obsolete,"
				       " ignored\n", p->str);
				return 1;
			} else if (p->setup_func(line + n))
				return 1;
		}
		p++;
	} while (p < __setup_end);
	return 0;
}

The actual call chain is:

start_kernel()->parse_args()->parse_one()->unknown_bootoption()->obsolete_checksetup()

If the last argument passed to __setup_param() (the flag) is 1, that is, the early field is 1, the call chain is:

start_kernel()->parse_early_param()->parse_args()->parse_one()->do_early_param()
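The two call chains amount to two passes over the same command line: one that runs only the handlers marked early, and a later one that runs the rest. The following user-space sketch shows that ordering under simplifying assumptions (space-separated tokens, no quoting, and the second pass simply skips early entries). Every name ending in _sketch is hypothetical, and the parameter table is an ordinary array rather than a linker section.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct obs_kernel_param. */
struct kparam_sketch {
    const char *str;
    int (*setup_func)(char *);
    int early;
};

/* Records the order in which handlers ran. */
char call_log_sketch[128];

static void log_call(const char *what)
{
    size_t used = strlen(call_log_sketch);

    snprintf(call_log_sketch + used, sizeof(call_log_sketch) - used, "%s;", what);
}

static int mem_setup_sketch(char *value)
{
    (void)value;
    log_call("early:mem");
    return 1;
}

static int console_setup_sketch(char *value)
{
    (void)value;
    log_call("console");
    return 1;
}

static const struct kparam_sketch params_sketch[] = {
    { "console=", console_setup_sketch, 0 },
    { "mem=",     mem_setup_sketch,     1 },
};

/* One pass over the command line, running only handlers whose early flag matches. */
static void parse_pass_sketch(const char *cmdline, int early)
{
    char buf[256];
    char *token;
    size_t i;

    snprintf(buf, sizeof(buf), "%s", cmdline);   /* strtok() modifies its input */
    for (token = strtok(buf, " "); token != NULL; token = strtok(NULL, " ")) {
        for (i = 0; i < sizeof(params_sketch) / sizeof(params_sketch[0]); i++) {
            const struct kparam_sketch *p = &params_sketch[i];
            size_t n = strlen(p->str);

            if (p->early == early && strncmp(token, p->str, n) == 0)
                p->setup_func(token + n);
        }
    }
}

void boot_sketch(const char *cmdline)
{
    call_log_sketch[0] = '\0';
    parse_pass_sketch(cmdline, 1);   /* early parameters first */
    parse_pass_sketch(cmdline, 0);   /* then the ordinary ones */
}
```

Calling boot_sketch("console=ttyS0 mem=64M") runs the mem= handler before the console= handler even though console= appears first on the line, which is the point of the early flag.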

Passing debug on the kernel command line turns on kernel debug output on the console (console_loglevel=10).

3. Subsystem Initialization

The *_initcall macros are defined as follows:

#define __define_initcall(level,fn,id) \
	static initcall_t __initcall_##fn##id __used \
	__attribute__((__section__(".initcall" level ".init"))) = fn

/*
 * A "pure" initcall has no dependencies on anything else, and purely
 * initializes variables that couldn't be statically initialized.
 *
 * This only exists for built-in code, not for modules.
 */
#define pure_initcall(fn)		__define_initcall("0",fn,0)

#define core_initcall(fn)		__define_initcall("1",fn,1)
#define core_initcall_sync(fn)		__define_initcall("1s",fn,1s)
#define postcore_initcall(fn)		__define_initcall("2",fn,2)
#define postcore_initcall_sync(fn)	__define_initcall("2s",fn,2s)
#define arch_initcall(fn)		__define_initcall("3",fn,3)
#define arch_initcall_sync(fn)		__define_initcall("3s",fn,3s)
#define subsys_initcall(fn)		__define_initcall("4",fn,4)
#define subsys_initcall_sync(fn)	__define_initcall("4s",fn,4s)
#define fs_initcall(fn)			__define_initcall("5",fn,5)
#define fs_initcall_sync(fn)		__define_initcall("5s",fn,5s)
#define rootfs_initcall(fn)		__define_initcall("rootfs",fn,rootfs)
#define device_initcall(fn)		__define_initcall("6",fn,6)
#define device_initcall_sync(fn)	__define_initcall("6s",fn,6s)
#define late_initcall(fn)		__define_initcall("7",fn,7)
#define late_initcall_sync(fn)		__define_initcall("7s",fn,7s)

#define __initcall(fn) device_initcall(fn)

#define __exitcall(fn) \
	static exitcall_t __exitcall_##fn __exit_call = fn

#define console_initcall(fn) \
	static initcall_t __initcall_##fn \
	__used __section(.con_initcall.init) = fn

#define security_initcall(fn) \
	static initcall_t __initcall_##fn \
	__used __section(.security_initcall.init) = fn

/**
 * module_init() - driver initialization entry point
 * @x: function to be run at kernel boot time or module insertion
 *
 * module_init() will either be called during do_initcalls() (if
 * builtin) or at module insertion time (if a module). There can only
 * be one per module.
 */
#define module_init(x)	__initcall(x);

/**
 * module_exit() - driver exit entry point
 * @x: function to be run when driver is removed
 *
 * module_exit() will wrap the driver clean-up code
 * with cleanup_module() when used with rmmod when
 * the driver is a module. If the driver is statically
 * compiled into the kernel, module_exit() has no effect.
 * There can only be one per module.
 */
#define module_exit(x)	__exitcall(x);

The *_initcall macros work on the same principle as the __setup macro: these macros declare a data item based on the name of the function, and use the section attribute to place this data item into a uniquely named section of the vmlinux ELF file.

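The sections that hold this data are named .initcallN.init, where N indicates the call priority of the section: the smaller N is, the earlier its functions are called.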
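The effect of the level number can be shown with a small user-space sketch. In the kernel the ordering comes from how the linker script lays out the .initcallN.init sections; the sketch below only imitates the result by looping over the levels of an ordinary array, so the table, the recording variables, and every _sketch name are assumptions made for illustration.

```c
#include <stddef.h>

typedef int (*initcall_t)(void);

/* Hypothetical table entry: a level number plus the function to call. */
struct initcall_entry_sketch {
    int level;
    initcall_t fn;
};

/* Records the level of each function in the order it actually ran. */
int call_order_sketch[8];
int calls_made_sketch;

static int record(int level)
{
    call_order_sketch[calls_made_sketch++] = level;
    return 0;
}

static int late_fn(void)   { return record(7); }
static int device_fn(void) { return record(6); }
static int core_fn(void)   { return record(1); }
static int arch_fn(void)   { return record(3); }

/* Deliberately listed out of order, as unrelated source files would be. */
static const struct initcall_entry_sketch initcalls_sketch[] = {
    { 7, late_fn },
    { 6, device_fn },
    { 1, core_fn },
    { 3, arch_fn },
};

/* Run level 0 first and level 7 last; entries within a level keep table order. */
void do_initcalls_sketch(void)
{
    int level;
    size_t i;

    calls_made_sketch = 0;
    for (level = 0; level <= 7; level++) {
        for (i = 0; i < sizeof(initcalls_sketch) / sizeof(initcalls_sketch[0]); i++) {
            if (initcalls_sketch[i].level == level)
                initcalls_sketch[i].fn();
        }
    }
}
```

After do_initcalls_sketch() returns, call_order_sketch holds 1, 3, 6, 7: the core-level function ran first and the late-level one last, regardless of where each was declared.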

In case you were wondering about the debug print statements in do_initcalls(), you can watch these calls being executed during bootup by setting the kernel command line parameter initcall_debug. This command line parameter enables the printing of the debug information of initcalls.
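As a rough picture of what such a debug switch does, the sketch below wraps a single initcall with optional "before" and "after" messages and a coarse timing measurement. It is a user-space approximation: the flag initcall_debug_sketch, the function names, and the message wording are invented here and are not the kernel's actual output format.

```c
#include <stdio.h>
#include <time.h>

typedef int (*initcall_t)(void);

/* Hypothetical stand-in for the flag set by the command-line parameter. */
int initcall_debug_sketch;

/* Example initcall used only to exercise the wrapper. */
int demo_init_sketch(void)
{
    return 0;
}

/* Call fn, optionally reporting its name, return value and CPU time used. */
int do_one_initcall_sketch(initcall_t fn, const char *name)
{
    clock_t start = 0;
    int ret;

    if (initcall_debug_sketch) {
        printf("calling %s\n", name);
        start = clock();
    }

    ret = fn();

    if (initcall_debug_sketch) {
        double usecs = (double)(clock() - start) * 1e6 / CLOCKS_PER_SEC;

        printf("%s returned %d after %.0f usecs\n", name, ret, usecs);
    }
    return ret;
}
```

With the flag left at 0 the wrapper is silent and only forwards the return value, which is why this kind of tracing can stay in the code permanently.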

The call chain that actually invokes the functions registered with these macros is:

start_kernel()->rest_init()->kernel_init()->do_basic_setup()->do_initcalls()->do_one_initcall()

As this shows, by the time the functions registered with the *_initcall macros are called, start_kernel() has almost reached its end.

4. The init Thread

Origin of the init and idle threads: After start_kernel() performs some basic kernel initialization, calling early initialization functions explicitly by name, the very first kernel thread is spawned. This thread eventually becomes the kernel thread called init(), with a process id (PID) of 1. As you will learn, init() becomes the parent of all Linux processes in user space. At this point in the boot sequence, two distinct threads are running: that represented by start_kernel() and now init(). The former goes on to become the idle process, having completed its work. The latter becomes the init process.

It is implemented as follows:

static void noinline __init_refok rest_init(void)
	__releases(kernel_lock)
{
	int pid;

	kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);
	numa_default_policy();
	pid = kernel_thread(kthreadd, NULL, CLONE_FS | CLONE_FILES);
	kthreadd_task = find_task_by_pid_ns(pid, &init_pid_ns);
	unlock_kernel();

	/*
	 * The boot idle thread must execute schedule()
	 * at least once to get things moving:
	 */
	init_idle_bootup_task(current);
	preempt_enable_no_resched();
	schedule();
	preempt_disable();

	/* Call into cpu_idle with preempt disabled */
	cpu_idle();
}

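As the code shows, there are actually three threads. The first is the thread that start_kernel() itself runs in, which ends by calling cpu_idle().

The second is the kernel_init thread, which eventually calls run_init_process() and becomes the init process.

The third is the kthreadd thread, which is used to start other kernel threads (these are created through kthread_create()).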

Reference:

Embedded Linux Primer: A Practical Real-World Approach (Sep. 2006)