Debugging a Kernel Oops
Source: Internet  Editor: 程序博客网  Date: 2024/06/06 19:15
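A kernel Oops is roughly the kernel-space counterpart of a user-space segmentation fault (segfault). When one occurs, the kernel usually dumps the CPU registers and the call stack, and that information is enough to track down the offending code.
The example below walks through the process.
1. First, write a simple kernel module: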
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/init.h>

static void create_oops() {
    *(int *)0 = 0;
}

static int __init my_oops_init(void) {
    printk("oops from the module\n");
    create_oops();
    return (0);
}

static void __exit my_oops_exit(void) {
    printk("Goodbye world\n");
}

module_init(my_oops_init);
module_exit(my_oops_exit);

Clearly, this module will fault as soon as it is loaded, because its init function writes through a NULL pointer.
Save this code as oops.c in a directory named oops.
Then build it:
export ARCH=arm
export CROSS_COMPILE=arm-linux-gnueabi-
make -C /home/charles/code/linux-3.2 M=`pwd` modules
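Alternatively, write a Makefile like the following. The kernel tree passed to -C must be the one the target kernel was built from (the log below comes from a 3.2.0 kernel):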
obj-m := oops.o
ARCH = arm
CROSS_COMPILE = arm-linux-gnueabi-
EXTRA_CFLAGS = -g -O0

all:
	make ARCH=$(ARCH) CROSS_COMPILE=$(CROSS_COMPILE) -C $(HOME)/code/linux-3.10.28 M=$(PWD) modules

clean:
	make ARCH=$(ARCH) CROSS_COMPILE=$(CROSS_COMPILE) -C $(HOME)/code/linux-3.10.28 M=$(PWD) clean
The build produces a set of files:
:~/code/oops$ ls
Makefile  Module.symvers  oops.ko  oops.mod.o
modules.order  oops.c  oops.mod.c  oops.o
:~/code/oops$ cat Makefile
obj-m := oops.o
Next, copy oops.ko to /lib/modules/3.2.0/ on the target machine (in this case a QEMU virtual machine):
~ # ls /lib/modules/3.2.0/
oops.ko
Then load the module:
~ # modprobe oops
Disabling lock debugging due to kernel taint
oops: module license 'unspecified' taints kernel.
oops from the module
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = 8738c000
[00000000] *pgd=673c6831, *pte=00000000, *ppte=00000000
Internal error: Oops: 817 [#1] SMP
Modules linked in: oops(P+)
CPU: 0 Tainted: P O (3.2.0 #1)
PC is at my_oops_init+0x10/0x1c [oops]
LR is at my_oops_init+0xc/0x1c [oops]
pc : [<7f002010>] lr : [<7f00200c>] psr: 60000013
sp : 873c5eb0 ip : 88820000 fp : 7f002000
r10: 873c4000 r9 : 8046d100 r8 : 0000001c
r7 : 00000001 r6 : 873f7a80 r5 : 7f000074 r4 : 7f000074
r3 : 804554ac r2 : 804554ac r1 : 60000093 r0 : 00000000
Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
Control: 10c53c7d Table: 6738c06a DAC: 00000015
Process modprobe (pid: 474, stack limit = 0x873c42f0)
Stack: (0x873c5eb0 to 0x873c6000)
5ea0: 00000000 80008678 80572140 8009cc3c
5ec0: 00000000 00000000 870834a0 88890000 00000001 80058ec0 7f0000bc 7f000074
5ee0: 7f000074 873f7a80 00000001 0000001c 7f0000bc 00000024 000000b2 80058588
5f00: 7f000080 000b70ca 0001d9c1 00000000 800567d4 000a2974 7f0001b0 873c4000
5f20: 00000068 00000000 00000000 00000000 00000000 00000000 00000000 00000000
5f40: 88890000 00005d02 888942a0 8889410e 88895c50 870834e0 000001c4 00000214
5f60: 00000000 00000000 00000025 00000026 0000000f 00000000 0000000d 00000000
5f80: 00000004 000b70ca 000c30e8 00000000 00000080 8000e2a8 873c4000 00000000
5fa0: 0001d9c1 8000e100 000b70ca 000c30e8 000c30e8 00005d02 000a2974 00000000
5fc0: 000b70ca 000c30e8 00000000 00000080 000b70d8 7ec5ff80 000b70ca 0001d9c1
5fe0: 2acc76a0 7ec5f990 0001d359 2acc76b0 800d0010 000c30e8 00000000 00000000
[<7f002010>] (my_oops_init+0x10/0x1c [oops]) from [<80008678>] (do_one_initcall+0xfc/0x164)
[<80008678>] (do_one_initcall+0xfc/0x164) from [<80058588>] (sys_init_module+0xd10/0x1a60)
[<80058588>] (sys_init_module+0xd10/0x1a60) from [<8000e100>] (ret_fast_syscall+0x0/0x30)
Code: e92d4008 e59f000c eb4c4240 e3a00000 (e5800000)
---[ end trace a9cf7df06d0f6920 ]---
Segmentation fault
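The dump shows the values of the pc, lr (link register) and sp registers, as well as the call stack.
my_oops_init+0x10/0x1c has the form symbol+offset/length: the faulting instruction is 0x10 bytes into my_oops_init, which is 0x1c bytes long.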
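The symbol+offset/length notation is regular enough to pull apart mechanically. The following is a minimal Python sketch and not part of the original article: the helper name parse_location and its regular expression are illustrative assumptions, and it has only been checked against the format seen in the log above.

```python
import re

# Matches e.g. "my_oops_init+0x10/0x1c [oops]" or "do_one_initcall+0xfc/0x164".
# The trailing "[module]" part is optional: built-in kernel symbols have none.
LOCATION_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+0x(?P<offset>[0-9a-fA-F]+)"
    r"/0x(?P<size>[0-9a-fA-F]+)"
    r"(?:\s+\[(?P<module>[^\]]+)\])?"
)


def parse_location(text):
    """Return (symbol, offset, size, module) for the first location in text.

    Hypothetical helper for illustration; module is None for built-in symbols.
    """
    match = LOCATION_RE.search(text)
    if match is None:
        raise ValueError("no symbol+offset/length found in: %r" % text)
    return (
        match.group("symbol"),
        int(match.group("offset"), 16),
        int(match.group("size"), 16),
        match.group("module"),
    )


if __name__ == "__main__":
    # One register line and one backtrace entry copied from the log.
    print(parse_location("PC is at my_oops_init+0x10/0x1c [oops]"))
    print(parse_location("(do_one_initcall+0xfc/0x164)"))
```

The same helper can be applied to each entry of the backtrace to recover the call chain as (symbol, offset) pairs.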
2. Now start debugging.
First, on the host machine, load the module into gdb:
$ arm-linux-gnueabi-gdb oops.ko
GNU gdb (crosstool-NG linaro-1.13.1-2012.04-20120426 - Linaro GCC 2012.04) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=i686-build_pc-linux-gnu --target=arm-linux-gnueabi".
For bug reporting instructions, please see:
<https://bugs.launchpad.net/gcc-linaro>...
Reading symbols from /home/charles/code/oops/oops.ko...done.
Then add the symbol file:
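(gdb) add-symbol-file oops.ko 0x7f002000
add symbol table from file "oops.ko" at
    .text_addr = 0x7f002000
(y or n) y
Reading symbols from /home/charles/code/oops/oops.ko...done.
0x7f002000 is the address at which the module's code was loaded. Because my_oops_init is marked __init, it lives in the .init.text section, whose address can be read on the target as follows:
~ # cat /sys/module/oops/sections/.init.text
0x7f002000
The pc value identifies the function in which the fault occurred; disassemble it: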
(gdb) disassemble my_oops_init
Dump of assembler code for function my_oops_init:
   0x00000000 <+0>:     push    {r3, lr}
   0x00000004 <+4>:     ldr     r0, [pc, #12]   ; 0x18 <my_oops_init+24>
   0x00000008 <+8>:     bl      0x8 <my_oops_init+8>
   0x0000000c <+12>:    mov     r0, #0
   0x00000010 <+16>:    str     r0, [r0]
   0x00000014 <+20>:    pop     {r3, pc}
   0x00000018 <+24>:    andeq   r0, r0, r0
End of assembler dump.
Using the offset 0x10 reported in the oops, the instruction being executed when the fault occurred is at:
0x00000000 + 0x10 = 0x00000010, which is str r0, [r0]
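The same offset can also be derived from the raw register value: subtracting the .init.text load address from pc gives 0x10. A small Python sketch of that arithmetic follows; the function names are made up for illustration, and the input values are the ones printed in the log and by the sysfs file.

```python
def parse_sysfs_address(text):
    """Parse the contents of a /sys/module/<name>/sections/<section> file."""
    return int(text.strip(), 16)


def offset_in_section(pc, section_base):
    """Return how far pc lies past the start of the loaded section."""
    if pc < section_base:
        raise ValueError("pc is below the section base")
    return pc - section_base


if __name__ == "__main__":
    base = parse_sysfs_address("0x7f002000\n")  # .init.text of oops.ko
    pc = 0x7F002010                              # "pc : [<7f002010>]"
    print(hex(offset_in_section(pc, base)))
```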
(gdb) l *0x00000010
0x10 is in my_oops_init (/home/charles/code/oops/oops.c:6).
1       #include <linux/kernel.h>
2       #include <linux/module.h>
3       #include <linux/init.h>
4
5       static void create_oops() {
6           *(int *)0 = 0;
7       }
8
9       static int __init my_oops_init(void) {
10          printk("oops from the module\n");
That is, the fault is at line 6.
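The Code: line of the oops offers a cross-check. It dumps the instruction words around pc, and the word in parentheses is the one at pc itself. The Python sketch below extracts that word; faulting_instruction is a hypothetical helper name, and the 4-byte instruction size assumes ARM (not Thumb) mode, as the log reports with "ISA ARM".

```python
def faulting_instruction(code_line, insn_bytes=4):
    """Return (word, bytes_before_pc) from an oops "Code:" line.

    word is the parenthesized instruction word as an int; bytes_before_pc is
    how many bytes of code the dump shows ahead of it. insn_bytes=4 is an
    assumption that holds for ARM mode only.
    """
    words = code_line.split(":", 1)[1].split()
    for index, word in enumerate(words):
        if word.startswith("(") and word.endswith(")"):
            return int(word.strip("()"), 16), index * insn_bytes
    raise ValueError("no parenthesized word in Code: line")


if __name__ == "__main__":
    line = "Code: e92d4008 e59f000c eb4c4240 e3a00000 (e5800000)"
    word, before = faulting_instruction(line)
    pc_offset = 0x10  # from "PC is at my_oops_init+0x10/0x1c"
    # Offset within the function at which the dump starts.
    print(hex(word), hex(pc_offset - before))
```

Here the faulting word is 0xe5800000, the ARM encoding of str r0, [r0], and the four words before it cover offsets 0x0 to 0xc, matching the first four instructions of the disassembly.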
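This approach actually makes things more complicated than necessary: there is no need to know where the oops module was loaded in the kernel.
The my_oops_init+0x10/0x1c entry alone is enough to locate the offending line inside my_oops_init: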
$ arm-linux-gnueabi-gdb oops.ko
GNU gdb (crosstool-NG linaro-1.13.1-2012.04-20120426 - Linaro GCC 2012.04) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=i686-build_pc-linux-gnu --target=arm-linux-gnueabi".
For bug reporting instructions, please see:
<https://bugs.launchpad.net/gcc-linaro>...
Reading symbols from /home/charles/code/oops/oops.ko...done.
(gdb) disassemble my_oops_
my_oops_exit  my_oops_init
(gdb) disassemble my_oops_init
Dump of assembler code for function my_oops_init:
   0x00000000 <+0>:     push    {r3, lr}
   0x00000004 <+4>:     ldr     r0, [pc, #12]   ; 0x18 <my_oops_init+24>
   0x00000008 <+8>:     bl      0x8 <my_oops_init+8>
   0x0000000c <+12>:    mov     r0, #0
   0x00000010 <+16>:    str     r0, [r0]
   0x00000014 <+20>:    pop     {r3, pc}
   0x00000018 <+24>:    andeq   r0, r0, r0
End of assembler dump.
(gdb) print /x 0x00000000+0x10
$1 = 0x10
(gdb) list *0x10
0x10 is in my_oops_init (/home/charles/code/oops/oops.c:6).
1       #include <linux/kernel.h>
2       #include <linux/module.h>
3       #include <linux/init.h>
4
5       static void create_oops() {
6           *(int *)0 = 0;
7       }
8
9       static int __init my_oops_init(void) {
10          printk("oops from the module\n");
(gdb)
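The interactive session above can probably be reduced to a single command, since gdb accepts an expression such as *(my_oops_init+0x10) as the argument to list. The Python sketch below only builds that command line; gdb_list_command is a hypothetical helper, the gdb binary name is the cross-debugger used in this article, and the approach assumes the module was built with debug information.

```python
import shlex


def gdb_list_command(module_file, symbol, offset, gdb="arm-linux-gnueabi-gdb"):
    """Build a non-interactive gdb invocation that lists source at symbol+offset."""
    expression = "list *(%s+%s)" % (symbol, hex(offset))
    # -batch exits after running the commands; -ex supplies one command.
    return [gdb, "-batch", "-ex", expression, module_file]


if __name__ == "__main__":
    argv = gdb_list_command("oops.ko", "my_oops_init", 0x10)
    # Print a shell-safe form that can be pasted into a terminal.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Returning an argument list rather than a string keeps the helper usable with subprocess.run without any shell quoting concerns.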
References:
1. http://www.linuxforu.com/2011/01/understanding-a-kernel-oops/