Linux kernel crash cases: a summary
Source: Internet. Editor: 程序博客网. Time: 2024/05/21 12:18
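In kernel development, apart from dealing with the kernel's data structures and APIs, the thing you run into most often is probably a kernel crash of one kind or another. This article summarizes several problems encountered in real projects, as a memo.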
1. lockup
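The Linux kernel divides lockups into two classes:
soft lockup: the current CPU stays in kernel mode so long that other tasks get no chance to run;
hard lockup: the current CPU stays in kernel mode so long that interrupts get no chance to run.
For a detailed description, see Documentation/lockup-watchdogs.txt in the kernel source tree.
In kernel development, misusing a spinlock, misusing an API that takes a spinlock internally, or forgetting to re-enable interrupts after disabling them (for example a disable_irq without the matching enable_irq) can easily cause a lockup. This is fairly common: a code path may run in process context in some cases and in interrupt or exception context in others, and one careless moment ends in tragedy. Besides developers staying alert, the Linux kernel ships its own lockup detection tool, the watchdog. For the two classes above, the kernel uses a high-resolution timer (soft lockup) and the NMI delivered by the perf subsystem (hard lockup) to detect and report the problem.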
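soft lockup detector
Like a microcontroller watchdog, the dog has to be touched periodically, otherwise it barks. In the kernel, the watchdog kernel thread does the periodic touch and the hrtimer interrupt handler does the periodic check. In pseudocode: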
/* (1). watchdog thread */
static int watchdog(void *unused)
{
    while (true) {
        /* --------- touch -------- */
        watchdog_touch_ts = current_timestamp;
        /* wait for hrtimer to wakeup */
        sleep();
        schedule();
    }
}

/* (2). hrtimer handler */
static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
{
    /* --------- check -------- */
    if ((current_timestamp - watchdog_touch_ts) > softlockup_thresh)
        warn_or_panic();
    /* used for hard lockup detecting */
    hrtimer_interrupts++;
    wake_up_process(watchdog_thread);
    return HRTIMER_RESTART;
}
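The principle is not complicated. Note that everything above is per-CPU, both the variables and the kernel thread. Consider this case: a spinlock bug makes a CPU spin forever while interrupts on that CPU are still enabled. The hrtimer can still fire, but the watchdog kernel thread never gets to run, so the check above raises the alarm. The hrtimer_interrupts counter is there to help hard lockup detection.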
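To make the touch/check split concrete, here is a minimal user-space sketch in C. It is not kernel code: the struct cpu_watchdog type and the watchdog_touch() and softlockup_check() helpers are names invented for this illustration, and timestamps are plain integers in seconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU state; the field name mirrors the pseudocode above. */
struct cpu_watchdog {
    uint64_t watchdog_touch_ts;   /* last time the watchdog thread ran (seconds) */
};

/* What the watchdog thread does each time it gets scheduled. */
static void watchdog_touch(struct cpu_watchdog *wd, uint64_t now)
{
    wd->watchdog_touch_ts = now;
}

/* What the hrtimer handler does: returns true when a soft lockup is detected. */
static bool softlockup_check(const struct cpu_watchdog *wd, uint64_t now,
                             uint64_t softlockup_thresh)
{
    assert(now >= wd->watchdog_touch_ts);   /* time must not run backwards */
    return (now - wd->watchdog_touch_ts) > softlockup_thresh;
}
```

With a threshold of 60 and a last touch at t = 100, a check at t = 160 stays quiet and a check at t = 161 reports a lockup. The timer keeps firing in both cases; only the thread that should have refreshed the timestamp is missing.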
hard lockup detector
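Obviously, a hard lockup means interrupts are masked, so the timer approach above cannot be used for detection. One kind of interrupt, however, cannot be masked: the NMI. The kernel uses the perf subsystem's interrupt here (why this one? My guess is that it simply fits: the interrupt has to be periodic, and perf meets that need). The idea: to detect whether a CPU has had interrupts disabled all along, check whether that CPU took any interrupt within a given period. Going one step further, the hrtimer above is itself an interrupt, so counting hrtimer interrupts over that period is enough. In pseudocode: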
/* perf nmi handler */
static void watchdog_overflow_callback(struct perf_event *event, int nmi, ...)
{
    if (hrtimer_interrupts_saved == hrtimer_interrupts)
        warn_or_panic();
    hrtimer_interrupts_saved = hrtimer_interrupts;
}
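The counter comparison can be tried out in user space as well. The following C sketch is only a simulation under stated assumptions: struct cpu_hardlockup_state, hrtimer_tick() and nmi_check() are hypothetical names, and the "NMI" is an ordinary function call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU counters, named after the pseudocode above. */
struct cpu_hardlockup_state {
    unsigned long hrtimer_interrupts;        /* bumped by the hrtimer handler */
    unsigned long hrtimer_interrupts_saved;  /* value seen at the previous NMI */
};

/* Stands in for the hrtimer handler: it only runs when interrupts are enabled. */
static void hrtimer_tick(struct cpu_hardlockup_state *s)
{
    s->hrtimer_interrupts++;
}

/* Stands in for the NMI handler: returns true when no timer interrupt
 * arrived since the previous NMI, i.e. a hard lockup is suspected. */
static bool nmi_check(struct cpu_hardlockup_state *s)
{
    assert(s != NULL);
    bool stuck = (s->hrtimer_interrupts_saved == s->hrtimer_interrupts);
    s->hrtimer_interrupts_saved = s->hrtimer_interrupts;
    return stuck;
}
```

As long as at least one hrtimer_tick() happens between two nmi_check() calls, the check stays quiet; two consecutive checks with no tick in between report a lockup.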
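Related parameters
The detection interval mentioned above, whether to panic, and so on can all be changed through sysctl. Some of the related parameters: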
kernel.watchdog = 1
kernel.watchdog_thresh = 60
kernel.softlockup_panic = 0
kernel.nmi_watchdog = 1
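Each sysctl name corresponds to a file under /proc/sys, with the dots replaced by slashes, so the values can also be read from a program. The C sketch below shows one way to do that; sysctl_to_proc_path() and read_sysctl_long() are helper names made up for this example, and which knobs actually exist depends on the kernel version and configuration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Convert "kernel.watchdog_thresh" into "/proc/sys/kernel/watchdog_thresh".
 * Returns 0 on success, -1 if the output buffer is too small. */
static int sysctl_to_proc_path(const char *name, char *out, size_t outlen)
{
    assert(name != NULL && out != NULL);
    int n = snprintf(out, outlen, "/proc/sys/%s", name);
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    /* Only the part after the fixed prefix is translated. */
    for (char *p = out + strlen("/proc/sys/"); *p; p++)
        if (*p == '.')
            *p = '/';
    return 0;
}

/* Read an integer sysctl value. Returns 0 on success, -1 if the knob
 * does not exist, cannot be opened, or does not hold a number. */
static int read_sysctl_long(const char *name, long *value)
{
    char path[256];
    if (sysctl_to_proc_path(name, path, sizeof(path)) != 0)
        return -1;
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    int ok = (fscanf(f, "%ld", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

A caller would pass a name such as "kernel.watchdog_thresh" to read_sysctl_long() and treat a -1 return as "not available on this system".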
example
Here is a simple soft lockup example: the module init function takes the same spinlock twice in a row, which deadlocks.
/* file: soft_lockup.c */
#include <linux/module.h>
#include <linux/spinlock.h>

static int __init lockup_init(void)
{
    spinlock_t lock;

    spin_lock_init(&lock);
    spin_lock(&lock);
    /* -- BUG -- */
    spin_lock(&lock);
    return 0;
}

static void lockup_exit(void)
{
    /* do nothing */
}

module_init(lockup_init);
module_exit(lockup_exit);
Makefile:
# file: Makefile
ifneq ($(KERNELRELEASE),)
obj-m += soft_lockup.o
ccflags-y := -Wall
else
KDIR := /lib/modules/`uname -r`/build
PWD := $(shell pwd)
all:
	$(MAKE) -C $(KDIR) M=$(PWD) modules
clean:
	$(MAKE) -C $(KDIR) M=$(PWD) clean
endif
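Run sudo insmod soft_lockup.ko in a virtual machine. The process hangs, of course, and after a while the kernel prints the messages below; the call stack makes the problem easy to locate. In real development, though, the call stack in a soft lockup warning is usually not the root cause. In that case you generally have to wait patiently; if a hard lockup panic follows, its call stack deserves a close look, since that is usually where the bug is.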
[ 348.143996] BUG: soft lockup - CPU#7 stuck for 67s! [insmod:777]
[ 348.144442] Modules linked in: lockup(P+)
[ 348.144442] CPU 7
[ 348.144442] Modules linked in: lockup(P+)
[ 348.144442]
[ 348.144442] Pid: 777, comm: insmod Tainted: P --------------- 2.6.32 #249 QEMU Standard PC (i440FX + PIIX, 1996)
[ 348.144442] RIP: 0010:[<ffffffff815825ce>] [<ffffffff815825ce>] _spin_lock+0x1e/0x30
[ 348.144442] RSP: 0018:ffff8809ef8c3ee8 EFLAGS: 00000206
[ 348.144442] RAX: 0000000000000000 RBX: ffff8809ef8c3ee8 RCX: 0000000000000000
[ 348.144442] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff8809ef8c3ef8
[ 348.144442] RBP: ffffffff8100da8e R08: 0000000000000000 R09: 0000000000000000
[ 348.144442] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 348.144442] R13: 0000000000000001 R14: ffffffff8158107e R15: ffff8809ef8c3e58
[ 348.144442] FS: 00007fac98f46700(0000) GS:ffff8800283c0000(0000) knlGS:0000000000000000
[ 348.144442] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 348.144442] CR2: 000000000142302f CR3: 00000009efafd000 CR4: 00000000000006e0
[ 348.144442] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 348.144442] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 348.144442] Process insmod (pid: 777, threadinfo ffff8809ef8c2000, task ffff8809efda5540)
[ 348.144442] Stack:
[ 348.144442] ffff8809ef8c3f18 ffffffffa0002024 ffff880900020000 ffffffffa0000020
[ 348.144442] <d> 0000000000000001 000000000140e010 ffff8809ef8c3f48 ffffffff81002046
[ 348.144442] <d> 000000000140e030 000000000140e010 000000000001582d ffffffffa0000020
[ 348.144442] Call Trace:
[ 348.144442] [<ffffffffa0002024>] ? lockup_init+0x24/0x2d [lockup]
[ 348.144442] [<ffffffff81002046>] ? do_one_initcall+0x36/0x1c0
[ 348.144442] [<ffffffff810b5a9f>] ? sys_init_module+0xef/0x260
[ 348.144442] [<ffffffff8100cf72>] ? system_call_fastpath+0x16/0x1b
[ 348.144442] Code: 00 00 00 01 74 05 e8 92 1d d3 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 ba 00 00 01 00 f0 0f c1 17 0f b7 c2 c1 ea 10 39 d0 74 0e f3 90 <0f> 1f 44 00 00 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89
[ 348.144442] Call Trace:
[ 348.144442] [<ffffffffa0002024>] ? lockup_init+0x24/0x2d [lockup]
[ 348.144442] [<ffffffff81002046>] ? do_one_initcall+0x36/0x1c0
[ 348.144442] [<ffffffff810b5a9f>] ? sys_init_module+0xef/0x260
[ 348.144442] [<ffffffff8100cf72>] ? system_call_fastpath+0x16/0x1b
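I originally wanted to build an example that disables interrupts at module init time and never re-enables them, but it failed. A look at the code shows why: during insmod, after the driver's init function returns, the kernel checks the CPU state, and if it finds interrupts disabled it enables them for you.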
int do_one_initcall(initcall_t fn)
{
    /* ... */
    if (irqs_disabled()) {
        strlcat(msgbuf, "disabled interrupts ", sizeof(msgbuf));
        local_irq_enable();
    }
    /* ... */
}
2. memory fault
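A memory fault is a memory access error: the kernel cannot find the physical address that corresponds to some virtual address, i.e. the MMU translation fails. Note that if the faulting address is below PAGE_SIZE (usually 4 KiB), the kernel log says "unable to handle kernel NULL pointer dereference"; otherwise it says "unable to handle kernel paging request" (see show_fault_oops()).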
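The choice between the two messages can be sketched in a few lines of user-space C. This only mirrors the comparison described above, it is not the kernel's code; fault_message() and SKETCH_PAGE_SIZE are names invented here, and a 4 KiB page size is assumed.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096UL   /* assumption: 4 KiB pages */

/* Pick the log text from the faulting address alone. */
static const char *fault_message(uint64_t address)
{
    if (address < SKETCH_PAGE_SIZE)
        return "unable to handle kernel NULL pointer dereference";
    return "unable to handle kernel paging request";
}
```

This is why dereferencing a small offset from a NULL structure pointer, such as address 12, is still reported as a NULL pointer dereference, while address 4096 and anything above it is reported as a paging request.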
First, build the simplest possible crash caused by an invalid pointer access. The module code is as follows.
/* file: mm_fault.c */
#include <linux/module.h>

static int __init mm_fault_init(void)
{
    *(int *)12 = 20;
    return 0;
}

static void mm_fault_exit(void)
{
    /* do nothing */
}

module_init(mm_fault_init);
module_exit(mm_fault_exit);
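Compile mm_fault.c, insmod the module, and the kernel panics. The dmesg output is shown below (the lines in C comment style were added by hand afterwards); it is fairly self-explanatory. First look at which functions print which parts of the output; this sometimes helps when analyzing and fixing a bug.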
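A brief walk through the flow: the access to address 12 (0xc) triggers a page fault and follows the call path do_page_fault()->bad_area()->__bad_area_nosemaphore(). An error code, error_code (enum x86_pf_error_code), describes the details of the exception, such as whether the access came from kernel mode or user mode and whether it was a read or a write. Had the address been accessed from user mode, a SIGSEGV would be sent at this point, which is the familiar user-space segmentation fault. In kernel mode, execution continues into no_context(). This brings in another concept, the exception table: page faults are allowed in kernel mode, but only through designated entry points such as copy_to_user(). If the faulting instruction is listed in this table, the fault can be fixed up; if not, a panic is basically what comes next.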
/* == no_context()-->show_fault_oops() == */
[ 29.265559] BUG: unable to handle kernel NULL pointer dereference at 000000000000000c
[ 29.266342] IP: [<ffffffffa0002001>] ia_init+0x1/0x13 [ia]
/* == no_context()-->show_fault_oops()-->dump_pagetable( ) == */
[ 29.266342] PGD 9efcd9067 PUD 9ef4c4067 PMD 0
/* == no_context()-->__die() == */
[ 29.266342] Oops: 0002 [#1] SMP
[ 29.266342] last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/irq
[ 29.266342] CPU 1
/* == no_context()-->__die()-->show_registers()-->print_modules() == */
[ 29.266342] Modules linked in: ia(P+)
[ 29.266342]
[ 29.266342] Pid: 701, comm: insmod Tainted: P --------------- 2.6.32 #249 QEMU Standard PC (i440FX + PIIX, 1996)
/* == no_context()-->__die()-->show_registers()-->__show_regs() == */
[ 29.266342] RIP: 0010:[<ffffffffa0002001>] [<ffffffffa0002001>] ia_init+0x1/0x13 [ia]
[ 29.266342] RSP: 0018:ffff8809efc29f18 EFLAGS: 00010246
[ 29.266342] RAX: ffff8809efc29fd8 RBX: 0000000001830010 RCX: 0000000000000000
[ 29.266342] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
[ 29.266342] RBP: ffff8809efc29f48 R08: 0000000000000000 R09: 0000000000000000
[ 29.266342] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa0002000
[ 29.266342] R13: 0000000000000000 R14: 0000000001830030 R15: 0000000000020000
[ 29.266342] FS: 00007f761e255700(0000) GS:ffff880028240000(0000) knlGS:0000000000000000
[ 29.266342] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 29.266342] CR2: 000000000000000c CR3: 00000009ef6d9000 CR4: 00000000000006e0
[ 29.266342] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 29.266342] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 29.266342] Process insmod (pid: 701, threadinfo ffff8809efc28000, task ffff8809eff46aa0)
/* == no_context()-->__die()-->show_registers() == */
[ 29.266342] Stack:
[ 29.266342] ffff8809efc29f48 ffffffff81002046 0000000001830030 0000000001830010
[ 29.266342] <d> 00000000000156b2 ffffffffa0000020 ffff8809efc29f78 ffffffff810b5a9f
[ 29.266342] <d> 00007fffca8a7a2c 0000000001830030 00000000000156b2 0000000000000003
[ 29.266342] Call Trace:
[ 29.266342] [<ffffffff81002046>] ? do_one_initcall+0x36/0x1c0
[ 29.266342] [<ffffffff810b5a9f>] sys_init_module+0xef/0x260
[ 29.266342] [<ffffffff8100cf72>] system_call_fastpath+0x16/0x1b
[ 29.266342] Code: <c7> 04 25 0c 00 00 00 14 00 00 00 31 c0 48 89 e5 c9 c3 00 00 00 00
/* == no_context()-->__die() == */
[ 29.266342] RIP [<ffffffffa0002001>] ia_init+0x1/0x13 [ia]
[ 29.266342] RSP <ffff8809efc29f18>
/* == no_context() == */
[ 29.266342] CR2: 000000000000000c
/* == no_context()->oops_end()-->oops_exit() == */
[ 29.310904] ---[ end trace 6f8574df3551ea11 ]---
/* == no_context()->oops_end()-->panic() == */
[ 29.312127] Kernel panic - not syncing: Fatal exception
/* == no_context()->oops_end()-->panic()-->dump_stack() == */
[ 29.313117] Pid: 701, comm: insmod Tainted: P D --------------- 2.6.32 #249
[ 29.313117] Call Trace:
[ 29.313117] [<ffffffff8106e566>] ? panic+0xd6/0x1d0
[ 29.313117] [<ffffffff81070256>] ? kmsg_dump+0x136/0x180
[ 29.313117] [<ffffffff8106e198>] ? oops_exit+0x28/0x30
[ 29.313117] [<ffffffff815837be>] ? oops_end+0xbe/0x100
[ 29.313117] [<ffffffff81047415>] ? no_context+0x165/0x270
[ 29.313117] [<ffffffff810477ac>] ? __bad_area_nosemaphore+0xec/0x1d0
[ 29.313117] [<ffffffff810478e4>] ? __bad_area+0x54/0x70
[ 29.313117] [<ffffffff81047933>] ? bad_area+0x13/0x20
[ 29.313117] [<ffffffff81585a23>] ? do_page_fault+0x453/0x550
[ 29.313117] [<ffffffff8114e19a>] ? vfree+0x2a/0x30
[ 29.313117] [<ffffffff8103d4ac>] ? pvclock_clocksource_read+0x4c/0xe0
[ 29.313117] [<ffffffffa0000004>] ? ia_exit+0x4/0xc [ia]
[ 29.313117] [<ffffffff8103c56c>] ? kvm_clock_read+0x1c/0x20
[ 29.313117] [<ffffffff81014629>] ? sched_clock+0x9/0x10
[ 29.313117] [<ffffffff810edb07>] ? ring_buffer_time_stamp+0x7/0x10
[ 29.313117] [<ffffffffa0002000>] ? ia_init+0x0/0x13 [ia]
[ 29.313117] [<ffffffff81582ac5>] ? page_fault+0x25/0x30
[ 29.313117] [<ffffffffa0002000>] ? ia_init+0x0/0x13 [ia]
[ 29.313117] [<ffffffffa0002001>] ? ia_init+0x1/0x13 [ia]
[ 29.313117] [<ffffffff81002046>] ? do_one_initcall+0x36/0x1c0
[ 29.313117] [<ffffffff810b5a9f>] ? sys_init_module+0xef/0x260
[ 29.313117] [<ffffffff8100cf72>] ? system_call_fastpath+0x16/0x1b
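The value after "Oops:" in the dump above is the page fault error code in hexadecimal, so it can be decoded by hand. The C sketch below does this for the five low bits defined by the x86 architecture; struct pf_info, decode_pf_error() and the SK_PF_* constants are names chosen for this illustration rather than the kernel's own identifiers.

```c
#include <stdbool.h>

/* Low bits of the x86 page fault error code. */
enum {
    SK_PF_PROT  = 1 << 0,  /* 0: page not present, 1: protection violation */
    SK_PF_WRITE = 1 << 1,  /* 0: read access,      1: write access */
    SK_PF_USER  = 1 << 2,  /* 0: kernel mode,      1: user mode */
    SK_PF_RSVD  = 1 << 3,  /* reserved bit set in a page table entry */
    SK_PF_INSTR = 1 << 4   /* fault on an instruction fetch */
};

struct pf_info {
    bool protection;
    bool write;
    bool user;
    bool reserved_bit;
    bool instr_fetch;
};

/* Split the raw error code into named flags. */
static struct pf_info decode_pf_error(unsigned long error_code)
{
    struct pf_info info;

    info.protection   = (error_code & SK_PF_PROT) != 0;
    info.write        = (error_code & SK_PF_WRITE) != 0;
    info.user         = (error_code & SK_PF_USER) != 0;
    info.reserved_bit = (error_code & SK_PF_RSVD) != 0;
    info.instr_fetch  = (error_code & SK_PF_INSTR) != 0;
    return info;
}
```

Decoding 0x0002 gives a write, from kernel mode, to a page that is not present, which matches the `*(int *)12 = 20` statement in the module.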
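Memory faults have many causes: out-of-bounds accesses, logic errors, uninitialized data, and so on. The usual first step is static analysis of the code; if that does not find the problem, things can get difficult.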
3. invalid opcode
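An invalid opcode means the CPU met an illegal instruction. If you hit this during development, there are two angles to analyze it from:
(1). First suspect that a function pointer was overwritten while being passed around, so that the RIP register points to an unexpected address, as in example 1. If convenient, record the pointer value before every call through the function pointer, for instance with printk. An overwritten function pointer may also directly trigger the memory fault described above.
(2). The other possibility: the call jumps to the correct address, so the value in RIP is fine, but the memory at that address was overwritten, as in example 2 below. How do you tell whether this is the case? This is where the objdump tool comes in; the method is described below.
Example 1, invalid RIP: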
/* file: invalid_op.c */
#include <linux/module.h>
#include <linux/blkdev.h>
#include <linux/kallsyms.h>

unsigned long long data = 0xffffffffffffffffUL;
typedef void (*fn)(void);

static int __init invalid_op_init(void)
{
    fn p = (fn)(&data);

    pr_info("p : 0x%p\n", p);
    /* wrong RIP */
    p();
    return 0;
}

static void invalid_op_exit(void)
{
    /* do nothing */
}

module_init(invalid_op_init);
module_exit(invalid_op_exit);
Example 1 output:
[ 17.730718] p : 0xffffffffa0000030
[ 17.731467] invalid opcode: 0000 [#1] SMP
[ 17.732378] last sysfs file: /sys/devices/pci0000:00/0000:00:03.0/irq
[ 17.733680] CPU 3
[ 17.734080] Modules linked in: ia1(P+)
[ 17.734949]
[ 17.735274] Pid: 693, comm: insmod Tainted: P --------------- 2.6.32 #249 QEMU Standard PC (i440FX + PIIX, 1996)
[ 17.737578] RIP: 0010:[<ffffffffa0000030>] [<ffffffffa0000030>] data+0x0/0xffffffffffffffdc [ia1]
[ 17.739447] RSP: 0018:ffff8809ef63bf10 EFLAGS: 00010296
[ 17.740583] RAX: 000000000000002c RBX: 0000000000b95010 RCX: 0000000000000000
[ 17.742039] RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
[ 17.743464] RBP: ffff8809ef63bf18 R08: 0000000000000000 R09: ffffffff81b907e0
[ 17.744913] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa0002000
[ 17.746332] R13: 0000000000000000 R14: 00007faf99d59010 R15: 0000000000040000
[ 17.747777] FS: 00007faf99d9b700(0000) GS:ffff8800282c0000(0000) knlGS:0000000000000000
[ 17.749378] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 17.750583] CR2: 00007faf99d8d00f CR3: 00000009efc1a000 CR4: 00000000000006e0
[ 17.752058] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 17.753491] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 17.754964] Process insmod (pid: 693, threadinfo ffff8809ef63a000, task ffff8809efe35540)
[ 17.756642] Stack:
[ 17.757078] ffffffffa000201e ffff8809ef63bf48 ffffffff81002046 00007faf99d59010
[ 17.758583] <d> 0000000000b95010 00000000000345c2 ffffffffa0000040 ffff8809ef63bf78
[ 17.760258] <d> ffffffff810b5a9f 00007fffbd542a2a 00007faf99d59010 00000000000345c2
[ 17.762015] Call Trace:
[ 17.762521] [<ffffffffa000201e>] ? ia1_init+0x1e/0x22 [ia1]
[ 17.763687] [<ffffffff81002046>] do_one_initcall+0x36/0x1c0
[ 17.764862] [<ffffffff810b5a9f>] sys_init_module+0xef/0x260
[ 17.765991] [<ffffffff8100cf72>] system_call_fastpath+0x16/0x1b
[ 17.767198] Code: 1f 44 00 00 c9 c3 90 3c 36 3e 70 20 3a 20 30 78 25 70 0a 00 00 00 00 00 00 00 00 05 00 00 a0 ff ff ff ff 06 04 00 00 00 00 00 00 <ff> ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 01 00 00 00 00
[ 17.772684] RIP [<ffffffffa0000030>] data+0x0/0xffffffffffffffdc [ia1]
[ 17.774042] RSP <ffff8809ef63bf10>
[ 17.774736] ---[ end trace 177d1b7d88c28b7f ]---
[ 17.775740] Kernel panic - not syncing: Fatal exception
[ 17.776690] Pid: 693, comm: insmod Tainted: P D --------------- 2.6.32 #249
[ 17.776690] Call Trace:
[ 17.776690] [<ffffffff8106e566>] ? panic+0xd6/0x1d0
[ 17.776690] [<ffffffff81412576>] ? netoops+0x1c6/0x2a0
[ 17.776690] [<ffffffff81070256>] ? kmsg_dump+0x136/0x180
[ 17.776690] [<ffffffff8106e198>] ? oops_exit+0x28/0x30
[ 17.776690] [<ffffffff815837be>] ? oops_end+0xbe/0x100
[ 17.776690] [<ffffffff8101103b>] ? die+0x5b/0x90
[ 17.776690] [<ffffffff81582eb6>] ? do_trap+0x136/0x150
[ 17.776690] [<ffffffff8100eaa5>] ? do_invalid_op+0x95/0xb0
[ 17.776690] [<ffffffff8103c56c>] ? kvm_clock_read+0x1c/0x20
[ 17.776690] [<ffffffffa0002000>] ? ia1_init+0x0/0x22 [ia1]
[ 17.776690] [<ffffffff8106f82c>] ? printk+0x6c/0x70
[ 17.776690] [<ffffffffa0002000>] ? ia1_init+0x0/0x22 [ia1]
[ 17.776690] [<ffffffff8100dd7b>] ? invalid_op+0x1b/0x20
[ 17.776690] [<ffffffffa0002000>] ? ia1_init+0x0/0x22 [ia1]
[ 17.776690] [<ffffffffa000201e>] ? ia1_init+0x1e/0x22 [ia1]
[ 17.776690] [<ffffffff81002046>] ? do_one_initcall+0x36/0x1c0
[ 17.776690] [<ffffffff810b5a9f>] ? sys_init_module+0xef/0x260
[ 17.776690] [<ffffffff8100cf72>] ? system_call_fastpath+0x16/0x1b
Example 2: overwritten code segment
/* file: invalid_op2.c */
#include <linux/module.h>
#include <linux/blkdev.h>
#include <linux/kallsyms.h>

void fn(void)
{
    pr_info("hello world\n");
}

static int __init invalid_op2_init(void)
{
    /* treat the start of fn's machine code as data */
    unsigned long *p = (unsigned long *)fn;

    fn();
    /* -- BUG -- overwrite the first 8 bytes of fn */
    *p = 0x234ff;
    fn();
    return 0;
}

static void invalid_op2_exit(void)
{
    /* do nothing */
}

module_init(invalid_op2_init);
module_exit(invalid_op2_exit);
Example 2 output:
[ 19.462662] Pid: 787, comm: insmod Tainted: P --------------- 2.6.32 #249 QEMU Standard PC (i440FX + PIIX, 1996)
[ 19.462662] RIP: 0010:[<ffffffffa0000010>] [<ffffffffa0000010>] fn+0x0/0x1c [ia1]
[ 19.462662] RSP: 0018:ffff8809f192df10 EFLAGS: 00010296
[ 19.462662] RAX: 0000000000000021 RBX: 0000000002066010 RCX: 0000000000000000
[ 19.462662] RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
[ 19.462662] RBP: ffff8809f192df18 R08: 0000000000000000 R09: ffffffff81b907e0
[ 19.462662] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffa0002000
[ 19.462662] R13: 0000000000000000 R14: 00007f2f3f93e010 R15: 0000000000040000
[ 19.462662] FS: 00007f2f3f980700(0000) GS:ffff8800282c0000(0000) knlGS:0000000000000000
[ 19.462662] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 19.462662] CR2: 0000000000000021 CR3: 00000009f19e5000 CR4: 00000000000006e0
[ 19.462662] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 19.462662] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 19.462662] Process insmod (pid: 787, threadinfo ffff8809f192c000, task ffff8809ef5c6000)
[ 19.462662] Stack:
[ 19.462662] ffffffffa0002019 ffff8809f192df48 ffffffff81002046 00007f2f3f93e010
[ 19.462662] <d> 0000000002066010 00000000000346f8 ffffffffa0000060 ffff8809f192df78
[ 19.462662] <d> ffffffff810b5a9f 00007fff40420a2a 00007f2f3f93e010 00000000000346f8
[ 19.462662] Call Trace:
[ 19.462662] [<ffffffffa0002019>] ? invalid_op2_init+0x19/0x1d [ia1]
[ 19.462662] [<ffffffff81002046>] do_one_initcall+0x36/0x1c0
[ 19.462662] [<ffffffff810b5a9f>] sys_init_module+0xef/0x260
[ 19.462662] [<ffffffff8100cf72>] system_call_fastpath+0x16/0x1b
[ 19.462662] Code: <ff> 34 02 00 00 00 00 00 00 48 c7 c7 2c 00 00 a0 31 c0 e8 99 f7 06
[ 19.462662] RIP [<ffffffffa0000010>] fn+0x0/0x1c [ia1]
[ 19.462662] RSP <ffff8809f192df10>
[ 19.462662] CR2: 0000000000000021
[ 19.519511] ---[ end trace 7737e5eefd06890c ]---
[ 19.520679] Kernel panic - not syncing: Fatal exception
[ 19.521620] Pid: 787, comm: insmod Tainted: P D --------------- 2.6.32 #249
[ 19.521620] Call Trace:
[ 19.521620] [<ffffffff8106e566>] ? panic+0xd6/0x1d0
[ 19.521620] [<ffffffff81412576>] ? netoops+0x1c6/0x2a0
[ 19.521620] [<ffffffff81070256>] ? kmsg_dump+0x136/0x180
[ 19.521620] [<ffffffff8106e198>] ? oops_exit+0x28/0x30
[ 19.521620] [<ffffffff815837be>] ? oops_end+0xbe/0x100
[ 19.521620] [<ffffffff81047415>] ? no_context+0x165/0x270
[ 19.521620] [<ffffffff8137e472>] ? wait_for_xmitr+0x62/0xe0
[ 19.521620] [<ffffffff810477ac>] ? __bad_area_nosemaphore+0xec/0x1d0
[ 19.521620] [<ffffffff810478e4>] ? __bad_area+0x54/0x70
[ 19.521620] [<ffffffff81047933>] ? bad_area+0x13/0x20
[ 19.521620] [<ffffffff81585a23>] ? do_page_fault+0x453/0x550
[ 19.521620] [<ffffffffa0000004>] ? invalid_op2_exit+0x4/0x10 [ia1]
[ 19.521620] [<ffffffffa0002000>] ? invalid_op2_init+0x0/0x1d [ia1]
[ 19.521620] [<ffffffff8106f82c>] ? printk+0x6c/0x70
[ 19.521620] [<ffffffffa0002000>] ? invalid_op2_init+0x0/0x1d [ia1]
[ 19.521620] [<ffffffff81582ac5>] ? page_fault+0x25/0x30
[ 19.521620] [<ffffffffa0002000>] ? invalid_op2_init+0x0/0x1d [ia1]
[ 19.521620] [<ffffffffa0000010>] ? fn+0x0/0x1c [ia1]
[ 19.521620] [<ffffffffa0002019>] ? invalid_op2_init+0x19/0x1d [ia1]
[ 19.521620] [<ffffffff81002046>] ? do_one_initcall+0x36/0x1c0
[ 19.521620] [<ffffffff810b5a9f>] ? sys_init_module+0xef/0x260
[ 19.521620] [<ffffffff8100cf72>] ? system_call_fastpath+0x16/0x1b
Note the Code field in the dump above:
Code: <ff> 34 02 00 00 00 00 00 00 48 c7 c7 2c 00 00 a0 31 c0 e8 99 f7 06
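The source overwrites the instructions at fn with 0x234ff; because of little-endian byte order, that value appears in memory as the bytes shown above. Assuming the RIP value is correct, use objdump to see what the real instructions look like: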
$ objdump -d invalid_op2.ko
0000000000000010 <fn>:
  10: 55                      push   %rbp
  11: 48 89 e5                mov    %rsp,%rbp
  14: e8 00 00 00 00          callq  19 <fn+0x9>
  19: 48 c7 c7 00 00 00 00    mov    $0x0,%rdi
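Clearly, the "accidental" write turned "55 48 89 e5 e8 00 00 00" into "ff 34 02 00 00 00 00 00". Once you get this far, the problem is basically half solved. In this particular run the new bytes appear to decode as a valid instruction that reads memory at RDX+RAX, which would explain why the trace above goes through do_page_fault() with CR2 = 0x21 rather than do_invalid_op().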
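The byte-level comparison is easy to reproduce in user space. The following C sketch serializes a 64-bit value in little-endian order and finds the first byte where two buffers differ; store_le64() and first_diff() are helper names made up for this example, and the byte arrays a caller passes in would be copied by hand from the Code line and from the objdump listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write a 64-bit value least-significant byte first, the way an x86-64
 * store lays it out in memory. */
static void store_le64(uint64_t value, uint8_t out[8])
{
    for (size_t i = 0; i < 8; i++)
        out[i] = (uint8_t)(value >> (8 * i));
}

/* Return the index of the first differing byte, or n if both buffers match. */
static size_t first_diff(const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return i;
    return n;
}
```

Storing 0x234ff with store_le64() yields ff 34 02 00 00 00 00 00, the start of the Code line, and comparing that against the objdump bytes 55 48 89 e5 e8 00 00 00 reports a difference at index 0, i.e. the very first byte of fn was clobbered.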
4. To be continued…