Redhat Crash Utility-Ramdump
edit by liaoye@2014/9/16
http://blog.csdn.net/paul_liao
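The crash utility is an open-source ramdump analysis tool from Red Hat. The official site is http://people.redhat.com/anderson/ , where the source can be downloaded and built. Ramdumps from Spreadtrum, Marvell and MTK platforms can be parsed with the crash utility; Qualcomm has its own tools, or trace32 can be used.
Building the crash utility
1. Install the required packages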
sudo apt-get install libaio-dev libncurses5-dev zlib1g-dev liblzma-dev flex bison byacc
2. Unpack the tarball and build
tar zxvf crash-7.0.8.tar.gz
cd crash-7.0.8
make target=ARM
For a 64-bit target:
make target=ARM64
3. Build the extension modules
make extensions target=ARM64
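Capturing a ramdump on Spreadtrum
When the kernel panics, the system automatically saves the ramdump to the sysdump folder in the log directory on the T-card (SD card), as two files:
The crash utility needs a single dump file, so concatenate the pieces first: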
cat sysdump.core.0* > dump.bin
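Where cat is not available (for example on a Windows host), the same concatenation can be sketched in Python. This is a minimal sketch, not part of the original workflow: the function name concat_parts is hypothetical, and it assumes that sorting the part names alphabetically gives the same order as the shell glob above.

```python
import glob
import shutil


def concat_parts(pattern, out_path):
    """Concatenate all files matching pattern, in sorted name order, into out_path."""
    # Assumption: alphabetical order of the part names is the right dump order,
    # which is what the shell glob in `cat sysdump.core.0* > dump.bin` relies on.
    parts = sorted(glob.glob(pattern))
    if not parts:
        raise FileNotFoundError(f"no files match {pattern!r}")
    with open(out_path, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream the copy so a multi-hundred-MB dump is never held in memory.
                shutil.copyfileobj(src, out)
    return parts
```

A call such as concat_parts("sysdump.core.0*", "dump.bin") would then produce the file that is passed to crash.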
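Capturing a ramdump on Marvell
When the kernel panics, the system automatically enters EMMD dump mode. If an SD card is detected, the screen shows "EMMD SD DUMP", the system saves the whole memory to the SD card and then powers off, and RAMDUMP0000.gz can be collected from the card. Otherwise the screen shows "EMMD USB DUMP", and the memory is dumped over USB to a PC with the fastboot tool.
Linux: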
# fastboot-linux-marvell dump dump.bin
Windows:
D:\fastboot_windows>fastboot-windows-marvell.exe dump dump.bin
Capturing a ramdump on MTK
a. Enable the ramdump mechanism
Add the following code:
diff --git a/alps/kernel-3.10/drivers/misc/mediatek/aee/mrdump/mrdump_full.c b/alps/kernel-3.10/drivers/misc/mediatek/aee/mrdump/mrdump_full.c
index 8b2b93a..2ec509f 100644
--- a/alps/kernel-3.10/drivers/misc/mediatek/aee/mrdump/mrdump_full.c
+++ b/alps/kernel-3.10/drivers/misc/mediatek/aee/mrdump/mrdump_full.c
@@ -457,6 +457,17 @@ static int __init mrdump_init(void)
     }
     atomic_notifier_chain_register(&panic_notifier_list, &mrdump_panic_blk);
+    //add this block
+
+    {
+        mrdump_enable = 1;
+
+        mrdump_plat->hw_enable(mrdump_enable);
+
+        mrdump_cb->machdesc.nr_cpus = NR_CPUS;
+
+        __inner_flush_dcache_all();
+    }
     return 0;
 }
Enable the following config options:
+CONFIG_MTK_AEE_POWERKEY_HANG_DETECT=y
+CONFIG_MTK_AEE_MRDUMP=y
+CONFIG_MTK_MRDUMP=y
+CONFIG_MTK_DBG_DUMP=y
Also disable CONFIG_MTK_AEE_IPANIC. When it is enabled, sys_mini_dump is generated and sys_core_dump is not.
Run cat /sys/module/mrdump/parameters/enable to confirm that the setting took effect.
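The option list above can be checked against a kernel .config before building. The following is a minimal sketch; the helper names parse_config and check_mrdump_config are hypothetical, and it assumes the usual "CONFIG_X=y" / "# CONFIG_X is not set" file format.

```python
# Options taken from the list above.
REQUIRED_ON = [
    "CONFIG_MTK_AEE_POWERKEY_HANG_DETECT",
    "CONFIG_MTK_AEE_MRDUMP",
    "CONFIG_MTK_MRDUMP",
    "CONFIG_MTK_DBG_DUMP",
]
REQUIRED_OFF = ["CONFIG_MTK_AEE_IPANIC"]


def parse_config(text):
    """Return {option: value} for each 'CONFIG_X=value' line; comment lines are skipped."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def check_mrdump_config(text):
    """List every option whose state differs from what the steps above ask for."""
    options = parse_config(text)
    problems = [f"{name} should be =y" for name in REQUIRED_ON
                if options.get(name) != "y"]
    problems += [f"{name} should be disabled" for name in REQUIRED_OFF
                 if options.get(name) in ("y", "m")]
    return problems
```

An empty list from check_mrdump_config(open(".config").read()) means the file agrees with the settings listed above; it says nothing about whether the running kernel was built from that file, which is what the sysfs check confirms.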
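b. Capture the ramdump
After a kernel panic or oops, the device reboots into lkramdump mode and dumps RAM to /data/No_Delete.rdmp, which is then collected into the mtklog/aee_exp/db* files. Export them with the GAT tool and extract SYS_COREDUMP.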
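Capturing a ramdump on Qualcomm
After a kernel panic or oops, the device reboots into ramdump mode, and the ramdump is then exported with the QPST tool. Qualcomm provides the linux ramdump parser and crashscope for basic analysis; more complex analysis requires trace32.
Using the crash utility
The official documentation at http://people.redhat.com/anderson/crash_whitepaper describes usage in detail. Some common operations follow.
1. Start the crash command line: ./crash-arm vmlinux dump.bin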
paul@paul-VirtualBox:~$ ./crash-arm vmlinux dump.bin
crash-arm 7.0.5
Copyright (C) 2002-2014 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=i686-pc-linux-gnu --target=arm-elf-linux"...
KERNEL: vmlinux
DUMPFILE: dump.bin
CPUS: 1
DATE: Wed Jan 1 10:26:26 2014
UPTIME: 00:34:14
LOAD AVERAGE: 3.61, 3.59, 3.16
TASKS: 650
NODENAME: localhost
RELEASE: 3.10.33
VERSION: #4 SMP PREEMPT Wed Sep 10 14:44:32 CST 2014
MACHINE: armv7l (unknown Mhz)
MEMORY: 512 MB
PANIC: "c0 4233 (sh) Internal error: Oops: 805 [#1] PREEMPT SMP ARM" (check log for details)
PID: 4233
COMMAND: "sh"
TASK: d37f7b40 [THREAD_INFO: cf512000]
CPU: 0
STATE: TASK_RUNNING (PANIC)
crash-arm>
crash-arm is the crash binary built earlier, and dump.bin is the captured ramdump. vmlinux and dump.bin must come from the same kernel build; otherwise the dump cannot be parsed.
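A quick sanity check before starting crash is to compare the "Linux version ..." banner string embedded in both files. The sketch below is only a heuristic under stated assumptions: both files are uncompressed, the banner is stored as plain ASCII, and the helper names find_banners and banners_match are hypothetical. A shared banner suggests the same build but does not prove it.

```python
import mmap
import re

# Printable ASCII following "Linux version "; the banner ends at the first newline.
BANNER = re.compile(rb"Linux version [ -~]{1,400}")


def find_banners(path):
    """Return the set of 'Linux version ...' strings found in a binary file."""
    with open(path, "rb") as f:
        # mmap lets the regex scan a large dump without reading it all into memory.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mem:
            return {m.group().decode("ascii") for m in BANNER.finditer(mem)}


def banners_match(vmlinux_path, dump_path):
    """True if at least one identical banner string appears in both files."""
    return bool(find_banners(vmlinux_path) & find_banners(dump_path))
```

If banners_match("vmlinux", "dump.bin") is false, it is worth re-checking which build the vmlinux came from before digging into crash errors.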
2. Run the log command at the prompt to get the kmsg
crash-arm> log
or
crash-arm> log > kmsg
3. bt prints the call stack, which can be used to reconstruct the state at the time of the crash and track down the problem.
crash-arm> bt
PID: 37 TASK: db34a640 CPU: 0 COMMAND: "kworker/u8:1"
#0 [<c016ad38>] (try_to_suspend) from [<c0143a5c>]
#1 [<c0143a5c>] (process_one_work) from [<c0144138>]
#2 [<c0144138>] (worker_thread) from [<c0149c94>]
#3 [<c0149c94>] (kthread) from [<c010f498>]
crash-arm> bt -f
PID: 37 TASK: db34a640 CPU: 0 COMMAND: "kworker/u8:1"
#0 [<c016ad38>] (try_to_suspend) from [<c0143a5c>]
[PC: c016ad38 LR: c0143a5c SP: db391ee8 SIZE: 16]
db391ee8: 00000838 c0a5f01c db367080 c0143a5c
#1 [<c0143a5c>] (process_one_work) from [<c0144138>]
[PC: c0143a5c LR: c0144138 SP: db391ef8 SIZE: 56]
db391ef8: c2907600 c0a7be74 00000001 00000000
db391f08: 00000000 db367080 db80ec14 db367098
db391f18: db390000 db390000 c0ab39a3 00000001
db391f28: db80ec00 c0144138
#2 [<c0144138>] (worker_thread) from [<c0149c94>]
[PC: c0144138 LR: c0149c94 SP: db391f30 SIZE: 56]
db391f30: c0144000 00000000 00000000 db390000
db391f40: db391f64 db8b3e98 00000000 db367080
db391f50: c0144000 00000000 00000000 00000000
db391f60: 00000000 c0149c94
#3 [<c0149c94>] (kthread) from [<c010f498>]
[PC: c0149c94 LR: c010f498 SP: db391f68 SIZE: 72]
db391f68: 04000000 00000000 00000000 db367080
db391f78: 00000000 00000000 db391f80 db391f80
db391f88: 00000000 00000000 db391f90 db391f90
db391f98: db391fac db8b3e98 c0149bf0 00000000
db391fa8: 00000000 c010f498
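PC: program counter, points to the instruction currently being executed.
LR: link register, holds the return address, that is, the instruction executed after the current function returns.
SP: stack pointer. The Linux stack grows from high addresses towards low addresses.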
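The relationship between these registers can be checked on the bt -f output above. The following sketch copies the four frame headers from that output and verifies two things: each frame's LR equals the PC shown for its caller, and SP plus SIZE of a frame gives the SP of its caller, which is what a downward-growing stack implies. The helper name parse_frames is hypothetical.

```python
import re

# Frame headers copied from the `bt -f` output above.
BT_F_HEADERS = """\
[PC: c016ad38 LR: c0143a5c SP: db391ee8 SIZE: 16]
[PC: c0143a5c LR: c0144138 SP: db391ef8 SIZE: 56]
[PC: c0144138 LR: c0149c94 SP: db391f30 SIZE: 56]
[PC: c0149c94 LR: c010f498 SP: db391f68 SIZE: 72]
"""

FRAME = re.compile(
    r"\[PC: ([0-9a-f]+)\s+LR: ([0-9a-f]+)\s+SP: ([0-9a-f]+)\s+SIZE: (\d+)\]")


def parse_frames(text):
    """Turn bt -f frame headers into dicts of integers, innermost frame first."""
    return [
        {"pc": int(pc, 16), "lr": int(lr, 16), "sp": int(sp, 16), "size": int(size)}
        for pc, lr, sp, size in FRAME.findall(text)
    ]


frames = parse_frames(BT_F_HEADERS)
for callee, caller in zip(frames, frames[1:]):
    # The callee returns to the address shown as the caller's PC.
    returns_to_caller = callee["lr"] == caller["pc"]
    # The caller's frame starts right above the callee's frame (higher address).
    frames_adjacent = callee["sp"] + callee["size"] == caller["sp"]
    print(f"sp={callee['sp']:08x} returns_to_caller={returns_to_caller} "
          f"frames_adjacent={frames_adjacent}")
```

Both checks hold for every pair of frames in this backtrace.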
Now look at what the stack data of frame #1 (process_one_work) above means. First disassemble vmlinux to get:
static void process_one_work(struct worker *worker, struct work_struct *work)
__releases(&pool->lock)
__acquires(&pool->lock)
{
c0143928: e92d4ff0 push {r4, r5, r6, r7, r8, r9, sl, fp, lr}
c014392c: e1a05001 mov r5, r1
c0143930: e5913000 ldr r3, [r1]
c0143934: e24dd014 sub sp, sp, #20
c0143938: e1a04000 mov r4, r0
……
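Reading the frame from its end (the highest address) backwards, the saved registers are lr, fp, sl, r9, r8, r7, r6, r5, r4. The remaining words are data placed on the stack afterwards and can be matched against the disassembly.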
c2907600 c0a7be74 00000001 00000000
00000000 db367080 db80ec14 db367098
db390000 db390000 c0ab39a3 00000001
db80ec00 c0144138
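The mapping can be spelled out word by word. This sketch copies the 14 words of the frame above and splits them using the two instructions from the prologue: push {r4-r9, sl, fp, lr} stores nine registers with the lowest-numbered register at the lowest address, and sub sp, sp, #20 reserves five words below them. It assumes the function does not adjust sp any further, which agrees with the reported frame SIZE of 56 bytes; the function name split_frame is hypothetical.

```python
# The 14 words of frame #1, lowest address (db391ef8) first.
FRAME_WORDS = [
    0xc2907600, 0xc0a7be74, 0x00000001, 0x00000000,
    0x00000000, 0xdb367080, 0xdb80ec14, 0xdb367098,
    0xdb390000, 0xdb390000, 0xc0ab39a3, 0x00000001,
    0xdb80ec00, 0xc0144138,
]
# Register list of the `push` instruction, in ascending address order.
PUSHED = ["r4", "r5", "r6", "r7", "r8", "r9", "sl", "fp", "lr"]
# Bytes reserved by `sub sp, sp, #20`.
LOCAL_BYTES = 20


def split_frame(words, pushed, local_bytes):
    """Split a frame into its local-variable words and its saved registers."""
    n_local = local_bytes // 4
    local_words = words[:n_local]
    saved = dict(zip(pushed, words[n_local:n_local + len(pushed)]))
    return local_words, saved


local_words, saved = split_frame(FRAME_WORDS, PUSHED, LOCAL_BYTES)
for name in reversed(PUSHED):
    print(f"{name:>2} = {saved[name]:08x}")
```

The saved lr comes out as c0144138, the same return address into worker_thread that bt shows for this frame.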
4. The struct command. Using the call stack information above, related data such as struct work_struct can be recovered.
crash-arm> struct work_struct c0a5f02c
struct work_struct {
data = {
counter = 0
},
entry = {
next = 0x0,
prev = 0xc0a5f034 <autosleep_lock+8>
},
func = 0xc0a5f034 <autosleep_lock+8>
}
5. whatis shows a function prototype
crash-arm> whatis try_to_suspend
void try_to_suspend(struct work_struct *);
6. Extract the logcat
Load the external logcat.so extension:
crash-arm> extend logcat.so
crash-arm> logcat
7. help. For more commands, type help or see http://people.redhat.com/anderson/crash_whitepaper
Case study
1. A kernel panic can be provoked by adding a NULL pointer dereference or by running echo c > /proc/sysrq-trigger. I made the following change to the code:
+++ kernel/power/autosleep.c
@@ -26,12 +30,16 @@
 static void try_to_suspend(struct work_struct *work)
 {
     unsigned int initial_count, final_count;
+    int *p = 0;
     if (!pm_get_wakeup_count(&initial_count, true))
         goto out;
     mutex_lock(&autosleep_lock);
+    if (work->func != NULL)
+        *p = 6;
+
     if (!pm_save_wakeup_count(initial_count) ||
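When work->func is not NULL (this is only an experiment; work->func is never NULL here), the code writes through pointer p, which points to address 0, and that triggers the panic.
2. Run the log command. The parsed kmsg pinpoints where the panic occurred: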
PC is at try_to_suspend+0x38/0xe0
pc : [<c016ad38>]
0x38 is the offset of the faulting instruction within try_to_suspend, and 0xe0 is the total length of the function.
[ 82.566833] c0 37 (kworker/u8:1) Unable to handle kernel NULL pointer dereference at virtual address 00000000
[ 82.577697] c0 37 (kworker/u8:1) pgd = c0104000
[ 11.830322] c0 37 (kworker/u8:1) SEH:seh_api_ioctl_handler 6
[ 82.582458] c0 37 (kworker/u8:1) [00000000] *pgd=00000000
[ 82.587860] c0 37 (kworker/u8:1)
[ 82.589965] c0 37 (kworker/u8:1) Internal error: Oops: 805 [#1] PREEMPT SMP ARM
[ 82.597259] c0 37 (kworker/u8:1) Modules linked in: audiostub cidatattydev gs_modem ccinetdev cci_datastub citt y iml_module seh cploaddev msocketk geu galcore(O)
[ 82.610107] c0 37 (kworker/u8:1) CPU: 0 PID: 37 Comm: kworker/u8:1 Tainted: G W O 3.10.33 #51
[ 82.619354] c0 37 (kworker/u8:1) Workqueue: autosleep try_to_suspend
[ 82.623901] c0 37 (kworker/u8:1) task: db34a640 ti: db390000 task.ti: db390000
[ 82.631164] c0 37 (kworker/u8:1) PC is at try_to_suspend+0x38/0xe0
[ 82.637359] c0 37 (kworker/u8:1) LR is at try_to_suspend+0x28/0xe0
[ 82.643585] c0 37 (kworker/u8:1) pc : [<c016ad38>] lr : [<c016ad28>] psr: a00e0013
sp : db391ee8 ip : 00000000 fp : 00000000
[ 82.656921] c0 37 (kworker/u8:1) r10: db2a5400 r9 : 00000000 r8 : db390000
[ 82.664001] c0 37 (kworker/u8:1) r7 : db80ec00 r6 : c0ab3d34 r5 : c0a5f01c r4 : c0a5f01c
[ 82.672393] c0 37 (kworker/u8:1) r3 : 00000000 r2 : 00000006 r1 : 200e0013 r0 : c0a5f02c
[ 82.680755] c0 37 (kworker/u8:1) Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel
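The "symbol+offset/size" notation in the "PC is at" line, together with the raw pc value, is enough to compute where the function starts and ends, which helps when searching the disassembly. A minimal sketch, with hypothetical helper names parse_symbol_offset and function_range:

```python
import re


def parse_symbol_offset(line):
    """Parse 'PC is at name+0xOFF/0xSIZE' into (name, offset, size)."""
    m = re.search(r"PC is at ([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", line)
    if m is None:
        raise ValueError("no 'PC is at' entry found")
    return m.group(1), int(m.group(2), 16), int(m.group(3), 16)


def function_range(pc, offset, size):
    """Return (start, end) of the function that contains pc; end is exclusive."""
    start = pc - offset
    return start, start + size


name, offset, size = parse_symbol_offset("PC is at try_to_suspend+0x38/0xe0")
start, end = function_range(0xc016ad38, offset, size)
print(f"{name}: {start:08x}..{end:08x}")
```

For the log above this places try_to_suspend at c016ad00 to c016ade0, and the LR value c016ad28 falls at offset 0x28 inside the same range, consistent with the "LR is at" line.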
3. Disassemble vmlinux
arm-linux-androideabi-objdump -C -S vmlinux > vmlinux-dump
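Looking up address c016ad38 in the disassembly shows that the panic occurred while executing the instruction below. The kmsg shows r3 : 00000000 and r2 : 00000006, and storing to address 0x0 is illegal.
c016ad38: 15832000 strne r2, [r3]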
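The instruction word 15832000 can also be decoded by hand to confirm what objdump printed. The sketch below handles only the A32 single data transfer encoding with an immediate offset (LDR/STR), which is enough for this one word; it is not a general disassembler, and the function names are hypothetical. It also checks the Z flag in the psr value a00e0013 from the kmsg, since the ne condition only lets the store execute when Z is clear.

```python
CONDITIONS = ["eq", "ne", "cs", "cc", "mi", "pl", "vs", "vc",
              "hi", "ls", "ge", "lt", "gt", "le", "al"]


def decode_single_transfer(word):
    """Decode an A32 LDR/STR with immediate offset; reject anything else."""
    cond = (word >> 28) & 0xF
    is_single_transfer = (word >> 26) & 0b11 == 0b01
    register_offset = (word >> 25) & 1
    if cond == 0xF or not is_single_transfer or register_offset:
        raise ValueError("not a conditional LDR/STR with immediate offset")
    load = (word >> 20) & 1          # L bit: 1 = ldr, 0 = str
    byte = (word >> 22) & 1          # B bit: 1 = byte access
    up = (word >> 23) & 1            # U bit: 1 = add offset, 0 = subtract
    offset = word & 0xFFF
    return {
        "cond": CONDITIONS[cond],
        "op": ("ldr" if load else "str") + ("b" if byte else ""),
        "rd": (word >> 12) & 0xF,    # register being stored or loaded
        "rn": (word >> 16) & 0xF,    # base address register
        "offset": offset if up else -offset,
    }


def ne_condition_holds(psr):
    """The ne condition is true when the Z flag (bit 30) is clear."""
    return (psr >> 30) & 1 == 0


insn = decode_single_transfer(0x15832000)
print(insn, ne_condition_holds(0xa00e0013))
```

The fields decode to a conditional str of r2 to the address in r3 with offset 0, and Z is clear in the saved psr, so the store to address 0 was actually attempted.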
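The condition for executing *p = 6 is work->func != NULL. Register R0 holds the argument of try_to_suspend(), a struct work_struct *. To see why R0 to R3 are used to pass function arguments, look up the APCS (ARM Procedure Call Standard).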
if (work->func != NULL)
*p = 6;
Running struct work_struct c0a5f02c recovers the struct work_struct as it was at the time of the panic, and it clearly shows that work->func is not NULL.
crash-arm> struct work_struct c0a5f02c
struct work_struct {
data = {
counter = 0
},
entry = {
next = 0x0,
prev = 0xc0a5f034 <autosleep_lock+8>
},
func = 0xc0a5f034 <autosleep_lock+8>
}
The above is only a simple example for learning; panics met in real debugging are rarely this simple.
References:
http://blog.csdn.net/keyboardota/article/details/6799054
http://people.redhat.com/anderson/crash_whitepaper