Linux kernel crash analysis example


Issue reported:

With USB connected in Mass Storage mode, copy a file from the external SD card to the clipboard.

Then disconnect USB and try to paste the file from the clipboard onto the internal SD card; the paste fails.

Reconnect USB; the target crashes several minutes later.


Crash Context:

<4>[32284.249267] C0 [      swapper/0] CPU: 0    Tainted: G        W     (3.4.5 #1)
<4>[32284.256499] C0 [      swapper/0] PC is at DWC_WORKQ_SCHEDULE+0xd4/0x108
<4>[32284.263152] C0 [      swapper/0] LR is at DWC_WORKQ_SCHEDULE+0x94/0x108
<4>[32284.269866] C0 [      swapper/0] pc : [<c0332fc4>]    lr : [<c0332f84>]    psr: 800001d3
<4>[32284.269897] C0 [      swapper/0] sp : c0947dc0  ip : dc050000  fp : 00000000
<4>[32284.285064] C0 [      swapper/0] r10: 00000000  r9 : 00000000  r8 : db376e04
<4>[32284.292174] C0 [      swapper/0] r7 : db370000  r6 : c0318050  r5 : db2d6a40  r4 : d972e140
<4>[32284.300597] C0 [      swapper/0] r3 : 00000000  r2 : caa91840  r1 : d972e158  r0 : 0000007f
<4>[32284.309020] C0 [      swapper/0] Flags: Nzcv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment kernel
<4>[32284.318389] C0 [      swapper/0] Control: 10c53c7d  Table: 8291806a  DAC: 00000015
<4>[32284.326018] C0 [      swapper/0]
<4>[32284.326049] C0 [      swapper/0] PC: 0xc0332f44:
<4>[32284.334014] C0 [      swapper/0] 2f44  e595000c ebfffc5c e3a00044 ebffff3d e1a04000 e88000c0 e5805008 e59f007c
<4>[32284.344451] C0 [      swapper/0] 2f64  ebffff3a e59f1078 e59f2078 e584000c e1a03000 e59f0070 e58d4000 ebfffd7b
<4>[32284.354858] C0 [      swapper/0] 2f84  e3a03c05 e5843018 e284301c e584301c e5843020 e2841018 e59f3050 e5843024
<4>[32284.365264] C0 [      swapper/0] 2fa4  e2853010 e5843010 e5952014 e5842014 e5952010 e1520003 05854010 15953014
<4>[32284.375671] C0 [      swapper/0] 2fc4  15834010 e5854014 e5950000 ebf4e615 e28dd018 e8bd40f0 e28dd004 e12fff1e
<4>[32284.386077] C0 [      swapper/0] 2fe4  c0a71d34 c063fef4 c076b31a c07a722f c03326a8 e92d4038 e1a04000 e59f501c
<4>[32284.396484] C0 [      swapper/0] 3004  ea000003 e5953004 e2444001 e59f0010 e12fff33 e3540000 1afffff9 e8bd8038
<4>[32284.406890] C0 [      swapper/0] 3024  c09944e0 066665b0 e92d4008 e59f3008 e5933008 e12fff33 e8bd8008 c09944e0
<4>[32284.417358] C0 [      swapper/0]
<4>[32284.417358] C0 [      swapper/0] LR: 0xc0332f04:
<4>[32284.425353] C0 [      swapper/0] 2f04  e1a06001 e1a07002 e3a01080 e59d202c e59f00c8 e58d300c ebfc2336 e5950008
<4>[32284.435790] C0 [      swapper/0] 2f24  e28d1010 ebfffd28 e5953004 e59d1010 e2833001 e5950008 e5853004 eb0b2c0d
<4>[32284.446197] C0 [      swapper/0] 2f44  e595000c ebfffc5c e3a00044 ebffff3d e1a04000 e88000c0 e5805008 e59f007c
<4>[32284.456634] C0 [      swapper/0] 2f64  ebffff3a e59f1078 e59f2078 e584000c e1a03000 e59f0070 e58d4000 ebfffd7b
<4>[32284.467040] C0 [      swapper/0] 2f84  e3a03c05 e5843018 e284301c e584301c e5843020 e2841018 e59f3050 e5843024
<4>[32284.477477] C0 [      swapper/0] 2fa4  e2853010 e5843010 e5952014 e5842014 e5952010 e1520003 05854010 15953014
<4>[32284.487884] C0 [      swapper/0] 2fc4  15834010 e5854014 e5950000 ebf4e615 e28dd018 e8bd40f0 e28dd004 e12fff1e
<4>[32284.498321] C0 [      swapper/0] 2fe4  c0a71d34 c063fef4 c076b31a c07a722f c03326a8 e92d4038 e1a04000 e59f501c

System Triage Procedure:

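1) Find the faulting function in the crash context: the PC is at DWC_WORKQ_SCHEDULE+0xd4

2) Disassemble the object file that contains the offending function with objdump (the listing below mixes source and assembly, as objdump -S produces)

3) The ARM assembly for the start of the function is listed below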

EXPORT_SYMBOL(DWC_WORKQ_FREE);

void DWC_WORKQ_SCHEDULE(dwc_workq_t *wq, dwc_work_callback_t work_cb,
                        void *data, char *format, ...)
{
    107c:  e52d3004  push  {r3}                  ; (str r3, [sp, #-4]!)
    1080:  e92d40f0  push  {r4, r5, r6, r7, lr}  // 0x107c + 0xd4 offset => 0x1150
    1084:  e24dd018  sub   sp, sp, #24
    1088:  e1a05000  mov   r5, r0
    int64_t flags;
    work_container_t *container;
    static char name[128];
    va_list args;

    va_start(args, format);
    108c:  e28d3030  add   r3, sp, #48           ; 0x30

4) Look at the code at offset 0x1150

#ifdef DEBUG
    DWC_CIRCLEQ_INSERT_TAIL(&wq->entries, container, entry);
    1130:  e2853010  add    r3, r5, #16
    1134:  e5843010  str    r3, [r4, #16]
    1138:  e5952014  ldr    r2, [r5, #20]
    113c:  e5842014  str    r2, [r4, #20]
    1140:  e5952010  ldr    r2, [r5, #16]
    1144:  e1520003  cmp    r2, r3
    1148:  05854010  streq  r4, [r5, #16]
    114c:  15953014  ldrne  r3, [r5, #20]
    1150:  15834010  strne  r4, [r3, #16]    // faulting instruction, runtime PC 0xc0332fc4
    1154:  e5854014  str    r4, [r5, #20]
#endif
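The offset arithmetic behind steps 3 and 4 can be checked with a few lines of C. This is only a sketch: the numbers come from the crash context and the objdump listing, while the function names are made up for illustration and are not part of the kernel or the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the faulting instruction inside the object file: start of the
 * function in the objdump listing plus the "+0x.." part of the
 * "PC is at SYMBOL+0x../0x.." line. */
uint32_t object_offset(uint32_t func_start_in_obj, uint32_t sym_offset)
{
    return func_start_in_obj + sym_offset;
}

/* Runtime address of the first instruction of the function, recovered
 * from the pc value printed in the oops. */
uint32_t runtime_func_start(uint32_t crash_pc, uint32_t sym_offset)
{
    assert(crash_pc >= sym_offset);
    return crash_pc - sym_offset;
}

/* Translate any runtime address inside the function to its offset in the
 * objdump listing. */
uint32_t runtime_to_object(uint32_t addr, uint32_t crash_pc,
                           uint32_t sym_offset, uint32_t func_start_in_obj)
{
    uint32_t base = runtime_func_start(crash_pc, sym_offset);

    assert(addr >= base);
    return func_start_in_obj + (addr - base);
}
```

With the values from this crash, object_offset(0x107c, 0xd4) gives 0x1150, the function starts at runtime address 0xc0332ef0, and the link register value 0xc0332f84 maps to object offset 0x1110.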

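5) Check the ARM instruction against the crash context. The memory dump row that starts at 2fc4 (the PC value 0xc0332fc4) begins with the same word, 15834010, that objdump shows at 0x1150: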

<4>[32284.375671] C0 [      swapper/0] 2fc4  15834010 e5854014 e5950000 ebf4e615 e28dd018 e8bd40f0 e28dd004 e12fff1e

6) Now we can conclude that the NULL pointer dereference happens at offset 0x1150.
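To double-check that the word taken from the dump really is the instruction objdump printed, the encoding can be decoded by hand. The sketch below handles only the A32 LDR/STR form with a 12-bit immediate offset; the struct and function names are hypothetical, and the B and W bits are ignored for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of an ARM (A32) LDR/STR instruction with an immediate offset. */
struct arm_ldr_str {
    unsigned cond;   /* condition code, bits 31..28 (0x1 = NE, 0xe = AL) */
    bool     load;   /* L bit: true = LDR, false = STR */
    bool     add;    /* U bit: true = offset is added to the base */
    bool     pre;    /* P bit: true = offset applied before the access */
    unsigned rn;     /* base register number */
    unsigned rd;     /* register that is loaded or stored */
    unsigned imm12;  /* unsigned 12-bit offset */
};

/* Returns true and fills *out if insn is LDR/STR with an immediate offset. */
bool decode_ldr_str_imm(uint32_t insn, struct arm_ldr_str *out)
{
    assert(out != NULL);

    unsigned cond = (insn >> 28) & 0xfu;
    if (cond == 0xfu)                       /* unconditional space, not handled */
        return false;
    if (((insn >> 25) & 0x7u) != 0x2u)      /* bits 27..25 must be 0b010 */
        return false;

    out->cond  = cond;
    out->pre   = (insn >> 24) & 1u;
    out->add   = (insn >> 23) & 1u;
    out->load  = (insn >> 20) & 1u;
    out->rn    = (insn >> 16) & 0xfu;
    out->rd    = (insn >> 12) & 0xfu;
    out->imm12 = insn & 0xfffu;
    return true;
}
```

For 0x15834010 the decoder reports condition NE, a store, base register r3, source register r4 and offset 16, that is strne r4, [r3, #16], the same instruction as at 0x1150.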

7) Compare with the C source of the DWC_CIRCLEQ_INSERT_TAIL() macro

#define DWC_CIRCLEQ_INSERT_TAIL(head, elm, field) do {        \
    (elm)->field.cqe_next = DWC_CIRCLEQ_END(head);            \
    (elm)->field.cqe_prev = (head)->cqh_last;                 \
    if ((head)->cqh_first == DWC_CIRCLEQ_END(head))           \
        (head)->cqh_first = (elm);                            \
    else                                                      \
        (head)->cqh_last->field.cqe_next = (elm);             \
    (head)->cqh_last = (elm);                                 \
} while (0)

8) Analyze the crash context

r3 : 00000000

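The ldrne at 0x114c loads r3 from [r5, #20], which is (head)->cqh_last, and the strne at 0x1150 stores r4 (elm) to [r3, #16], which is (head)->cqh_last->field.cqe_next. Since r3 is 0, (head)->cqh_last is a NULL pointer and the store faults.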

9) Add protection code that checks (head)->cqh_last for NULL before it is dereferenced.
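One possible shape of such a check is sketched below as a small user-space model. The struct and function names are hypothetical stand-ins that only mirror the cqh_first, cqh_last, cqe_next and cqe_prev fields used by the macro; this is not the driver's actual fix. A guard like this stops the crash, but it does not explain why cqh_last was NULL in the first place, so the root cause still needs to be tracked down separately.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the driver's queue types (hypothetical names). */
struct node;

struct queue_head {
    struct node *cqh_first;
    struct node *cqh_last;
};

struct node {
    struct node *cqe_next;
    struct node *cqe_prev;
};

/* In a circular queue the head itself marks the end of the list. */
#define QUEUE_END(head) ((struct node *)(void *)(head))

/* An empty queue points back at its own head on both sides. */
void queue_init(struct queue_head *head)
{
    assert(head != NULL);
    head->cqh_first = QUEUE_END(head);
    head->cqh_last  = QUEUE_END(head);
}

/* Same logic as the insert-tail macro, but refuses to touch a head whose
 * links are NULL. Returns 0 on success, -1 if the insert was rejected. */
int queue_insert_tail_checked(struct queue_head *head, struct node *elm)
{
    if (head == NULL || elm == NULL)
        return -1;

    /* A NULL link means the head was never initialised or was corrupted;
     * the unguarded macro would dereference it in the else branch. */
    if (head->cqh_first == NULL || head->cqh_last == NULL)
        return -1;

    elm->cqe_next = QUEUE_END(head);
    elm->cqe_prev = head->cqh_last;
    if (head->cqh_first == QUEUE_END(head))
        head->cqh_first = elm;
    else
        head->cqh_last->cqe_next = elm;
    head->cqh_last = elm;
    return 0;
}
```

A head in the state seen in the crash (cqh_first set, cqh_last NULL) makes queue_insert_tail_checked() return -1 instead of writing through the NULL pointer; the caller can then log the condition and skip the insert.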