[知其然不知其所以然-22] What's inside the pm_trace

来源:互联网 发布:莅阳长公主知乎 编辑:程序博客网 时间:2024/06/06 20:03
static int device_resume_noirq(struct device *dev, pm_message_t state, bool async){TRACE_DEVICE(dev);TRACE_RESUME(0);

We've  seen a lot of similar trace code added inside  different suspend/resume phases, this is a track util

to help user find out which driver breaks the resume process. The key method here is to

store the __FUNCTION__ __LINE__ into a stable tsc register, thus the next boot up

we can check the datas to find out what is the last broken code.

For TRACE_RESUME(0), it is defined here:

#define TRACE_RESUME(user)\do {\if (pm_trace_enabled) {\const void *tracedata;\asm volatile(_ASM_MOV " $1f,%0\n"\     ".section .tracedata,\"a\"\n"\     "1:\t.word %c1\n\t"\     _ASM_PTR " %c2\n"\     ".previous"\     :"=r" (tracedata)\     : "i" (__LINE__), "i" (__FILE__));\generate_pm_trace(tracedata, user);\}\} while (0)

so what TRACE_RESUME does is to put the __LINE__ and __FILE__ in a section

named .tracedata, and pass the address of this section .tracedata to generate_pm_trace:

void generate_pm_trace(const void *tracedata, unsigned int user){unsigned short lineno = *(unsigned short *)tracedata;const char *file = *(const char **)(tracedata + 2);unsigned int user_hash_value, file_hash_value;user_hash_value = user % USERHASH;file_hash_value = hash_string(lineno, file, FILEHASH);set_magic_time(user_hash_value, file_hash_value, dev_hash_value);}
At last the lineno and file are written to tsc register.

So apparently we should check what the hash is after we reboot by holding power button.

Since the hash can not be reversed to lineno or file, we should must find a way

to match the lineno and file. So how to achive it? The magic laid on above TRACE_RESUME macro,

the most important point is that:

static int device_resume_noirq(struct device *dev, pm_message_t state, bool async){pm_callback_t callback = NULL;char *info = NULL;int error = 0;TRACE_DEVICE(dev);TRACE_RESUME(0);

The address of lineno and file has already been put into .section .tracedata once the kernel has

been compiled. So the data  is static, while the command of mov is  dynamic, although these values

would not show up in Systemmap.

ffffffff81b427d8 R __tracedata_startffffffff81b42850 R __ksymtab_IO_APIC_get_PCI_irq_vectorffffffff81b42850 R __start___ksymtabffffffff81b42850 R __tracedata_end


lineno and file in the tracedata section already, and only when TRACE_RESUME(0) is actually invoked,

this pair of lineno and file would be wrotten to TSC register by :

_ASM_MOV " $1f,%0
So after reboot, we can compare the TSC hash with current kernel's tracedata section, to find out which

file and which lineno is the last pair:

for (tracedata = &__tracedata_start ; tracedata < &__tracedata_end ;tracedata += 2 + sizeof(unsigned long)) {unsigned short lineno = *(unsigned short *)tracedata;const char *file = *(const char **)(tracedata + 2);unsigned int hash = hash_string(lineno, file, FILEHASH);if (hash != value)continue;pr_info("  hash matches %s:%u\n", file, lineno);match++;}


And for TRACE_DEVICE:

dev_hash_value = hash_string(DEVSEED, dev_name(dev), DEVHASH);
we know besides lineno and file, we also put dev info into the TSC:

set_magic_time(user_hash_value, file_hash_value, dev_hash_value);

So, the final TSC value is combiled with lineno/file/device,

 *  - 0-15: user-settable *  - 0-996: file + line number *  - 0-1008: device

Then in late_resume_init, these three values will be extracted

and finally searched by show_file_hash(file); and show_dev_hash(dev);

We have described show_file_hash so we verify show_dev_hash,

since the device info is not embedded in the tracedata section, we

check them directly by traversing dpm_list:

static int show_dev_hash(unsigned int value){int match = 0;struct list_head *entry;device_pm_lock();entry = dpm_list.prev;while (entry != &dpm_list) {struct device * dev = to_device(entry);unsigned int hash = hash_string(DEVSEED, dev_name(dev), DEVHASH);if (hash == value) {dev_info(dev, "hash matches\n");match++;}entry = entry->prev;}device_pm_unlock();return match;}

And one more think to emphasize, we also put TRACE_RESUME(error);

after each device callback was invoked, so if error is not zero, means there is a error

code, 

Here's an example for pm_trace during hibernation:

[    1.355199] calling  late_resume_init+0x0/0x1a0 @ 1[    1.356024]   Magic number: 0:212:178[    1.356841]   hash matches arch/x86/power/hibernate_64.c:101[    1.357724] acpi device:0e: hash matches[    1.358559]  platform: hash matches

Since 

pr_info("  Magic number: %d:%d:%d\n", user, file, dev);
The user is zero, which means, either 

/sys/bus/acpi/devices/device\:0e/
has hang on hibernate_64.c:101,( Why is this file??? I can not find any TRACE_SUSPEND in this file!)

or it has successfully pass the suspend and set error to zero, because the framework

looks like:

static int __device_suspend_noirq(struct device *dev, pm_message_t state, bool async){pm_callback_t callback = NULL;char *info = NULL;int error = 0;TRACE_DEVICE(dev);TRACE_SUSPEND(0);        dpm_run_callback(callback, dev, state, info);        TRACE_SUSPEND(error);}
So this is actually a problem, we don't know whether we succeed or just hang there.

BTW the user filed could be 0~15, we should assign one special initial value for them,

but what value? That is a problem...







0 0
原创粉丝点击