How to debug a deadlock
Source: Internet. Editor: 程序博客网. Date: 2024/06/08 08:41
When a program is not running normally, you can use pstack to look at its current state. Run it several times and compare the snapshots:
[tangliang]$ pstack 31859
Thread 3 (Thread 0x7f69b59d2700 (LWP 31860)):
#0 0x000000380220e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x0000003802209508 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x00000038022093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x000000000040097e in writeTest(void*) ()
#4 0x00000038022079d1 in start_thread () from /lib64/libpthread.so.0
#5 0x0000003801ae89dd in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7f69b4fd1700 (LWP 31861)):
#0 0x000000380220e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x0000003802209508 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x00000038022093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x0000000000400a2c in readTest(void*) ()
#4 0x00000038022079d1 in start_thread () from /lib64/libpthread.so.0
#5 0x0000003801ae89dd in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7f69b59d4720 (LWP 31859)):
#0 0x000000380220822d in pthread_join () from /lib64/libpthread.so.0
#1 0x0000000000400b13 in main ()
[tangliang]$
[tangliang]$
[tangliang]$ pstack 31859
Thread 3 (Thread 0x7f69b59d2700 (LWP 31860)):
#0 0x000000380220e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x0000003802209508 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x00000038022093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x000000000040097e in writeTest(void*) ()
#4 0x00000038022079d1 in start_thread () from /lib64/libpthread.so.0
#5 0x0000003801ae89dd in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7f69b4fd1700 (LWP 31861)):
#0 0x000000380220e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x0000003802209508 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x00000038022093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x0000000000400a2c in readTest(void*) ()
#4 0x00000038022079d1 in start_thread () from /lib64/libpthread.so.0
#5 0x0000003801ae89dd in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7f69b59d4720 (LWP 31859)):
#0 0x000000380220822d in pthread_join () from /lib64/libpthread.so.0
#1 0x0000000000400b13 in main ()
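In both snapshots, two threads are sitting in pthread_mutex_lock() and their stacks have not changed between runs, so a deadlock is the likely cause. Let's verify this with gdb.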
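The source of dead_lock_two_thread.cc is not shown in this post, so the following is only a sketch of the kind of lock-order inversion that typically produces stacks like these. All names here (mutexA, mutexB, LockOrder, lockInOrder, lockOrderInversionBlocksBothThreads) are hypothetical. To keep the demonstration from hanging, the second acquisition uses pthread_mutex_timedlock with a 200 ms timeout instead of pthread_mutex_lock; with a plain pthread_mutex_lock both threads would block forever.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <time.h>

// Hypothetical reconstruction; not the original dead_lock_two_thread.cc.
static pthread_mutex_t mutexA = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t mutexB = PTHREAD_MUTEX_INITIALIZER;
static pthread_barrier_t bothHoldFirst;    // both threads own their first mutex
static pthread_barrier_t bothTriedSecond;  // both threads finished their second attempt

struct LockOrder {
    pthread_mutex_t* first;
    pthread_mutex_t* second;
    int secondResult;  // return value of the attempt to take `second`
};

static void* lockInOrder(void* arg) {
    LockOrder* order = static_cast<LockOrder*>(arg);
    pthread_mutex_lock(order->first);
    pthread_barrier_wait(&bothHoldFirst);

    // Wait at most 200 ms for the second mutex so the demo terminates.
    timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_nsec += 200L * 1000 * 1000;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    order->secondResult = pthread_mutex_timedlock(order->second, &deadline);

    // Keep holding `first` until the other thread has also given up.
    pthread_barrier_wait(&bothTriedSecond);
    if (order->secondResult == 0) pthread_mutex_unlock(order->second);
    pthread_mutex_unlock(order->first);
    return nullptr;
}

// Returns true when each thread timed out waiting for the mutex the other holds.
bool lockOrderInversionBlocksBothThreads() {
    int rc = pthread_barrier_init(&bothHoldFirst, nullptr, 2);
    assert(rc == 0);
    rc = pthread_barrier_init(&bothTriedSecond, nullptr, 2);
    assert(rc == 0);
    (void)rc;

    LockOrder ab{&mutexA, &mutexB, 0};  // takes A, then wants B
    LockOrder ba{&mutexB, &mutexA, 0};  // takes B, then wants A
    pthread_t t1, t2;
    pthread_create(&t1, nullptr, lockInOrder, &ab);
    pthread_create(&t2, nullptr, lockInOrder, &ba);
    pthread_join(t1, nullptr);
    pthread_join(t2, nullptr);

    pthread_barrier_destroy(&bothHoldFirst);
    pthread_barrier_destroy(&bothTriedSecond);
    return ab.secondResult == ETIMEDOUT && ba.secondResult == ETIMEDOUT;
}
```

Calling lockOrderInversionBlocksBothThreads() from main and building with g++ -pthread should show both attempts timing out. The barriers are only there to make the interleaving deterministic; in a real program the same inversion hangs only when the timing happens to line up.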
Attach gdb to the running process with gdb -p 31859, then run:
(gdb) thread apply all bt
Thread 3 (Thread 0x7f69b59d2700 (LWP 31860)):
#0 0x000000380220e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x0000003802209508 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x00000038022093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x000000000040097e in writeTest (temp=0x0) at dead_lock_two_thread.cc:15
#4 0x00000038022079d1 in start_thread () from /lib64/libpthread.so.0
#5 0x0000003801ae89dd in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7f69b4fd1700 (LWP 31861)):
#0 0x000000380220e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x0000003802209508 in _L_lock_854 () from /lib64/libpthread.so.0
#2 0x00000038022093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x0000000000400a2c in readTest (temp=0x0) at dead_lock_two_thread.cc:35
#4 0x00000038022079d1 in start_thread () from /lib64/libpthread.so.0
#5 0x0000003801ae89dd in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7f69b59d4720 (LWP 31859)):
#0 0x000000380220822d in pthread_join () from /lib64/libpthread.so.0
#1 0x0000000000400b13 in main () at dead_lock_two_thread.cc:60
Switch to thread 3:
(gdb) t 3
[Switching to thread 3 (Thread 0x7f69b59d2700 (LWP 31860))]
#0 0x000000380220e264 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) f 2
#2 0x00000038022093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
(gdb) i r
rax 0xfffffffffffffe00 -512
rbx 0x0 0
rcx 0xffffffffffffffff -1
rdx 0x0 0
rsi 0x80 128
rdi 0x6012c0 6296256
rbp 0x7f69b59d1e90 0x7f69b59d1e90
rsp 0x7f69b59d1e58 0x7f69b59d1e58
r8 0x6012c0 6296256
r9 0x7c74 31860
r10 0x8 8
r11 0x202 514
r12 0x380241c360 240556032864
r13 0x7f69b59d29c0 140091995269568
r14 0x0 0
r15 0x3 3
rip 0x38022093d7 0x38022093d7 <pthread_mutex_lock+55>
eflags 0x202 [ IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb) p *(pthread_mutex_t *)0x6012c0
$1 = {__data = {__lock = 2, __count = 0, __owner = 31861, __nusers = 1, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\002\000\000\000\000\000\000\000u|\000\000\001", '\000' <repeats 26 times>, __align = 2}
(gdb) set print pretty on
(gdb) p *(pthread_mutex_t *)0x6012c0
$2 = {
  __data = {
    __lock = 2,
    __count = 0,
    __owner = 31861,
    __nusers = 1,
    __kind = 0,
    __spins = 0,
    __list = {
      __prev = 0x0,
      __next = 0x0
    }
  },
  __size = "\002\000\000\000\000\000\000\000u|\000\000\001", '\000' <repeats 26 times>,
  __align = 2
}
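The value being inspected here is the argument passed to pthread_mutex_lock. The program runs on Linux on an x86_64 machine, where the first function argument is passed in the rdi register; that is why the session above casts the value in rdi (0x6012c0) to a pthread_mutex_t pointer. rdi is not preserved across calls, so it is worth checking that the value decodes to a plausible mutex, as it does here.
Note that where function arguments are placed differs between operating systems and calling conventions. The figure that summarized these differences is not reproduced here; the list of x86 calling conventions in the references covers them.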
The __owner field shows that the mutex thread 3 (LWP 31860) is waiting on is owned by thread 31861.
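The reason __owner can be compared directly with the LWP numbers printed by pstack and gdb is that glibc records the kernel thread id of the locking thread in that field. The sketch below checks this in-process. It assumes Linux with glibc and a default (non-elided) mutex; __data.__owner is a glibc-internal field rather than part of POSIX, and the function names currentLwp and ownerRecordedWhileLocked are made up for this example.

```cpp
#include <pthread.h>
#include <sys/syscall.h>
#include <unistd.h>

// Kernel thread id of the calling thread: the number shown as "LWP" by pstack and gdb.
long currentLwp() {
    return syscall(SYS_gettid);
}

// Locks a default mutex and returns the owner id glibc stored in it while it was held.
// Relies on the glibc-specific layout of pthread_mutex_t (assumption: glibc on Linux).
long ownerRecordedWhileLocked() {
    pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
    pthread_mutex_lock(&m);
    long owner = m.__data.__owner;
    pthread_mutex_unlock(&m);
    return owner;
}
```

Under those assumptions, ownerRecordedWhileLocked() returns the same value as currentLwp() when both are called from the same thread.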
Switch to thread 2 (LWP 31861):
(gdb) t 2
[Switching to thread 2 (Thread 0x7f69b4fd1700 (LWP 31861))]
#0 0x000000380220e264 in __lll_lock_wait () from /lib64/libpthread.so.0
(gdb) f 2
#2 0x00000038022093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
(gdb) i r
rax 0xfffffffffffffe00 -512
rbx 0x0 0
rcx 0xffffffffffffffff -1
rdx 0x0 0
rsi 0x80 128
rdi 0x601300 6296320
rbp 0x7f69b4fd0e90 0x7f69b4fd0e90
rsp 0x7f69b4fd0e58 0x7f69b4fd0e58
r8 0x601300 6296320
r9 0x7c75 31861
r10 0x8 8
r11 0x202 514
r12 0x380241c360 240556032864
r13 0x7f69b4fd19c0 140091984779712
r14 0x0 0
r15 0x3 3
rip 0x38022093d7 0x38022093d7 <pthread_mutex_lock+55>
eflags 0x202 [ IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
(gdb) p *(pthread_mutex_t *)0x601300
$3 = {
  __data = {
    __lock = 2,
    __count = 0,
    __owner = 31860,
    __nusers = 1,
    __kind = 0,
    __spins = 0,
    __list = {
      __prev = 0x0,
      __next = 0x0
    }
  },
  __size = "\002\000\000\000\000\000\000\000t|\000\000\001", '\000' <repeats 26 times>,
  __align = 2
}
The mutex thread 2 (LWP 31861) is waiting on, at 0x601300, is owned by thread 31860.
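Thread 31860 is waiting for a mutex held by 31861, and thread 31861 is waiting for a mutex held by 31860. Neither can proceed, so the process is deadlocked.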
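The conclusion is a cycle check on a "waits-for" relation: each blocked thread points at the owner of the mutex it is waiting on, and a deadlock exists when following those links loops back. As an illustration only, the check can be sketched as below; the type WaitFor and the function waitsInCycle are names invented for this example, and the map has to be filled in by hand from the __owner values read in gdb.

```cpp
#include <map>
#include <set>

// Maps a blocked thread (LWP) to the LWP that owns the mutex it is waiting on.
using WaitFor = std::map<int, int>;

// Follows the owner links starting at `start`. Returns true if the chain
// reaches a thread it has already visited, meaning `start` is stuck in or
// behind a cycle and can never acquire its mutex. Returns false if the chain
// ends at a thread that is not waiting on anything.
bool waitsInCycle(const WaitFor& waitFor, int start) {
    std::set<int> seen;
    int current = start;
    while (true) {
        if (!seen.insert(current).second) {
            return true;  // visited before: the chain loops
        }
        auto next = waitFor.find(current);
        if (next == waitFor.end()) {
            return false;  // current thread is runnable, so the chain ends
        }
        current = next->second;
    }
}
```

With the values from this session, WaitFor{{31860, 31861}, {31861, 31860}}, the function returns true for both threads. If only 31860 were waiting on 31861, it would return false.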
References:
debugging hacks -- 深入调试的技术和工具 (in-depth debugging techniques and tools)
https://en.wikipedia.org/wiki/X86_calling_conventions#List_of_x86_calling_conventions