Using WinDbg to Investigate Thread Deadlocks

Source: Internet  Published: Linux kernel architecture  Editor: 程序博客网  Date: 2024/05/16 05:57
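Our project was close to wrapping up when a serious problem surfaced during testing: every now and then the UI froze completely, which blocked all further testing. My first suspicion was a deadlock somewhere. I was not very familiar with WinDbg at the time; I only knew that it could help track down memory leaks, thread deadlocks and similar problems, so I asked my team to investigate with it. Fortunately there is plenty of material online, and after some research they found a WinDbg procedure for locating deadlocks. Still, as the project's development manager I felt I ought to be able to help when a colleague gets stuck on a deadlock, so I sat down and learned the procedure myself. My notes follow: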

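A deadlock arises when all of the following hold:

(1) There are at least two locks. Take the simplest case of exactly two, lock A and lock B;

(2) Thread 1 already holds lock A and then tries to acquire lock B, while thread 2 already holds lock B and then tries to acquire lock A;

(3) Neither thread releases the lock it already holds when it fails to get the other one.

And there you have it: a deadlock.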

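A minimal way to reproduce it:

(1) Initialize two locks, A and B, and start two threads (the main thread plus one worker thread will do);

(2) The first thread takes lock A and the second thread takes lock B; then the first thread tries to take lock B and the second thread tries to take lock A.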
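The two steps above can be sketched in portable C++. This is not the code of the demo discussed in this post, which (judging by the stack traces) uses Win32 critical sections inside an MFC dialog. It is a minimal sketch that swaps in std::timed_mutex and gives up after a timeout so that it terminates instead of hanging. All names (lock_a, lock_b, hold_then_try, both_threads_blocked) are hypothetical.

```cpp
#include <atomic>
#include <chrono>
#include <mutex>
#include <thread>

// Hypothetical stand-ins for lock A and lock B.
static std::timed_mutex lock_a;
static std::timed_mutex lock_b;

// Takes `first`, waits until the other thread also holds its first lock,
// then tries `second` with a timeout. Returns true if `second` was obtained.
static bool hold_then_try(std::timed_mutex& first, std::timed_mutex& second,
                          std::atomic<int>& holding, std::atomic<int>& done) {
    std::lock_guard<std::timed_mutex> keep_first(first);
    holding.fetch_add(1);
    while (holding.load() < 2) std::this_thread::yield();  // both threads now hold one lock

    // With a plain blocking lock() this call would never return: that is the deadlock.
    const bool got_second = second.try_lock_for(std::chrono::milliseconds(200));
    if (got_second) second.unlock();

    // Keep holding `first` until both attempts are over, so neither thread
    // can succeed merely because the other one gave up first.
    done.fetch_add(1);
    while (done.load() < 2) std::this_thread::yield();
    return got_second;
}

// Returns true when both threads failed to get their second lock,
// i.e. the lock-order inversion described above was reproduced.
bool both_threads_blocked() {
    std::atomic<int> holding{0};
    std::atomic<int> done{0};
    bool first_got = true;
    bool second_got = true;
    std::thread t1([&] { first_got = hold_then_try(lock_a, lock_b, holding, done); });
    std::thread t2([&] { second_got = hold_then_try(lock_b, lock_a, holding, done); });
    t1.join();
    t2.join();
    return !first_got && !second_got;
}
```

Calling both_threads_blocked() from main should report that neither thread obtained its second lock. Replacing try_lock_for with an unconditional lock() gives the hang that the rest of this post diagnoses.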

Based on this idea I wrote a demo (LockDemon), got it to deadlock, and then used WinDbg to track the deadlock down. The procedure is as follows:

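(1) Enable the user-mode stack trace flag for the application. I did not enable it at first, and WinDbg failed to attach to the process;

          Use gflags.exe from the WinDbg directory: open a command prompt, change to the WinDbg directory and run: gflags.exe /i <full path of the exe to debug> +ust

          then press Enter. ust stands for user stack trace, the "Create user mode stack trace database" flag. (The GFlags documentation describes the /i argument as an image file name such as LockDemon.exe, so try the bare file name if the full path is not accepted.)

(2) Open WinDbg and choose File --> Attach to a Process to attach to the process you want to debug;

          To view the stacks of all threads, type this in the command window: ~*kv

          This prints the call stack of every thread. To see the stack of a single thread, type ~1kv, which prints the stack of thread 1. If a thread is blocked on a lock, a call to ntdll!RtlpWaitOnCriticalSection normally appears near the top of its stack, so finding every thread with that call gives the candidate deadlocked threads.

The stacks of all threads in the demo look like this: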

   0  Id: 11264.10f20 Suspend: 1 Teb: 7efdd000 Unfrozen
ChildEBP RetAddr  Args to Child
003eecd8 77709e2e 00000124 00000000 00000000 ntdll!ZwWaitForSingleObject+0x15 (FPO: [3,0,0])
003eed3c 77709d12 00000000 00000000 00000001 ntdll!RtlpWaitOnCriticalSection+0x13e (FPO: [Non-Fpo])
003eed64 0113e289 0161cd80 c3c8e76a 00000001 ntdll!RtlEnterCriticalSection+0x150 (FPO: [Non-Fpo])
003eee7c 011548ff 003ef808 003eeeb4 757d62fa LockDemon!CLockDemonDlg::OnInitDialog+0x179 (FPO: [Non-Fpo]) (CONV: thiscall) [e:\work\c++\test\lockdemon\lockdemon\lockdemondlg.cpp @ 122]
003eee88 757d62fa 002d1984 00000110 001d1b26 LockDemon!AfxDlgProc+0x3f (CONV: stdcall) [f:\dd\vctools\vc7libs\ship\atlmfc\src\mfc\dlgcore.cpp @ 28]
003eeeb4 757ff9df 01119035 002d1984 00000110 USER32!InternalCallWinProc+0x23
003eef30 757ff784 00000000 01119035 002d1984 USER32!UserCallDlgProcCheckWow+0xd7 (FPO: [Non-Fpo])
003eef80 757ff889 0409a140 00000000 00000110 USER32!DefDlgProcWorker+0xb7 (FPO: [Non-Fpo])
003eefa0 757d62fa 002d1984 00000110 001d1b26 USER32!DefDlgProcW+0x29 (FPO: [Non-Fpo])
003eefcc 757d6d3a 77734308 002d1984 00000110 USER32!InternalCallWinProc+0x23
003ef044 757e0d27 00000000 77734308 002d1984 USER32!UserCallWinProcCheckWow+0x109 (FPO: [Non-Fpo])
003ef07c 757e0d4d 77734308 002d1984 00000110 USER32!CallWindowProcAorW+0xab (FPO: [Non-Fpo])
003ef09c 0115e4d4 77734308 002d1984 00000110 USER32!CallWindowProcW+0x1b (FPO: [Non-Fpo])
003ef0c0 0115c5fd 00000110 001d1b26 00000000 LockDemon!CWnd::DefWindowProcW+0x34 (CONV: thiscall) [f:\dd\vctools\vc7libs\ship\atlmfc\src\mfc\wincore.cpp @ 1043]
003ef0dc 01156235 003ef808 003ef0f8 012282d3 LockDemon!CWnd::Default+0x3d (CONV: thiscall) [f:\dd\vctools\vc7libs\ship\atlmfc\src\mfc\wincore.cpp @ 274]
003ef100 011603e5 001d1b26 00000000 c3c8fb46 LockDemon!CDialog::HandleInitDialog+0xd5 (CONV: thiscall) [f:\dd\vctools\vc7libs\ship\atlmfc\src\mfc\dlgcore.cpp @ 673]
003ef250 0115fb62 00000110 001d1b26 00000000 LockDemon!CWnd::OnWndMsg+0x835 (CONV: thiscall) [f:\dd\vctools\vc7libs\ship\atlmfc\src\mfc\wincore.cpp @ 2018]
003ef270 0115c400 00000110 001d1b26 00000000 LockDemon!CWnd::WindowProc+0x32 (CONV: thiscall) [f:\dd\vctools\vc7libs\ship\atlmfc\src\mfc\wincore.cpp @ 1755]
003ef2ec 0115cb16 003ef808 002d1984 00000110 LockDemon!AfxCallWndProc+0xf0 (CONV: stdcall) [f:\dd\vctools\vc7libs\ship\atlmfc\src\mfc\wincore.cpp @ 240]
003ef30c 757d62fa 002d1984 00000110 001d1b26 LockDemon!AfxWndProc+0xa6 (CONV: stdcall) [f:\dd\vctools\vc7libs\ship\atlmfc\src\mfc\wincore.cpp @ 403]
003ef338 757d6d3a 01127310 002d1984 00000110 USER32!InternalCallWinProc+0x23
003ef3b0 757d6de8 00000000 01127310 002d1984 USER32!UserCallWinProcCheckWow+0x109 (FPO: [Non-Fpo])
003ef40c 757d6e44 0409a140 00000000 00000110 USER32!DispatchClientMessage+0xe0 (FPO: [Non-Fpo])
003ef448 776e010a 003ef460 00000000 003ef694 USER32!__fnDWORD+0x2b (FPO: [Non-Fpo])
003ef45c 0409a140 00000000 00000110 001d1b26 ntdll!KiUserCallbackDispatcher+0x2e (FPO: [0,0,0])
WARNING: Frame IP not in any known module. Following frames may be wrong.
003ef4c0 7580206f 0409a140 00000000 01127310 0x409a140
003ef594 758010d3 00f00000 00000006 000000a4 USER32!InternalCreateDialog+0xb9f (FPO: [Non-Fpo])
003ef5b8 757ec659 00f00000 0163fdc8 00000000 USER32!CreateDialogIndirectParamAorW+0x33 (FPO: [Non-Fpo])
003ef5d8 01155513 00f00000 0163fdc8 00000000 USER32!CreateDialogIndirectParamW+0x1b (FPO: [Non-Fpo])
003ef6a0 01155e39 0163fdc8 00000000 00f00000 LockDemon!CWnd::CreateDlgIndirect+0x263 (CONV: thiscall) [f:\dd\vctools\vc7libs\ship\atlmfc\src\mfc\dlgcore.cpp @ 312]
003ef714 0113da0d c3c8f1a6 00000000 00000000 LockDemon!CDialog::DoModal+0x199 (CONV: thiscall) [f:\dd\vctools\vc7libs\ship\atlmfc\src\mfc\dlgcore.cpp @ 576]
003ef8b0 0153da94 757d9ee1 003ef8c0 00280026 LockDemon!CLockDemonApp::InitInstance+0xad (CONV: thiscall) [e:\work\c++\test\lockdemon\lockdemon\lockdemon.cpp @ 64]
003ef8d4 0153d98a 00f00000 00000000 006710b4 LockDemon!AfxWinMain+0x84 (CONV: stdcall) [f:\dd\vctools\vc7libs\ship\atlmfc\src\mfc\winmain.cpp @ 37]
003ef8ec 014c1015 00f00000 00000000 006710b4 LockDemon!wWinMain+0x1a (CONV: stdcall) [f:\dd\vctools\vc7libs\ship\atlmfc\src\mfc\appmodul.cpp @ 34]
003ef990 014c0e9f 003ef9a4 76e0336a 7efde000 LockDemon!__tmainCRTStartup+0x165 (CONV: cdecl) [f:\dd\vctools\crt_bld\self_x86\crt\src\crt0.c @ 263]
003ef998 76e0336a 7efde000 003ef9e4 777092b2 LockDemon!wWinMainCRTStartup+0xf (CONV: cdecl) [f:\dd\vctools\crt_bld\self_x86\crt\src\crt0.c @ 182]
003ef9a4 777092b2 7efde000 61654582 00000000 kernel32!BaseThreadInitThunk+0xe (FPO: [Non-Fpo])
003ef9e4 77709285 0111a958 7efde000 00000000 ntdll!__RtlUserThreadStart+0x70 (FPO: [Non-Fpo])
003ef9fc 00000000 0111a958 7efde000 00000000 ntdll!_RtlUserThreadStart+0x1b (FPO: [Non-Fpo])

   1  Id: 11264.11284 Suspend: 1 Teb: 7efda000 Unfrozen
ChildEBP RetAddr  Args to Child
0631fe7c 759714ab 00000104 00000000 00000000 ntdll!ZwWaitForSingleObject+0x15 (FPO: [3,0,0])
0631fee8 76e01194 00000104 ffffffff 00000000 KERNELBASE!WaitForSingleObjectEx+0x98 (FPO: [Non-Fpo])
0631ff00 76e01148 00000104 ffffffff 00000000 kernel32!WaitForSingleObjectExImplementation+0x75 (FPO: [Non-Fpo])
0631ff14 7235107b 00000104 ffffffff 76e010ff kernel32!WaitForSingleObject+0x12 (FPO: [Non-Fpo])
WARNING: Stack unwind information not available. Following frames may be wrong.
0631ff58 7235290a 72377db8 76e0336a 72377db8 CKSee+0x107b
0631ff6c 777092b2 72377db8 676a43ca 00000000 CKSee!Kinkoo_GetInterface+0x13aa
0631ffac 77709285 72352900 72377db8 00000000 ntdll!__RtlUserThreadStart+0x70 (FPO: [Non-Fpo])
0631ffc4 00000000 72352900 72377db8 00000000 ntdll!_RtlUserThreadStart+0x1b (FPO: [Non-Fpo])

   2  Id: 11264.9238 Suspend: 1 Teb: 7efd7000 Unfrozen
ChildEBP RetAddr  Args to Child
064efbf4 77709e2e 00000120 00000000 00000000 ntdll!ZwWaitForSingleObject+0x15 (FPO: [3,0,0])
064efc58 77709d12 00000000 00000000 005819a0 ntdll!RtlpWaitOnCriticalSection+0x13e (FPO: [Non-Fpo])
064efc80 01140f78 0161cd68 00000000 00000000 ntdll!RtlEnterCriticalSection+0x150 (FPO: [Non-Fpo])
064efd58 014c22e3 003ef808 c5b8f482 00000000 LockDemon!MyFunc+0x48 (FPO: [Non-Fpo]) (CONV: stdcall) [e:\work\c++\test\lockdemon\lockdemon\lockdemondlg.cpp @ 79]
064efd94 014c2254 00000000 064efdac 76e0336a LockDemon!_callthreadstartex+0x53 (CONV: cdecl) [f:\dd\vctools\crt_bld\self_x86\crt\src\threadex.c @ 348]
064efda0 76e0336a 005819a0 064efdec 777092b2 LockDemon!_threadstartex+0xa4 (CONV: stdcall) [f:\dd\vctools\crt_bld\self_x86\crt\src\threadex.c @ 331]
064efdac 777092b2 005819a0 6715418a 00000000 kernel32!BaseThreadInitThunk+0xe (FPO: [Non-Fpo])
064efdec 77709285 014c21b0 005819a0 00000000 ntdll!__RtlUserThreadStart+0x70 (FPO: [Non-Fpo])
064efe04 00000000 014c21b0 005819a0 00000000 ntdll!_RtlUserThreadStart+0x1b (FPO: [Non-Fpo])

#  3  Id: 11264.1175c Suspend: 1 Teb: 7ef9f000 Unfrozen
ChildEBP RetAddr  Args to Child
0580fcf8 7776fb96 64db414e 00000000 00000000 ntdll!DbgBreakPoint (FPO: [0,0,0])
0580fd28 76e0336a 00000000 0580fd74 777092b2 ntdll!DbgUiRemoteBreakin+0x3c (FPO: [Non-Fpo])
0580fd34 777092b2 00000000 64db4112 00000000 kernel32!BaseThreadInitThunk+0xe (FPO: [Non-Fpo])
0580fd74 77709285 7776fb5a 00000000 00000000 ntdll!__RtlUserThreadStart+0x70 (FPO: [Non-Fpo])
0580fd8c 00000000 7776fb5a 00000000 00000000 ntdll!_RtlUserThreadStart+0x1b (FPO: [Non-Fpo])

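The output shows four threads, numbered 0 to 3. The number at the start of each thread header is the debugger's thread index, and the Id field has the form process ID.thread ID in hexadecimal (in 11264.1175c, 11264 is the process ID and 1175c is the thread ID). Going through every thread's stack one by one makes a deadlock hard to spot, though, so to list the locks involved directly, use the following command.

First, list the locked critical sections in the process with the !locks command:
0:003> !locks

This prints the lock information for the process: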

CritSec LockDemon!g_LockA+0 at 0161cd80
WaiterWoken        No
LockCount          1
RecursionCount     1
OwningThread       9238
EntryCount         0
ContentionCount    1
*** Locked

CritSec LockDemon!g_LockB+0 at 0161cd68
WaiterWoken        No
LockCount          1
RecursionCount     1
OwningThread       10f20
EntryCount         0
ContentionCount    1
*** Locked

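The first lock, g_LockA, is at address 0161cd80. LockCount reflects whether the lock is held and contended (how the raw value is encoded depends on the Windows version), RecursionCount is the number of times the owning thread has entered the lock (a thread that already holds a critical section can enter it again), and OwningThread is 9238. In other words, lock 0161cd80 is held by the thread whose ID is 9238, which is thread index 2 in process 11264. According to its stack, that thread holds lock 0161cd80 while waiting for lock 0161cd68.

The second lock, g_LockB, is at address 0161cd68 and its fields read the same way. OwningThread is 10f20, so lock 0161cd68 is held by the thread whose ID is 10f20, which is thread index 0 in process 11264. According to its stack, that thread holds lock 0161cd68 while waiting for lock 0161cd80.

That settles it: threads 0 and 2 are deadlocked on each other!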
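The reasoning above, "the thread waits for a lock, the lock is owned by another thread, which in turn waits for a lock...", amounts to following a chain until it returns to its starting point. Below is a minimal sketch of that check, assuming the owner of each lock (from !locks) and the lock each thread is blocked on (from the stacks) have already been copied into two maps by hand. The function names find_wait_cycle and demo_cycle are hypothetical; WinDbg does not provide them.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Follows "thread waits for lock -> lock is owned by thread" links starting
// at `start`. Returns the threads on the cycle if the chain comes back to
// `start`, or an empty vector otherwise.
std::vector<std::string> find_wait_cycle(
    const std::map<std::string, std::string>& lock_owner,   // lock address -> owning thread ID
    const std::map<std::string, std::string>& waiting_for,  // thread ID -> lock address it waits on
    const std::string& start) {
    std::vector<std::string> chain;
    std::string thread = start;
    // The chain cannot be longer than the number of waiting threads
    // unless it loops somewhere that does not include `start`.
    for (std::size_t step = 0; step <= waiting_for.size(); ++step) {
        chain.push_back(thread);
        auto wait = waiting_for.find(thread);
        if (wait == waiting_for.end()) return {};   // this thread is not blocked
        auto owner = lock_owner.find(wait->second);
        if (owner == lock_owner.end()) return {};   // nobody owns that lock
        thread = owner->second;
        if (thread == start) return chain;          // back at the start: deadlock
    }
    return {};
}

// The values from the demo session in this post, entered by hand.
std::vector<std::string> demo_cycle() {
    const std::map<std::string, std::string> lock_owner = {
        {"0161cd80", "9238"},    // g_LockA
        {"0161cd68", "10f20"}};  // g_LockB
    const std::map<std::string, std::string> waiting_for = {
        {"10f20", "0161cd80"},   // thread 0 waits for g_LockA
        {"9238", "0161cd68"}};   // thread 2 waits for g_LockB
    return find_wait_cycle(lock_owner, waiting_for, "10f20");
}
```

For the demo data, demo_cycle() returns the two thread IDs 10f20 and 9238, which are thread 0 and thread 2 in the listing.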

The columns of a stack listing are interpreted as follows:
   2  Id: 11264.9238 Suspend: 1 Teb: 7efd7000 Unfrozen
ChildEBP RetAddr  Args to Child
064efbf4 77709e2e 00000120 00000000 00000000 ntdll!ZwWaitForSingleObject+0x15 (FPO: [3,0,0])
064efc58 77709d12 00000000 00000000 005819a0 ntdll!RtlpWaitOnCriticalSection+0x13e (FPO: [Non-Fpo])
064efc80 01140f78 0161cd68 00000000 00000000 ntdll!RtlEnterCriticalSection+0x150 (FPO: [Non-Fpo])
064efd58 014c22e3 003ef808 c5b8f482 00000000 LockDemon!MyFunc+0x48 (FPO: [Non-Fpo]) (CONV: stdcall) [e:\work\c++\test\lockdemon\lockdemon\lockdemondlg.cpp @ 79]
064efd94 014c2254 00000000 064efdac 76e0336a LockDemon!_callthreadstartex+0x53 (CONV: cdecl) [f:\dd\vctools\crt_bld\self_x86\crt\src\threadex.c @ 348]
064efda0 76e0336a 005819a0 064efdec 777092b2 LockDemon!_threadstartex+0xa4 (CONV: stdcall) [f:\dd\vctools\crt_bld\self_x86\crt\src\threadex.c @ 331]
064efdac 777092b2 005819a0 6715418a 00000000 kernel32!BaseThreadInitThunk+0xe (FPO: [Non-Fpo])
064efdec 77709285 014c21b0 005819a0 00000000 ntdll!__RtlUserThreadStart+0x70 (FPO: [Non-Fpo])
064efe04 00000000 014c21b0 005819a0 00000000 ntdll!_RtlUserThreadStart+0x1b (FPO: [Non-Fpo])

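The third column is the first argument passed to the function named at the end of the line. For RtlEnterCriticalSection the first argument is the address of the critical section, so the thread with index 2 (thread ID 0x9238) is waiting for the lock at 0x0161cd68.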
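When the listing is long, the same column lookup can be done mechanically on saved debugger output. The helper below is a small sketch, assuming each frame is on its own line in the 32-bit kv layout shown above (ChildEBP, RetAddr, then three argument values). The name waited_lock_address is hypothetical.

```cpp
#include <sstream>
#include <string>

// Returns the third column (first "Args to Child" value) of a kv frame line
// if the frame is ntdll!RtlEnterCriticalSection, otherwise an empty string.
std::string waited_lock_address(const std::string& frame_line) {
    if (frame_line.find("ntdll!RtlEnterCriticalSection") == std::string::npos) {
        return "";  // not a frame that is entering a critical section
    }
    std::istringstream columns(frame_line);
    std::string child_ebp, ret_addr, first_arg;
    if (!(columns >> child_ebp >> ret_addr >> first_arg)) {
        return "";  // line does not have the expected columns
    }
    return first_arg;
}
```

Applied to the RtlEnterCriticalSection frame of thread 2 it yields 0161cd68, and applied to the corresponding frame of thread 0 it yields 0161cd80, matching the two addresses reported by !locks.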