Debugging Multithreaded Programs with GDB

Source: Internet  Editor: 程序博客网  Date: 2024/05/20 11:21

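I have been reading Debugging with GDB lately. After 300-odd pages I have picked up a thing or two, so I wrote a multithreaded program and debugged it for practice. Quite often a multithreaded program runs for a long time and then stops responding; the cause may be a deadlock, or a thread waiting on a condition variable. I ran into deadlocks back when I was developing games with Visual Studio. When debugging in Visual Studio we can click the "Break" button, which suspends the process so that we can see what each of its threads is doing. With gdb on the plain command line it is just as easy to inspect the state of every thread in a process.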

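We can debug by setting breakpoints. GDB has three kinds: breakpoints, watchpoints, and catchpoints. When one of them is hit, the program stops and gdb takes control. We can then use the thread command to switch between threads, and the frame and bt commands to inspect the call stack. Here is an example. The program has nine threads: one main thread and eight worker threads. The workers carry out the tasks and notify the main thread once all tasks are finished, after which the process exits.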
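For readers who want something to experiment with, here is a minimal sketch of a program with the same shape. It is not the original main.cpp. The names IsPrime, ThreadFunc, and g_TaskCount are taken from the GDB output in this post; Task, g_Tasks, g_NextTask, g_Mutex, g_AllDone, CalculatePrime, and RunAllTasks are assumptions made up for illustration, and the line numbers will not match the transcripts.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// One unit of work: sum the primes in [start, end). Hypothetical layout.
struct Task {
    unsigned long start;
    unsigned long end;
    unsigned long sum;
};

static std::vector<Task> g_Tasks;          // filled before the workers start
static std::size_t g_NextTask = 0;         // index of the next unclaimed task
static int g_TaskCount = 0;                // tasks that are not finished yet
static std::mutex g_Mutex;                 // guards g_NextTask and g_TaskCount
static std::condition_variable g_AllDone;  // notified when g_TaskCount hits 0

// Trial division by odd numbers up to sqrt(n).
bool IsPrime(unsigned long n)
{
    if (n < 2)
        return false;
    if (n == 2)
        return true;
    if (n % 2 == 0)
        return false;
    for (unsigned long i = 3; i * i <= n; i += 2)
        if (n % i == 0)
            return false;
    return true;
}

void CalculatePrime(Task& task)
{
    task.sum = 0;
    for (unsigned long n = task.start; n < task.end; ++n)
        if (IsPrime(n))
            task.sum += n;
}

// Worker: claim a task, compute it without the lock, then report completion.
void ThreadFunc()
{
    for (;;) {
        Task* task = nullptr;
        {
            std::lock_guard<std::mutex> lock(g_Mutex);
            if (g_NextTask == g_Tasks.size())
                return;                    // nothing left to claim
            task = &g_Tasks[g_NextTask++];
        }
        CalculatePrime(*task);

        std::lock_guard<std::mutex> lock(g_Mutex);
        --g_TaskCount;
        if (g_TaskCount == 0)
            g_AllDone.notify_one();        // wake the main thread
    }
}

// Main-thread side: start the workers, wait for the last task, join, add up.
unsigned long RunAllTasks(int workers, int tasks, unsigned long span)
{
    assert(workers > 0 && tasks >= 0);
    g_Tasks.clear();
    g_NextTask = 0;
    for (int i = 0; i < tasks; ++i)
        g_Tasks.push_back(Task{i * span, (i + 1) * span, 0});
    g_TaskCount = tasks;

    std::vector<std::thread> pool;
    for (int i = 0; i < workers; ++i)
        pool.emplace_back(ThreadFunc);
    {
        std::unique_lock<std::mutex> lock(g_Mutex);
        g_AllDone.wait(lock, [] { return g_TaskCount == 0; });
    }
    for (std::thread& t : pool)
        t.join();

    unsigned long total = 0;
    for (const Task& t : g_Tasks)
        total += t.sum;
    return total;
}
```

Calling RunAllTasks(8, 10, 100) from main gives eight workers plus the main thread, nine in all. Building with something like g++ -g -pthread keeps the debug information that gdb needs to show source lines and variable names.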

We start with a breakpoint. A breakpoint can apply to a single thread or to all threads. To make it take effect in one thread only, set it as shown below.

(gdb) break 56 thread 3
Breakpoint 4 at 0x8048c03: file main.cpp, line 56.
(gdb) info breakpoints
Num     Type           Disp Enb Address    What
3       breakpoint     keep y   0x08048fb8 in main(int, char**) at main.cpp:175
        breakpoint already hit 1 time
4       breakpoint     keep y   0x08048c03 in IsPrime(unsigned long) at main.cpp:56 thread 3
        stop only in thread 3
(gdb) c
Continuing.
[Switching to Thread 0xb74a8b40 (LWP 3611)]
Breakpoint 4, IsPrime (n=2000) at main.cpp:56
56    if (n == 2)
(gdb) info threads
  Id   Target Id         Frame
  9    Thread 0xb44a2b40 (LWP 3617) "ThreadTest" IsPrime (n=5) at main.cpp:56
  8    Thread 0xb4ca3b40 (LWP 3616) "ThreadTest" IsPrime (n=4001) at main.cpp:56
  7    Thread 0xb54a4b40 (LWP 3615) "ThreadTest" IsPrime (n=3002) at main.cpp:56
  6    Thread 0xb5ca5b40 (LWP 3614) "ThreadTest" 0x08048c07 in IsPrime (n=6000) at main.cpp:56
  5    Thread 0xb64a6b40 (LWP 3613) "ThreadTest" IsPrime (n=7003) at main.cpp:56
  4    Thread 0xb6ca7b40 (LWP 3612) "ThreadTest" 0x08048c07 in IsPrime (n=5000) at main.cpp:56
* 3    Thread 0xb74a8b40 (LWP 3611) "ThreadTest" IsPrime (n=2000) at main.cpp:56
  2    Thread 0xb7ca9b40 (LWP 3610) "ThreadTest" IsPrime (n=1004) at main.cpp:56
  1    Thread 0xb7cab700 (LWP 3580) "ThreadTest" 0xb7fdd424 in __kernel_vsyscall ()

The current thread is thread 3 (marked with *). Let's see what it is doing.

(gdb) thread 3
[Switching to thread 3 (Thread 0xb74a8b40 (LWP 3611))]
#0  IsPrime (n=2000) at main.cpp:56
56    if (n == 2)
(gdb) frame 0
#0  IsPrime (n=2000) at main.cpp:56
56    if (n == 2)
(gdb) bt
#0  IsPrime (n=2000) at main.cpp:56
#1  0x08048caf in CalCulatePrime (T=...) at main.cpp:84
#2  0x08048e23 in ThreadFunc () at main.cpp:133
#3  0xb7faef70 in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#4  0xb7d98bee in clone () from /lib/i386-linux-gnu/libc.so.6

We can look at the other threads in the same way:

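(gdb) thread 9
[Switching to thread 9 (Thread 0xb44a2b40 (LWP 3617))]
#0  IsPrime (n=2005357) at main.cpp:67
67    for (long i = 3; i <= tmp; i += 2)
(gdb) thread 8
[Switching to thread 8 (Thread 0xb4ca3b40 (LWP 3616))]
#0  IsPrime (n=1043099) at main.cpp:69
69        if (n % i == 0)

If we set a breakpoint without naming a thread, it applies to every thread: whichever thread reaches it stops the program, and in gdb's default all-stop mode the other threads are stopped along with it.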

(gdb) break 56
Breakpoint 5 at 0x8048c03: file main.cpp, line 56.
(gdb) c
Continuing.
[Switching to Thread 0xb5ca5b40 (LWP 3614)]
Breakpoint 5, IsPrime (n=487728) at main.cpp:56
56    if (n == 2)
(gdb) info threads
  Id   Target Id         Frame
  9    Thread 0xb44a2b40 (LWP 3617) "ThreadTest" IsPrime (n=2005358) at main.cpp:56
  8    Thread 0xb4ca3b40 (LWP 3616) "ThreadTest" IsPrime (n=1043100) at main.cpp:56
  7    Thread 0xb54a4b40 (LWP 3615) "ThreadTest" IsPrime (n=123060) at main.cpp:56
* 6    Thread 0xb5ca5b40 (LWP 3614) "ThreadTest" IsPrime (n=487728) at main.cpp:56
  5    Thread 0xb64a6b40 (LWP 3613) "ThreadTest" IsPrime (n=1236144) at main.cpp:56
  4    Thread 0xb6ca7b40 (LWP 3612) "ThreadTest" IsPrime (n=2396058) at main.cpp:56
  3    Thread 0xb74a8b40 (LWP 3611) "ThreadTest" 0x08048c6c in IsPrime (n=2066951) at main.cpp:69
  2    Thread 0xb7ca9b40 (LWP 3610) "ThreadTest" 0x08048c38 in IsPrime (n=226871) at main.cpp:66
  1    Thread 0xb7cab700 (LWP 3580) "ThreadTest" 0xb7fdd424 in __kernel_vsyscall ()
(gdb) c
Continuing.
[Switching to Thread 0xb74a8b40 (LWP 3611)]
Breakpoint 5, IsPrime (n=2066952) at main.cpp:56
56    if (n == 2)
(gdb) info threads
  Id   Target Id         Frame
  9    Thread 0xb44a2b40 (LWP 3617) "ThreadTest" IsPrime (n=2005358) at main.cpp:56
  8    Thread 0xb4ca3b40 (LWP 3616) "ThreadTest" IsPrime (n=1043100) at main.cpp:56
  7    Thread 0xb54a4b40 (LWP 3615) "ThreadTest" IsPrime (n=123060) at main.cpp:56
  6    Thread 0xb5ca5b40 (LWP 3614) "ThreadTest" IsPrime (n=487729) at main.cpp:56
  5    Thread 0xb64a6b40 (LWP 3613) "ThreadTest" IsPrime (n=1236144) at main.cpp:56
  4    Thread 0xb6ca7b40 (LWP 3612) "ThreadTest" IsPrime (n=2396058) at main.cpp:56
* 3    Thread 0xb74a8b40 (LWP 3611) "ThreadTest" IsPrime (n=2066952) at main.cpp:56
  2    Thread 0xb7ca9b40 (LWP 3610) "ThreadTest" 0x08048c38 in IsPrime (n=226871) at main.cpp:66
  1    Thread 0xb7cab700 (LWP 3580) "ThreadTest" 0xb7fdd424 in __kernel_vsyscall ()

A watchpoint monitors the value of a variable in the program, for example a global variable, and stops the program when that value changes. A watchpoint can also be restricted to one thread (the gdb manual notes that this only works with hardware watchpoints). To make a watchpoint apply only to thread 5, enter the command below.

(gdb) watch g_TaskCount thread 5
Hardware watchpoint 6: g_TaskCount
(gdb) info breakpoints
Num     Type           Disp Enb Address    What
3       breakpoint     keep y   0x08048fb8 in main(int, char**) at main.cpp:175
        breakpoint already hit 1 time
5       breakpoint     keep y   0x08048c03 in IsPrime(unsigned long) at main.cpp:56
        breakpoint already hit 2 times
6       hw watchpoint  keep y              g_TaskCount thread 5
        stop only in thread 5

After running for a while, the program stops in thread 5.

(gdb) c
Continuing.
ThreadID: 3066723136, Start: 23000, End: 2400000, MinResult: 23003, MaxResult: 2399993, SumResult: 203082982914
ThreadID: 3024759616, Start: 24000, End: 2500000, MinResult: 24001, MaxResult: 2499997, SumResult: 219667603446
ThreadID: 3075115840, Start: 22000, End: 2300000, MinResult: 22003, MaxResult: 2299963, SumResult: 187124976455
ThreadID: 3033152320, Start: 25000, End: 2600000, MinResult: 25013, MaxResult: 2599999, SumResult: 237026367884
ThreadID: 3058330432, Start: 26000, End: 2700000, MinResult: 26003, MaxResult: 2699999, SumResult: 254952179999
[Switching to Thread 0xb64a6b40 (LWP 3613)]
Hardware watchpoint 6: g_TaskCount
Old value = 74
New value = 73
ThreadFunc () at main.cpp:141
141        if (g_TaskCount == 0)
(gdb) info threads
  Id   Target Id         Frame
  9    Thread 0xb44a2b40 (LWP 3617) "ThreadTest" IsPrime (n=1748894) at main.cpp:55
  8    Thread 0xb4ca3b40 (LWP 3616) "ThreadTest" 0x08048c6c in IsPrime (n=532799) at main.cpp:69
  7    Thread 0xb54a4b40 (LWP 3615) "ThreadTest" 0x08048c6c in IsPrime (n=2223211) at main.cpp:69
  6    Thread 0xb5ca5b40 (LWP 3614) "ThreadTest" 0x08048c6c in IsPrime (n=2350409) at main.cpp:69
* 5    Thread 0xb64a6b40 (LWP 3613) "ThreadTest" ThreadFunc () at main.cpp:141
  4    Thread 0xb6ca7b40 (LWP 3612) "ThreadTest" 0x08048c6c in IsPrime (n=2322967) at main.cpp:69
  3    Thread 0xb74a8b40 (LWP 3611) "ThreadTest" 0x08048c6c in IsPrime (n=1896313) at main.cpp:69
  2    Thread 0xb7ca9b40 (LWP 3610) "ThreadTest" IsPrime (n=1856861) at main.cpp:69
  1    Thread 0xb7cab700 (LWP 3580) "ThreadTest" 0xb7fdd424 in __kernel_vsyscall ()
(gdb) thread 9
[Switching to thread 9 (Thread 0xb44a2b40 (LWP 3617))]
#0  IsPrime (n=1748894) at main.cpp:55
55    {
(gdb) thread 5
[Switching to thread 5 (Thread 0xb64a6b40 (LWP 3613))]
#0  ThreadFunc () at main.cpp:141
141        if (g_TaskCount == 0)

A catchpoint stops the program when a particular event occurs, such as a signal, an exception, fork, vfork, or exec. Take signals as an example, and suppose we want to catch SIGINT, that is, stop the program when CTRL+C is pressed. By default, pressing CTRL+C while debugging already stops the program. We can check this with info signals:

(gdb) info signals
Signal        Stop      Print   Pass to program Description
SIGHUP        Yes       Yes     Yes             Hangup
SIGINT        Yes       Yes     No              Interrupt
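To see what the "Pass to program" column means in practice, it helps to have a program that reacts to SIGINT itself. The following is a small sketch, not part of the original test program; the names g_Interrupted, OnSigint, InstallSigintHandler, and WasInterrupted are made up for illustration.

```cpp
#include <csignal>

// Written from the handler; sig_atomic_t is the integer type that is safe
// to assign inside a signal handler.
static volatile std::sig_atomic_t g_Interrupted = 0;

// The handler only records that SIGINT arrived.
extern "C" void OnSigint(int /*signum*/)
{
    g_Interrupted = 1;
}

// Returns true if the handler was installed successfully.
bool InstallSigintHandler()
{
    return std::signal(SIGINT, OnSigint) != SIG_ERR;
}

// The rest of the program can poll this to find out about CTRL+C.
bool WasInterrupted()
{
    return g_Interrupted != 0;
}
```

Run outside the debugger, CTRL+C sets the flag. Under gdb with the settings shown in the table (Stop: Yes, Pass to program: No), my understanding is that gdb stops on SIGINT and does not deliver it, so the handler would not run unless the handling is changed, for example with handle SIGINT pass.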
To set the catchpoint, enter the command below and then list the breakpoints again.

(gdb) catch signal SIGINT
Catchpoint 7 (signal SIGINT)
(gdb) info breakpoints
Num     Type           Disp Enb Address    What
3       breakpoint     keep y   0x08048fb8 in main(int, char**) at main.cpp:175
        breakpoint already hit 1 time
6       hw watchpoint  keep y              g_TaskCount thread 5
        stop only in thread 5
        breakpoint already hit 1 time
7       catchpoint     keep y              signal "SIGINT"

The catchpoint is now set, and pressing CTRL+C will stop the program.

(gdb) delete 3 6
(gdb) info breakpoints
Num     Type           Disp Enb Address    What
7       catchpoint     keep y              signal "SIGINT"
(gdb) c
Continuing.
ThreadID: 3049937728, Start: 27000, End: 2800000, MinResult: 27011, MaxResult: 2799991, SumResult: 273421308155
ThreadID: 3066723136, Start: 30000, End: 3100000, MinResult: 30011, MaxResult: 3099997, SumResult: 332787964475
ThreadID: 3083508544, Start: 28000, End: 2900000, MinResult: 28001, MaxResult: 2899997, SumResult: 292644787390
ThreadID: 3075115840, Start: 32000, End: 3300000, MinResult: 32003, MaxResult: 3299969, SumResult: 375684874646
ThreadID: 3041545024, Start: 29000, End: 3000000, MinResult: 29009, MaxResult: 2999999, SumResult: 312428107775
ThreadID: 3024759616, Start: 31000, End: 3200000, MinResult: 31013, MaxResult: 3199997, SumResult: 353944240421
ThreadID: 3033152320, Start: 33000, End: 3400000, MinResult: 33013, MaxResult: 3399997, SumResult: 397922227323
^C
[Switching to Thread 0xb7cab700 (LWP 3580)]
Catchpoint 7 (signal SIGINT), 0xb7fdd424 in __kernel_vsyscall ()

The program did indeed stop at the catchpoint.

That is how I debug multithreaded programs with gdb. Comments and corrections are welcome.






