WinDbg User-Mode Program Debugging and Analysis

Source: Internet | Editor: 程序博客网 | Date: 2024/04/29 08:51

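Basic WinDbg usage

Setting up symbols:

1) Open File -> Symbol File Path and enter "SRV*D:\ProgramFiles\localsymbols*http://msdl.microsoft.com/download/symbols" (the equivalent command is ".sympath SRV*D:\ProgramFiles\localsymbols*http://msdl.microsoft.com/download/symbols").

Here D:\ProgramFiles\localsymbols is the local directory where downloaded symbols are cached.

2) After setting the path, close WinDbg and reopen it.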


Example: analyzing a memory leak
•Suppose we have written the following test program:

#include <windows.h>
#include <tchar.h>

void AllocateMemory()
{
  int* a = new int[100];  // never freed: this is the deliberate leak
  Sleep(100);
}

int _tmain(int argc, _TCHAR* argv[])
{
  while (1)
  {
    AllocateMemory();
  }
  return 0;
}

•Run the program, then in WinDbg choose File -> Attach to a Process and select the process we want.

•Now type the following in the command window: !heap -s
  Heap     Flags   Reserv  Commit  Virt   Free  List   UCR  Virt  Lock  Fast
                    (k)     (k)    (k)     (k) length      blocks cont. heap
-----------------------------------------------------------------------------
00150000 58000062    1024     12     12      0     0     1    0      0   L
00250000 58001062      64     24     24     15     1     1    0      0   L
00260000 58008060      64     12     12     10     1     1    0      0
00380000 58001062    1088     88     88      0     0     1    0      0   L
-----------------------------------------------------------------------------

(For help on a command, use .hh keyword or Help -> Contents.)

•Now let the program continue running: g
•After breaking in again, type once more: !heap -s

  Heap     Flags   Reserv  Commit  Virt   Free  List   UCR  Virt  Lock  Fast
                    (k)     (k)    (k)     (k) length      blocks cont. heap
-----------------------------------------------------------------------------
00150000 58000062    1024     12     12      0     0     1    0      0   L
00250000 58001062      64     24     24     15     1     1    0      0   L
00260000 58008060      64     12     12     10     1     1    0      0
00380000 58001062    1088    120    120      1     1     1    0      0   L

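•Comparing the two runs, the Commit and Virt values of heap 00380000 grew from 88 to 120 (these two cells were highlighted in red in the original). That heap is our suspect, so next type: !heap -stat -h 00380000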

    size     #blocks     total     ( %) (percent of total busy bytes)
    190 185 - 25fd0  (93.90)
    800 2 - 1000  (2.47)
    214 2 - 428  (0.64)
    220 1 - 220  (0.33)
    18c 1 - 18c  (0.24)
    80 3 - 180  (0.23)
    14e 1 - 14e  (0.20)
    68 2 - d0  (0.13)
    c6 1 - c6  (0.12)
    24 5 - b4  (0.11)
    a7 1 - a7  (0.10)
    50 2 - a0  (0.10)
    28 4 - a0  (0.10)
    92 1 - 92  (0.09)
    8a 1 - 8a  (0.08)
    82 1 - 82  (0.08)
    7c 1 - 7c  (0.07)
    34 2 - 68  (0.06)
    62 1 - 62  (0.06)
    5e 1 - 5e  (0.06)

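•The first row (size 0x190, 0x185 blocks, 93.90% of busy bytes; highlighted in red in the original) is the suspicious allocation. If we still cannot be sure this is the leaking size, run g once more and check whether its share has changed.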
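As a quick sanity check on this reading, the numbers in the first row of the !heap -stat output can be tied back to the test program. The sketch below assumes a 4-byte int (true for the 32-bit build shown here, but an assumption in general); the constant and function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Values copied from the "190 185 - 25fd0" row of the !heap -stat output.
constexpr std::size_t kBlockSize  = 0x190;    // bytes per block
constexpr std::size_t kBlockCount = 0x185;    // number of busy blocks
constexpr std::size_t kTotalBytes = 0x25fd0;  // total bytes for this size

// Assumption: int is 4 bytes, as on 32-bit Windows builds.
constexpr std::size_t kAssumedIntSize = 4;

// Bytes requested by "new int[count]" under the assumption above.
constexpr std::size_t BytesForIntArray(std::size_t count) {
    return count * kAssumedIntSize;
}

// new int[100] asks for 400 bytes, which is exactly 0x190.
static_assert(BytesForIntArray(100) == kBlockSize, "size matches new int[100]");
// 0x185 blocks of 0x190 bytes add up to the reported total.
static_assert(kBlockSize * kBlockCount == kTotalBytes, "row is self-consistent");
```

Both checks pass at compile time, which supports the suspicion that the 0x190-byte blocks come from the new int[100] call in AllocateMemory.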
•Now list the blocks of that size by typing: !heap -flt s 190

    HEAP_ENTRY Size Prev Flags    UserPtr UserSize - state
      01429110 0035 0035  [07]   01429118    00190 - (busy)
      014292b8 0035 0035  [07]   014292c0    00190 - (busy)
      01429460 0035 0035  [07]   01429468    00190 - (busy)
      01429608 0035 0035  [07]   01429610    00190 - (busy)
      014297b0 0035 0035  [07]   014297b8    00190 - (busy)
      01429958 0035 0035  [07]   01429960    00190 - (busy)
      01429b00 0035 0035  [07]   01429b08    00190 - (busy)
      01429ca8 0035 0035  [07]   01429cb0    00190 - (busy)
      01429e50 0036 0035  [07]   01429e58    00190 - (busy)
      0142a000 0035 0036  [07]   0142a008    00190 - (busy)
      0142a1a8 0035 0035  [07]   0142a1b0    00190 - (busy)
      0142a350 0035 0035  [07]   0142a358    00190 - (busy)
      0142a4f8 0035 0035  [07]   0142a500    00190 - (busy)
      0142a6a0 0035 0035  [07]   0142a6a8    00190 - (busy)
      0142a848 0035 0035  [07]   0142a850    00190 - (busy)
      0142a9f0 0035 0035  [07]   0142a9f8    00190 - (busy)
      0142ab98 0035 0035  [07]   0142aba0    00190 - (busy)
      0142ad40 0035 0035  [07]   0142ad48    00190 - (busy)
      0142aee8 0035 0035  [07]   0142aef0    00190 - (busy)

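•Now pick one of these entries and type: !heap -p -a 01429110 (the sample output below was captured for the entry at 014297b0; any of the listed blocks shows the same stack)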

    HEAP_ENTRY Size Prev Flags    UserPtr UserSize - state
      014297b0 0035 0000  [07]   014297b8    00190 - (busy)
      Trace: 0045
      7c98d6dc ntdll!RtlDebugAllocateHeap+0x000000e1
      7c969d18 ntdll!RtlAllocateHeapSlowly+0x00000044
      7c93b298 ntdll!RtlAllocateHeap+0x00000e64
      78583a58 MSVCR90!malloc+0x00000079
      78583b58 MSVCR90!operator new+0x0000001f
      40101a MemLeak!wmain+0x0000001a
      7c816ff7 kernel32!BaseProcessStart+0x00000023

    Now we can see the allocation call stack and go analyze the suspicious code.
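Once the stack points at the new int[100] call, one possible fix is to let an owning object release the buffer. The following is only a sketch, not the original author's fix: AllocateMemoryFixed is a hypothetical name, and the Sleep call is left out to keep the example portable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical replacement for AllocateMemory(): the vector owns the
// buffer and frees it when the function returns, so nothing leaks.
std::size_t AllocateMemoryFixed() {
    std::vector<int> a(100);  // same 100 ints as "new int[100]"
    a[0] = 1;                 // use the buffer as the original code might
    return a.size();          // the buffer is released automatically here
}
```

Calling AllocateMemoryFixed() in the same while loop should leave the Commit column of !heap -s stable between two samples, which is an easy way to confirm the fix with the same commands used above.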

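•One detail was skipped earlier, and without it the allocation stack cannot be displayed: before starting the program, enable user-mode stack trace collection with: gflags.exe -i MemLeak.exe +ust

    To turn the flag off again: gflags.exe -i MemLeak.exe -ust

         Reference: http://www.codeproject.com/KB/cpp/MemoryLeak.aspx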


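•What if the leak is in a virtual block?
ØRun !heap -p -all twice and compare the outputs to see which virtual block is new.

      invalid allocation size, possible heap corruption

     04ff0018 40004 0004  [0b]   04ff0020   200024 - (busy VirtualAlloc)

ØThen run !heap -p -a 04ff0018 to get the allocation stack we are after.
ØLooking back at what !heap -s reported earlier:

      Virtual block: 04ff0000 - 04ff0000 (size 00000000)

      So the heap entry (04ff0018) sits at an offset of 0x18 bytes from the virtual block address (04ff0000).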


Deadlock analysis
#include <windows.h>
#include <tchar.h>

#define MAX_THREADS  4

class CriticalSection
{
public:
  CriticalSection()
  {
    InitializeCriticalSection(&m_criticalSection);
  }
  ~CriticalSection()
  {
    DeleteCriticalSection(&m_criticalSection);
  }
  inline void Lock()
  {
    EnterCriticalSection(&m_criticalSection);
  }
  inline void Unlock()
  {
    LeaveCriticalSection(&m_criticalSection);
  }
  inline CRITICAL_SECTION* getCriticalSection()
  {
    return &m_criticalSection;
  }
private:
  CRITICAL_SECTION  m_criticalSection;
};

CriticalSection g_oneLock1;
CriticalSection g_oneLock2;

// Takes lock 1 first, then lock 2.
DWORD WINAPI MyThread1(LPVOID lpParam)
{
  g_oneLock1.Lock();
  Sleep(1000);
  g_oneLock2.Lock();
  Sleep(1000);
  g_oneLock1.Unlock();
  g_oneLock2.Unlock();
  return 0;
}

// Takes lock 2 first, then lock 1: the opposite order, which sets up the deadlock.
DWORD WINAPI MyThread2(LPVOID lpParam)
{
  g_oneLock2.Lock();
  Sleep(800);
  g_oneLock1.Lock();
  Sleep(800);
  g_oneLock2.Unlock();
  g_oneLock1.Unlock();
  return 0;
}

int _tmain(int argc, _TCHAR* argv[])
{
  DWORD dwThreadId[MAX_THREADS];
  HANDLE hThread[MAX_THREADS];
  int i;

  // Create MAX_THREADS worker threads, alternating MyThread1 and MyThread2.
  for (i = 0; i < MAX_THREADS; i++)
  {
    if (0 == i % 2)
    {
      hThread[i] = CreateThread(
        NULL,              // default security attributes
        0,                 // use default stack size
        MyThread1,         // thread function
        NULL,              // argument to thread function
        0,                 // use default creation flags
        &dwThreadId[i]);   // returns the thread identifier
    }
    else
    {
      hThread[i] = CreateThread(
        NULL,              // default security attributes
        0,                 // use default stack size
        MyThread2,         // thread function
        NULL,              // argument to thread function
        0,                 // use default creation flags
        &dwThreadId[i]);   // returns the thread identifier
    }
  }

  WaitForMultipleObjects(MAX_THREADS, hThread, TRUE, INFINITE);

  for (i = 0; i < MAX_THREADS; i++)
  {
    CloseHandle(hThread[i]);
  }
  return 0;
}

•Run the program under WinDbg, or attach to it, and type: ~*kv

   1  Id: fb0.fe0 Suspend: 1 Teb: 7ffdc000 Unfrozen
ChildEBP RetAddr  Args to Child
0060ff10 7c92e9c0 7c93901b 000007cc 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])
0060ff14 7c93901b 000007cc 00000000 00000000 ntdll!ZwWaitForSingleObject+0xc (FPO: [3,0,0])
0060ff9c 7c92104b 00403370 00401073 00403370 ntdll!RtlpWaitForCriticalSection+0x132 (FPO: [Non-Fpo])
0060ffa4 00401073 00403370 00020290 00000002 ntdll!RtlEnterCriticalSection+0x46 (FPO: [1,0,0])
0060ffb4 7c80b6a3 00000000 00020290 00000002 deadLock!MyThread2+0x23 (FPO: [1,0,2]) (CONV: stdcall) [e:\test\deadlock\deadlock\deadlock.cpp @ 60]
0060ffec 00000000 00401050 00000000 00000000 kernel32!BaseThreadStart+0x37 (FPO: [Non-Fpo])

   2  Id: fb0.e3c Suspend: 1 Teb: 7ffda000 Unfrozen
ChildEBP RetAddr  Args to Child
0080ff10 7c92e9c0 7c93901b 000007d0 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])
0080ff14 7c93901b 000007d0 00000000 00000000 ntdll!ZwWaitForSingleObject+0xc (FPO: [3,0,0])
0080ff9c 7c92104b 00403388 0040105f 00403388 ntdll!RtlpWaitForCriticalSection+0x132 (FPO: [Non-Fpo])
0080ffa4 0040105f 00403388 00020290 00000002 ntdll!RtlEnterCriticalSection+0x46 (FPO: [1,0,0])
0080ffb4 7c80b6a3 00000000 00020290 00000002 deadLock!MyThread2+0xf (FPO: [1,0,2]) (CONV: stdcall) [e:\test\deadlock\deadlock\deadlock.cpp @ 58]
0080ffec 00000000 00401050 00000000 00000000 kernel32!BaseThreadStart+0x37 (FPO: [Non-Fpo])

   3  Id: fb0.e54 Suspend: 1 Teb: 7ffdb000 Unfrozen
ChildEBP RetAddr  Args to Child
0070ff10 7c92e9c0 7c93901b 000007d0 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])
0070ff14 7c93901b 000007d0 00000000 00000000 ntdll!ZwWaitForSingleObject+0xc (FPO: [3,0,0])
0070ff9c 7c92104b 00403388 00401023 00403388 ntdll!RtlpWaitForCriticalSection+0x132 (FPO: [Non-Fpo])
0070ffa4 00401023 00403388 00020290 00000002 ntdll!RtlEnterCriticalSection+0x46 (FPO: [1,0,0])
0070ffb4 7c80b6a3 00000000 00020290 00000002 deadLock!MyThread1+0x23 (FPO: [1,0,2]) (CONV: stdcall) [e:\test\deadlock\deadlock\deadlock.cpp @ 49]
0070ffec 00000000 00401000 00000000 00000000 kernel32!BaseThreadStart+0x37 (FPO: [Non-Fpo])

   4  Id: fb0.c64 Suspend: 1 Teb: 7ffde000 Unfrozen
ChildEBP RetAddr  Args to Child
0050ff10 7c92e9c0 7c93901b 000007cc 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])
0050ff14 7c93901b 000007cc 00000000 00000000 ntdll!ZwWaitForSingleObject+0xc (FPO: [3,0,0])
0050ff9c 7c92104b 00403370 0040100f 00403370 ntdll!RtlpWaitForCriticalSection+0x132 (FPO: [Non-Fpo])
0050ffa4 0040100f 00403370 00020290 00000002 ntdll!RtlEnterCriticalSection+0x46 (FPO: [1,0,0])
0050ffb4 7c80b6a3 00000000 00020290 00000002 deadLock!MyThread1+0xf (FPO: [1,0,2]) (CONV: stdcall) [e:\test\deadlock\deadlock\deadlock.cpp @ 47]
0050ffec 00000000 00401000 00000000 00000000 kernel32!BaseThreadStart+0x37 (FPO: [Non-Fpo])

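•In the listing above, the critical section address is the first argument on each ntdll!RtlEnterCriticalSection frame (highlighted in red in the original). Two distinct values appear, 00403370 and 00403388. Next type: !cs 00403388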

         Critical section   = 0x00403388 (deadLock!g_oneLock2+0x0)
         DebugInfo          = 0x7c99c900
         LOCKED
         LockCount          = 0x2
         OwningThread       = 0x00000fe0
         RecursionCount     = 0x1
         LockSemaphore      = 0x7D0
         SpinCount          = 0x00000000

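         This shows that the lock is owned by thread ID 0x00000fe0. RecursionCount 0x1 means the owner has entered it once, and the LockCount of 0x2 is consistent with the two other threads (fb0.e3c and fb0.e54) that the ~*kv output shows waiting on 00403388. Next, find the owning thread in the ~*kv output above: it is thread 1 (Id fb0.fe0), and it is itself blocked, waiting on 00403370. So we type: !cs 00403370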

        Critical section   = 0x00403370 (deadLock!g_oneLock1+0x0)
        DebugInfo          = 0x7c99c8e0
        LOCKED
        LockCount          = 0x2
        OwningThread       = 0x00000e54
        RecursionCount     = 0x1
        LockSemaphore      = 0x7CC
        SpinCount          = 0x00000000

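        Likewise, this lock is held by thread 0x00000e54, which in turn is waiting on 00403388 (owned by 0x00000fe0). The two threads depend on each other, and that is the deadlock. (With only a minidump, !cs cannot provide this extra detail and we have to rely on the call stacks alone. With a full dump file, !cs works.)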
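The root cause is that MyThread1 and MyThread2 take the two locks in opposite order. One common remedy is to acquire both locks together. The sketch below is a portable illustration using std::mutex and std::lock rather than the CRITICAL_SECTION wrapper from the sample; the names Worker1, Worker2, RunWorkers, and g_completed are hypothetical, and the Sleep calls are omitted.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

std::mutex g_lock1;   // stands in for g_oneLock1
std::mutex g_lock2;   // stands in for g_oneLock2
int g_completed = 0;  // only modified while both locks are held

// std::lock acquires both mutexes using a deadlock-avoidance algorithm,
// so the order in which they are named no longer matters.
void Worker1() {
    std::lock(g_lock1, g_lock2);
    std::lock_guard<std::mutex> a(g_lock1, std::adopt_lock);
    std::lock_guard<std::mutex> b(g_lock2, std::adopt_lock);
    ++g_completed;
}

void Worker2() {
    std::lock(g_lock2, g_lock1);  // opposite naming order, still safe
    std::lock_guard<std::mutex> a(g_lock2, std::adopt_lock);
    std::lock_guard<std::mutex> b(g_lock1, std::adopt_lock);
    ++g_completed;
}

// Starts `count` threads, alternating the two workers like the sample
// program, waits for all of them, and returns how many finished.
int RunWorkers(int count) {
    g_completed = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < count; ++i) {
        threads.emplace_back(i % 2 == 0 ? Worker1 : Worker2);
    }
    for (std::thread& t : threads) {
        t.join();
    }
    return g_completed;
}
```

RunWorkers(4) mirrors the MAX_THREADS = 4 setup above. With the Win32 wrapper, the simpler alternative is to make both thread functions call g_oneLock1.Lock() before g_oneLock2.Lock().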

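Getting a dump file
•adplus.vbs can capture dump files in two modes, -hang or -crash (for help: adplus.vbs -help). In -hang mode the running program is left undisturbed; -crash mode captures a dump when the program crashes on an exception.

    adplus.vbs -hang -pn deadlock.exe -o h:\hang

        adplus.vbs -crash -p 2924 -o h:\crash

        Parameters: -pn process name; -p process ID; -o output path.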


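Generating a dump file from WinDbg

  While debugging, type in the command window: .dump /f h:\dump.dmp

          Parameters: /f requests a full dump (omit /f to capture a minidump); h:\dump.dmp is the output path.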

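Writing your own exception handler to generate a minidump (sample code testExc is in the attachment)
•gflags can turn on many settings that help when debugging a program

         For example: gflags.exe -i memSlop.exe +hpa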

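#include <tchar.h>

void functionOne()
{
  char* p = new char[1024];
  p[1024] = 1;  // deliberate bug: writes one byte past the end of the buffer
}

void functionTwo()
{
  char* p = new char[1024];
  delete[] p;
  delete[] p;  // deliberate bug: double free
}

int _tmain(int argc, _TCHAR* argv[])
{
  functionOne();
  functionTwo();
  return 0;
}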

With +hpa enabled, the code above will crash in a release build as well. For the mechanism behind this, see how page heap works.
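For comparison, here is a sketch of how the two defects that page heap exposes above could be avoided in the first place. It is an illustration only: WriteChecked and FreeOnce are hypothetical names, not part of the original sample.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Counterpart of functionOne(): vector::at() checks the index and throws
// std::out_of_range instead of writing past the end of the buffer.
bool WriteChecked(std::size_t index) {
    std::vector<char> p(1024);
    try {
        p.at(index) = 1;
        return true;
    } catch (const std::out_of_range&) {
        return false;  // index 1024 ends up here
    }
}

// Counterpart of functionTwo(): unique_ptr frees the array exactly once;
// after reset() it holds nullptr, so a second reset() does nothing.
bool FreeOnce() {
    std::unique_ptr<char[]> p(new char[1024]);
    p.reset();
    p.reset();  // safe: nothing left to delete
    return p == nullptr;
}
```

WriteChecked(1024) reports the out-of-range write as a normal error path, and FreeOnce() shows that an owning pointer makes the double delete harmless.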

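•When the program runs unattended and we do not know when a problem will occur, what can we do? Nobody can stand by 24 hours a day. One answer is to capture a minidump when an exception is raised; another is to combine debugger commands such as kb with a .bat file to print enough information for later analysis.

             Write a .bat file with the following content:

                  set outFile=d:\out%1.txt

                 "C:\Program Files\Debugging Tools for Windows (x86)\cdb" -pv -pn NetDiskServer.exe -logo %outFile% -lines -c "~*kv;q"

   When a problem occurs, the program can call ShellExecute(NULL, _T("open"), _T("run.bat"), sPara, sPath, SW_SHOWNORMAL); to capture the state of every thread in the running process.

        Any WinDbg command can be used in the -c string.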
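If the batch file is not wanted, the same command line can be assembled inside the program and handed to whatever process-launching call is in use. The helper below is a hypothetical sketch (BuildCdbCommand is not an existing API); it uses only the cdb switches that appear in the batch file above and does no escaping of its arguments.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the cdb command line from the batch file.
//   cdbPath          - full path to cdb (quoted, since it may contain spaces)
//   processName      - image name passed to -pn
//   logFile          - output file passed to -logo
//   debuggerCommands - command string passed to -c, e.g. "~*kv;q"
std::string BuildCdbCommand(const std::string& cdbPath,
                            const std::string& processName,
                            const std::string& logFile,
                            const std::string& debuggerCommands) {
    return "\"" + cdbPath + "\" -pv -pn " + processName +
           " -logo " + logFile + " -lines -c \"" + debuggerCommands + "\"";
}
```

For example, BuildCdbCommand("cdb", "NetDiskServer.exe", "d:\\out1.txt", "~*kv;q") yields the same switches as the batch file, with the thread-stack dump followed by q so that cdb exits when it is done.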


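•Debugging tools only assist with debugging; they are not the best way to solve problems in code. Good programming habits are what really count. Debugging and analyzing problems in a large service takes a huge amount of time, and there is no guarantee the problem can be reproduced.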


