strace学习

来源:互联网 发布:原始对偶知乎 编辑:程序博客网 时间:2024/06/05 10:39

关于strace的一点东西


好久没写博客了,感觉有点惭愧,觉得自己也应该静下心来利用自己可分配的时间去提升自己.
虽然最近在看一些Python的东西,可是觉得自己还是不能忘记本行啊,Linux C的一些东西必须一直好好温习.
不废话,今天打算写一个简单的东西, strace , 应该都对它或多或少的知道一些.很多时候除了core dump, backtrace的方式去排查程序的问题外, strace 跟踪调试也是不错的选择.

  看一小段测试code:

  /*======================================================================   *   *       Filename:  strace_use_test.c   *        Version:  1.0   *        Created:  2014年07月19日 09时40分49秒   *       Revision:  none   *       Compiler:  clang   *         Author:  sim szm, xianszm007@gmail.com   *        Company:  Class 1107 of Computer Science and Technology   *   * ======================================================================   */    #include <unistd.h>  #include <assert.h>  #include <stdio.h>  #include <stdlib.h>  #include <sys/types.h>  #include <fcntl.h>  #include <sys/stat.h>    int main(int argc, char *argv[])  {          int fd,i=0;          fd=open("/tmp/non_file",O_RDONLY);          if(fd<0)                i=2;          else                i=1;          return i;  }
编译: gcc strace_use_test.c -o strace_use
  生成跟踪文件: strace -o strace_use.txt ./strace_use (其实就是跟踪结果重定向到文件)
  我们来看下对应代码生成的strace_dump信息:
  szm@szm-Lenovo:~/some_code$ cat strace_use.txt   (1) execve("./strace_use", ["./strace_use"], [/* 74 vars */]) = 0  (2) brk(0)                                  = 0x8a92000  (3) uname({sys="Linux", node="szm-Lenovo", ...}) = 0  (4) access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)  (5) mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb775d000  (6) access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)  (7) open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3  (8) fstat64(3, {st_mode=S_IFREG|0644, st_size=131876, ...}) = 0  (9)mmap2(NULL, 131876, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb773c000  (10) close(3)                                = 0  (11) access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)  (12) open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3  (13) read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000\226\1\0004\0\0\0"..., 512) = 512  (14) fstat64(3, {st_mode=S_IFREG|0755, st_size=1730024, ...}) = 0  (15) mmap2(NULL, 1743580, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb7592000  (16) mprotect(0xb7735000, 4096, PROT_NONE)   = 0  (17) mmap2(0xb7736000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a3) = 0xb7736000  (18) mmap2(0xb7739000, 10972, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7739000  (19) close(3)                                = 0  (20) mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7591000  (21) set_thread_area({entry_number:-1 -> 6, base_addr:0xb75916c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0  (22) mprotect(0xb7736000, 8192, PROT_READ)   = 0  (23) mprotect(0x8049000, 4096, PROT_READ)    = 0  (24) mprotect(0xb777f000, 4096, PROT_READ)   = 0  (25) munmap(0xb773c000, 131876)              = 0  (26) open("/tmp/non_file", O_RDONLY)         = -1 ENOENT (No such file or directory)  (27) exit_group(2)                           = ?
(当然,标号是我自己编辑上去的,哈哈)

看到这一堆,也许你会看到一些熟悉的函数调用,但是每行干嘛的估计是不是有点迷惑.所以我们来说说每一行都在干什么.其实更多的就是每一行对应了一个系统调用:(下面的一些系统地址是我自己机子,换台机子肯定不同),差不多格式就是[系统调用的名( 参数... ) = 返回值 错误标志和描述] 这种样子,还是看看啥意思吧.
   1: 对于命令行下执行的程序,execve(或exec系列调用中的某一个)均为strace输出系统调用中的第一个。strace首先调用fork或clone函数新建一个子进程,然后在子进程中调用exec载入需要执行的程序(这里为./strace_use)
   2: 以0作为参数调用brk,返回值为内存管理的起始地址(若在子进程中调用malloc,则从0x8a92000地址开始分配空间)
   3: 调用access函数检验/etc/ld.so.nohwcap是否存在
   5: 使用mmap2函数进行匿名内存映射,以此来获取8192bytes内存空间,该空间起始地址为0xb775d000,匿名内存映射就是不涉及具体的文件名,避免了文件的创建及打开,很显然只能用于具有亲缘关系的进程间通信(这个解释也许有点简单,google一下吧)
   7: 调用open函数尝试打开/etc/ld.so.cache文件,返回文件描述符为3
   8: fstat64函数获取/etc/ld.so.cache文件信息
   9: 调用mmap2函数将/etc/ld.so.cache文件映射至内存
   10: close关闭文件描述符为3指向的/etc/ld.so.cache文件
   13: 调用read,从/lib/i386-linux-gnu/libc.so.6该libc库文件中读取512bytes,即读取ELF头信息
   16: 使用mprotect函数对0xb7735000,起始的4096bytes空间进行保护(PROT_NONE参数就是不能访问,对应还有PROT_READ表示可以读取)
   25: 调用munmap函数,将/etc/ld.so.cache文件从内存中去映射,与9行的mmap2函数对应
   26: 对应源码中使用到的唯一的系统调用——open函数,使用其打开/tmp/non_file文件
   27: 子进程结束,退出码为2(源码返回值)
  从之前的源码看来,真正能与源码对应上的只有open这一个系统调用(Line 26),其他系统调用几乎都用于进行进程初始化工作:装载被执行程序、载入libc函数库、设置内存映射等。而且源码中的if语句或其他代码在相应strace输出中并没有体现,因为它们并没有唤起系统调用。所以strace只关心程序与系统之间产生的交互,进而strace不适用于程序逻辑代码的排错和分析。当然对于Linux中几百个系统调用,上面strace输出的几个只是很小一块啦.
  
  下面说说strace一般常用的几点吧:
  首先那就是跟踪进程喽, 默认情况下,strace只跟踪指定的进程,而不对指定进程中新建的子进程进行跟踪。使用-f选项,可对进程中新建的子进程进行跟踪,并在输出结果中打印相应进程PID:
  我们吧strace_use_test.c的代码改为如下:

  /*======================================================================   *   *       Filename:  strace_use_test.c   *        Version:  1.0   *        Created:  2014年07月19日 09时40分49秒   *       Revision:  none   *       Compiler:  clang   *         Author:  sim szm, xianszm007@gmail.com   *        Company:  Class 1107 of Computer Science and Technology   *   * ======================================================================   */    #include <unistd.h>  #include <assert.h>  #include <stdio.h>  #include <stdlib.h>  #include <sys/types.h>  #include <fcntl.h>  #include <sys/stat.h>    int main(int argc, char *argv[])  {          pid_t pid;          pid=fork();          if(pid>0){                  printf("child do_something here !\n");          }else if(pid==0){                  printf("parent area print!\n");          }else{                  perror("fork error !");                  _exit(0);          }          return 0;  }  对应加上 -f 选项跟踪父子进程, dump 信息如下:
  szm@szm-Lenovo:~/some_code$ strace -o strace_use.txt -f ./strace_use   parent area print!  child do_something here !  szm@szm-Lenovo:~/some_code$ cat strace_use.txt   9256  execve("./strace_use", ["./strace_use"], [/* 74 vars */]) = 0  9256  brk(0)                            = 0x8d8d000  9256  uname({sys="Linux", node="szm-Lenovo", ...}) = 0  9256  access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)  9256  mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7799000  9256  access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)  9256  open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3  9256  fstat64(3, {st_mode=S_IFREG|0644, st_size=131876, ...}) = 0  9256  mmap2(NULL, 131876, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7778000  9256  close(3)                          = 0  9256  access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)  9256  open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3  9256  read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000\226\1\0004\0\0\0"..., 512) = 512  9256  fstat64(3, {st_mode=S_IFREG|0755, st_size=1730024, ...}) = 0  9256  mmap2(NULL, 1743580, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb75ce000  9256  mprotect(0xb7771000, 4096, PROT_NONE) = 0  9256  mmap2(0xb7772000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a3) = 0xb7772000  9256  mmap2(0xb7775000, 10972, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7775000  9256  close(3)                          = 0  9256  mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75cd000  9256  set_thread_area({entry_number:-1 -> 6, base_addr:0xb75cd6c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0  9256  mprotect(0xb7772000, 8192, PROT_READ) = 0  9256  mprotect(0x8049000, 4096, PROT_READ) = 0  9256  mprotect(0xb77bb000, 4096, PROT_READ) = 0  9256  munmap(0xb7778000, 131876)        = 0  9256  clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb75cd728) = 9257  9257  fstat64(1,  <unfinished ...>  9256  fstat64(1,  <unfinished ...>  9257  <... fstat64 resumed> {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0  9256  <... fstat64 resumed> {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0  9257  mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>  9256  mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0 <unfinished ...>  9257  <... mmap2 resumed> )             = 0xb7798000  9256  <... mmap2 resumed> )             = 0xb7798000  9257  write(1, "parent area print!\n", 19 <unfinished ...>  9256  write(1, "child do_something here !\n", 26 <unfinished ...>  9257  <... write resumed> )             = 19  9256  <... write resumed> )             = 26  9257  exit_group(0)                     = ?  9256  exit_group(0)                     = ?
9256 clone(child_stack=0,flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb75cd728) = 9257
即9256这个进程内核进行clone()产生9527.

所以很多时候多进程的一些程序strace跟踪时,一般打开 -f 选项, 使用strace对运行中的程序进行跟踪,-p 参数即可,命令执行之后,被跟踪的进程照常执行,strace的其他选项也适用于运行中的进程跟踪。有时间要求的时候,strace还可以记录程序与系统交互时,各个系统调用发生时的时间信息,有r、t、tt、ttt、T等几个选项,它们记录时间的方式为:
  -T: 记录各个系统调用花费的时间,精确到微秒
  -r: 以第一个系统调用(通常为execve)计时,精确到微秒
  -t: 时:分:秒
  -tt: 时:分:秒. 微秒
  -ttt: 计算机纪元以来的秒数. 微秒
  
Strace 还可以用来判断程序挂起的问题,继续修改程序如下:

/*====================================================================== * *       Filename:  strace_use_test.c *        Version:  1.0 *        Created:  2014年07月19日 09时40分49秒 *       Revision:  none *       Compiler:  clang *         Author:  sim szm, xianszm007@gmail.com *        Company:  Class 1107 of Computer Science and Technology * * ====================================================================== */#include <unistd.h>#include <assert.h>#include <stdio.h>#include <stdlib.h>#include <sys/types.h>#include <fcntl.h>#include <sys/stat.h>int main(int argc, char *argv[]){        printf("flag see !\n");        if(strcmp(argv[1],"flag1")==0){                while(1);        }else if(strcmp(argv[1],"flag2")==0){                sleep(400);        }        return 0;}
flag1 可以用来模拟用户态死循环挂死状态,flag2 可以用来模拟内核态挂死状态.因为sleep()是陷入内核的.
看下对应的 strace dump结果:
首先对应的是flag1
szm@szm-Lenovo:~/some_code$ strace ./strace_use flag1execve("./strace_use", ["./strace_use", "flag1"], [/* 74 vars */]) = 0brk(0)                                  = 0x8cff000uname({sys="Linux", node="szm-Lenovo", ...}) = 0access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77a8000access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3fstat64(3, {st_mode=S_IFREG|0644, st_size=131876, ...}) = 0mmap2(NULL, 131876, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7787000close(3)                                = 0access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000\226\1\0004\0\0\0"..., 512) = 512fstat64(3, {st_mode=S_IFREG|0755, st_size=1730024, ...}) = 0mmap2(NULL, 1743580, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb75dd000mprotect(0xb7780000, 4096, PROT_NONE)   = 0mmap2(0xb7781000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a3) = 0xb7781000mmap2(0xb7784000, 10972, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7784000close(3)                                = 0mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75dc000set_thread_area({entry_number:-1 -> 6, base_addr:0xb75dc6c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0mprotect(0xb7781000, 8192, PROT_READ)   = 0mprotect(0x8049000, 4096, PROT_READ)    = 0mprotect(0xb77ca000, 4096, PROT_READ)   = 0munmap(0xb7787000, 131876)              = 0fstat64(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77a7000write(1, "flag see !\n", 11flag see !)            = 11
一直卡在这里循环

再看看flag2:
szm@szm-Lenovo:~/some_code$ strace ./strace_use flag2execve("./strace_use", ["./strace_use", "flag2"], [/* 74 vars */]) = 0brk(0)                                  = 0x83c0000uname({sys="Linux", node="szm-Lenovo", ...}) = 0access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)mmap2(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb777a000access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3fstat64(3, {st_mode=S_IFREG|0644, st_size=131876, ...}) = 0mmap2(NULL, 131876, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7759000close(3)                                = 0access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0000\226\1\0004\0\0\0"..., 512) = 512fstat64(3, {st_mode=S_IFREG|0755, st_size=1730024, ...}) = 0mmap2(NULL, 1743580, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xb75af000mprotect(0xb7752000, 4096, PROT_NONE)   = 0mmap2(0xb7753000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a3) = 0xb7753000mmap2(0xb7756000, 10972, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xb7756000close(3)                                = 0mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb75ae000set_thread_area({entry_number:-1 -> 6, base_addr:0xb75ae6c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0mprotect(0xb7753000, 8192, PROT_READ)   = 0mprotect(0x8049000, 4096, PROT_READ)    = 0mprotect(0xb779c000, 4096, PROT_READ)   = 0munmap(0xb7759000, 131876)              = 0fstat64(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(136, 0), ...}) = 0mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7779000write(1, "flag see !\n", 11flag see !)            = 11rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0nanosleep({400, 0},
陷入sleep

  从输出的跟踪信息可以看出两中情况的不同, 当用户态挂死情况下,strace在write(由调用的printf产生write进描述符1,也就是标准输出) 一行输出之后没有其他系统调用输出;进程在内核态挂死,最后一行的系统调用nanosleep不能完整显示 (这里nanosleep没有返回值表示该调用尚未完成 ) 所以我们就可以知道:使用strace跟踪挂死程序,如果最后一行系统调用显示完整,程序在逻辑代码处挂死;如果最后一行系统调用显示不完整,程序在该系统调用处挂死。

  当程序挂死在系统调用处,我们可以查看相应系统调用的man手册,了解在什么情况下该系统调用会出现挂死情况。
  
  当然strace 还有其他一堆参数,就不说了,其实我也没看,哈哈.常用跟踪进程信息用法的就是strace -o output.txt -T -tt -e trace=all -f -p pid (-e trace=all 意思是跟踪所有系统调用)

Ok, 周末一上午写篇Blog 总比睡觉好.Over .

0 0
原创粉丝点击