Using gprof

Source: Internet. Published by: mac队装备. Editor: 程序博客网. Time: 2024/05/18 00:55

Reference: http://blog.sina.com.cn/s/blog_6608391701013phr.html

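Introduction

GNU gprof is a profiling tool for programs on Linux. It can display a "flat profile" of a run, showing how many times each function was called and how much processor time each function consumed. It can also display a "call graph", showing which functions called which and how much time each of those calls took. Finally, it can produce an "annotated source" listing: a copy of the program's source code marked with the number of times each line was executed.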

How gprof works

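As with gdb, the program to be analyzed has to be built specially for gprof: it must be compiled with the "-pg" option. If a module of the program is compiled without "-pg", its functions are left out of the call statistics: gprof can still report the time sampled in them, but not how many times they were called or by whom. To profile library functions, link with "-lc_p" instead of "-lc". (gprof is a standard tool on many UNIX-like systems, and such systems have traditionally shipped two builds of the C library that differ only in whether they were compiled with "-pg"; "-lc_p" tells the linker to pick the build compiled with "-pg".)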

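With "-pg", the program's startup code calls monstartup() before main(), mainly to allocate the memory that will hold the statistics collected later. Every instrumented function calls mcount(), which looks in the function's stack frame for the address of the parent (the caller) and of the child (the function just entered) and records them. Just before the program exits, _mcleanup() writes the results to gmon.out and cleans up.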

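The running time of each function is measured by sampling: at regular intervals the profiling runtime checks which function's address range the program counter points into, and the results are stored as a histogram.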
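The sampling idea can be sketched in a few lines of C. This is a simplified illustration, not gprof's actual implementation: the struct pc_histogram and both function names are hypothetical, and a real runtime would call the recording function from a periodic timer rather than from ordinary code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical histogram over a program's code addresses. */
typedef struct {
    uintptr_t low_pc;      /* lowest address covered */
    uintptr_t high_pc;     /* one past the highest address covered */
    size_t    bucket_size; /* bytes of code per bucket */
    unsigned *counts;      /* one counter per bucket */
    size_t    nbuckets;    /* number of counters in counts */
} pc_histogram;

/* Record one sampled program counter.
   Returns 1 if the sample was counted, 0 if it fell outside the range. */
int record_sample(pc_histogram *h, uintptr_t pc)
{
    assert(h != NULL && h->bucket_size > 0);
    if (pc < h->low_pc || pc >= h->high_pc)
        return 0;
    size_t idx = (size_t)(pc - h->low_pc) / h->bucket_size;
    if (idx >= h->nbuckets)
        return 0;
    h->counts[idx]++;
    return 1;
}

/* Add up the samples of all buckets that start inside [start, end),
   i.e. inside one function's address range. */
unsigned samples_in_range(const pc_histogram *h, uintptr_t start, uintptr_t end)
{
    assert(h != NULL);
    unsigned total = 0;
    for (size_t i = 0; i < h->nbuckets; i++) {
        uintptr_t bucket_pc = h->low_pc + i * h->bucket_size;
        if (bucket_pc >= start && bucket_pc < end)
            total += h->counts[i];
    }
    return total;
}
```

Multiplying the sample count of a function's address range by the sampling period gives an estimate of the time spent in that function itself.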

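Notes:

Some people advise against compiling with "-g", on the grounds that it may affect the results. Debug information is still needed for the annotated source output, which is why the steps below use it.

The sampling period of gprof is usually 0.01 s, and the closer a measured time is to this value, the larger its relative error is likely to be. A function whose run time is below 0.01 s may be reported with a time of 0.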
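To get a feel for the size of that error, here is a small sketch based on the rule of thumb given in the GNU gprof manual: for n samples, the expected error is about the square root of n samples. The function names are made up for this example, the 0.01 s period is the usual value mentioned above (it can differ per system), and the square root uses a hand-written Newton iteration only so that the snippet does not depend on the math library.

```c
#include <assert.h>

#define SAMPLE_PERIOD_SEC 0.01 /* assumed sampling period */

/* Square root by Newton's method; adequate for small sample counts. */
static double sqrt_newton(double x)
{
    if (x <= 0.0)
        return 0.0;
    double r = x > 1.0 ? x : 1.0;
    for (int i = 0; i < 60; i++)
        r = 0.5 * (r + x / r);
    return r;
}

/* Time estimate for a function that was hit by `samples` samples. */
double samples_to_seconds(unsigned samples)
{
    return samples * SAMPLE_PERIOD_SEC;
}

/* Expected error of that estimate: sqrt(n) samples, in seconds. */
double expected_error_seconds(unsigned samples)
{
    return sqrt_newton((double)samples) * SAMPLE_PERIOD_SEC;
}
```

With 100 samples (1 s of run time) the expected error is about 0.1 s, or 10%. With a single sample (0.01 s) the expected error is as large as the measurement itself.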

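Output formats

1) flat profile

How much time each function took, and how many times each function was called.

2) call graph

For each function: which functions called it, and how many times; which functions it called, and how many times; how much time the function itself took, and how much time the functions it called took.

3) annotated source

A copy of the source code, annotated with the number of times each block was executed.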

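Usage

Steps:

1) Compile the program with "-g -pg", and link it with "-pg".

2) Run the program. When it finishes, it writes the statistics file gmon.out, which is the profile-data-file.
Note: this file is written only when the program terminates normally, that is, by calling exit or by returning from main.

3) Use gprof to view the statistics in gmon.out:

gprof <options> [executable-file] [profile-data-file(s)...] [> outfile]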
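Step 2 is awkward for programs that are normally stopped with Ctrl-C, because a program killed by a signal does not write gmon.out. One common workaround, sketched here under the assumption that the program has a main loop it can leave voluntarily, is to let the signal handler set a flag and have main return normally. The names install_stop_handlers and stop_was_requested are hypothetical.

```c
#include <signal.h>

/* Set by the handler, polled by the main loop. */
static volatile sig_atomic_t stop_requested = 0;

/* Only records the request; the real shutdown happens in main. */
static void on_signal(int signo)
{
    (void)signo;
    stop_requested = 1;
}

/* Install the handler for SIGINT and SIGTERM; returns 0 on success. */
int install_stop_handlers(void)
{
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    if (signal(SIGTERM, on_signal) == SIG_ERR)
        return -1;
    return 0;
}

/* Nonzero once one of the signals has been received. */
int stop_was_requested(void)
{
    return stop_requested != 0;
}
```

In main, call install_stop_handlers() once, loop with while (!stop_was_requested()) around the work, and then return 0; the normal return lets the profiling runtime write gmon.out.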

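Output options:

1) -b or --brief: do not print the explanations of the fields.

2) -A[symspec] or --annotated-source[=symspec]: print the annotated source. If symspec is given, print only the functions it selects; otherwise print all of them.

3) -p[symspec] or --flat-profile[=symspec]: print the flat profile. If symspec is given, include only the functions it selects; otherwise include all of them.

4) -P[symspec] or --no-flat-profile[=symspec]: do not print the flat profile. If symspec is given, print it but exclude the functions that symspec selects.

5) -q[symspec] or --graph[=symspec]: print the call graph. If symspec is given, include only the functions it selects; otherwise include all of them.

6) -Q[symspec] or --no-graph[=symspec]: do not print the call graph. If symspec is given, print it but exclude the functions that symspec selects.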

Analysis options:

1) -m num or --min-count=num: do not show functions that were called fewer than num times.

2) -z or --display-unused-functions: also show functions that were never called.

Example

sum.c:

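#define MAX 10000000

/* Each function only burns CPU time in a loop; the sums are never used.
   Build without optimization so the compiler keeps the loops. */
void f(void)
{
    long long sum = 0;
    long long i = 0;
    for ( i=0; i<MAX; i++ )
        sum += i;
}

/* Does its own loop, then calls f() once. */
void g(void)
{
    long long sum = 0;
    long long i = 0;
    for ( i=0; i<MAX; i++ )
        sum += i;
    f();
}

/* Does its own loop, then calls f() and g(), so f() runs twice in total. */
int main(void)
{
    long long sum = 0;
    long long i = 0;
    for ( i=0; i<MAX; i++ )
        sum += i;
    f();
    g();
    return 0;
}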

$ gcc -g -pg -o ./sum ./sum.c
$ ./sum
$ gprof -b ./sum ./gmon.out

Output:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 50.00      0.06     0.06        2    30.00    30.00  f
 25.00      0.09     0.03        1    30.00    60.00  g
 25.00      0.12     0.03                             main

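% time: the share of the total run time spent in each function itself (not including the functions it calls). This column should add up to 100%.

cumulative seconds: the running total of self seconds for this function and all the functions listed above it. Subtracting the previous row's value from the current row's gives the time spent in the current function.

self seconds: the time spent in the function itself (not including the functions it calls).

calls: the number of times the function was called.

self ms/call: the average time per call spent in the function itself (not including the functions it calls), in milliseconds.

total ms/call: same as above, but including the functions it calls.

name: the function name.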
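The relationship between these columns can be checked with a short calculation. The sketch below recomputes % time, cumulative seconds and self ms/call from the self seconds and calls columns; the struct flat_row and the function fill_flat_columns are hypothetical names for this illustration and are not part of gprof. total ms/call is left out because it needs the call graph data.

```c
#include <assert.h>
#include <stddef.h>

/* One row of a flat profile (hypothetical layout). */
typedef struct {
    const char   *name;
    double        self_seconds;       /* input */
    unsigned long calls;              /* input; 0 if no count was recorded */
    double        percent_time;       /* output */
    double        cumulative_seconds; /* output */
    double        self_ms_per_call;   /* output; 0 if calls == 0 */
} flat_row;

/* Fill the derived columns for rows given in display order. */
void fill_flat_columns(flat_row *rows, size_t n)
{
    assert(rows != NULL || n == 0);

    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += rows[i].self_seconds;

    double running = 0.0;
    for (size_t i = 0; i < n; i++) {
        running += rows[i].self_seconds;
        rows[i].cumulative_seconds = running;
        rows[i].percent_time =
            total > 0.0 ? 100.0 * rows[i].self_seconds / total : 0.0;
        rows[i].self_ms_per_call =
            rows[i].calls > 0
                ? 1000.0 * rows[i].self_seconds / (double)rows[i].calls
                : 0.0;
    }
}
```

Fed with the three rows of the example (f: 0.06 s and 2 calls, g: 0.03 s and 1 call, main: 0.03 s and no call count), this gives 50% and 30 ms/call for f and a cumulative time of 0.09 s at g, matching the table.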

Call graph

granularity: each sample hit covers 4 byte(s) for 7.14% of 0.14 seconds

index % time    self  children    called     name
[1]    100.0    0.03    0.11                 main [1]
                0.04    0.04       1/1           g() [2]
                0.04    0.00       1/2           f() [3]
-----------------------------------------------
                0.04    0.04       1/1           main [1]
[2]     53.6    0.04    0.04       1         g() [2]
                0.04    0.00       1/2           f() [3]
-----------------------------------------------
                0.04    0.00       1/2           g() [2]
                0.04    0.00       1/2           main [1]
[3]     50.0    0.07    0.00       2         f() [3]
-----------------------------------------------

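Every function is assigned an index, and the entries are listed in ascending index order. Each function has one entry, and entries are separated by a dashed line.

Within an entry, the line that starts with [index] is called the primary line. The lines above the primary line are the caller's lines, which list the functions that called this function; the lines below it are the subroutine's lines, which list the functions that this function called. The three kinds of line share the same column names, but the columns mean quite different things in each.

The explanations below all use the second entry as the example: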

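primary line

index % time    self  children    called     name
[2]     53.6    0.04    0.04       1         g() [2]

% time: the share of the total run time spent in g(). It includes the f() that g() calls, so this column does not add up to 100% across entries.

self: same as self seconds in the flat profile.

children: the time spent in f() on behalf of g(). The self and children values of the subroutine's lines below should add up to this value.

called: g() was called only once.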

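subroutine's line

index % time    self  children    called     name
                0.04    0.00       1/2           f() [3]

self: the time spent in f() itself when it was called from g(), 0.04 s.

children: the time spent in the functions called by f() when f() was called from g(), which is 0.

called: f() was called 2 times in total, 1 of them from g().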

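caller's line

index % time    self  children    called     name
                0.04    0.04       1/1           main [1]

self: the time spent in g() itself when it was called from main().

children: the time spent in the functions called by g() when g() was called from main().

called: g() was called 1 time in total, and that 1 call came from main().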
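The per-caller times on the caller's and subroutine's lines are not measured directly. As the GNU gprof manual describes, gprof assumes that every call to a function takes the same average time and splits the function's time among its callers in proportion to the call counts. A minimal sketch of that rule, with a hypothetical function name:

```c
/* Share of a callee's time that is charged to one caller, assuming every
   call to the callee costs the same on average. */
double attributed_seconds(double callee_seconds,
                          unsigned long calls_from_caller,
                          unsigned long total_calls)
{
    if (total_calls == 0)
        return 0.0;
    return callee_seconds * (double)calls_from_caller / (double)total_calls;
}
```

For the example above, f() took 0.07 s over 2 calls, and 1 of those calls came from g(), so attributed_seconds(0.07, 1, 2) is 0.035 s, which is consistent with the 0.04 shown on the f() line inside the entry for g(). When a function is cheap for one caller and expensive for another, this assumption makes the split misleading.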