Linux perf: Why is the sampling frequency set to 99 Hz instead of 100 Hz?

When sampling with perf, we usually set the frequency to 99 rather than a round number such as 100, for example:

# sudo perf record -F 99 -a -g -- sleep 20

[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.560 MB perf.data (~24472 samples) ]

Options are:

  • -F 99: sample at 99 Hertz (samples per second). I'll sometimes sample faster than this (up to 999 Hertz), but that also costs overhead. 99 Hertz should be negligible. Also, the value '99' and not '100' is to avoid lockstep sampling, which can produce skewed results.

According to the explanation above, setting the frequency to 100 would cause lockstep sampling. (The passage above is quoted from:

http://www.brendangregg.com/blog/2014-06-22/perf-cpu-sample.html)

So what is lockstep sampling? Let's look at https://stackoverflow.com/questions/45470758/what-is-lockstep-sampling

The answer there says:

Lockstep sampling is when the profiling samples occur at the same frequency as a loop in the application. The result of this would be that the sample often occurs at the same place in the loop, so it will think that that operation is the most common operation, and a likely bottleneck.

An analogy would be if you were trying to determine whether a road experiences congestion, and you sample it every 24 hours. That sample is likely to be in lock-step with traffic variation; if it's at 8am or 5pm, it will coincide with rush hour and conclude that the road is extremely busy; if it's at 3am it will conclude that there's practically no traffic at all.

For sampling to be accurate, it needs to avoid this. Ideally, the samples should be much more frequent than any cycles in the application, or at random intervals, so that the chance it occurs in any particular operation is proportional to the amount of time that operation takes. But this is often not feasible, so the next best thing is to use a sampling rate that doesn't coincide with the likely frequency of program cycles. If there are enough cycles in the program, this should ensure that the samples take place at many different offsets from the beginning of each cycle.
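To make the effect concrete, here is a minimal simulation sketch. It is not how perf works internally. It assumes a hypothetical workload whose loop repeats exactly 100 times per second and spends the first 20% of each iteration in one operation; the function name `sampled_share` and all parameters are made up for this illustration. Exact fractions are used so that rounding does not blur the result.

```python
from fractions import Fraction


def sampled_share(sample_hz, loop_hz=100, busy_fraction=Fraction(1, 5), seconds=20):
    """Share of samples that land in the first `busy_fraction` of a loop iteration.

    Hypothetical model: the loop repeats `loop_hz` times per second and the
    operation of interest occupies the first `busy_fraction` of every iteration.
    """
    loop_period = Fraction(1, loop_hz)
    total = sample_hz * seconds
    hits = 0
    for k in range(total):
        t = Fraction(k, sample_hz)                # time of the k-th sample, in seconds
        offset = (t % loop_period) / loop_period  # position inside the loop, 0 <= offset < 1
        if offset < busy_fraction:
            hits += 1
    return Fraction(hits, total)


print(float(sampled_share(100)))  # 1.0: every sample hits the same spot in the loop
print(float(sampled_share(99)))   # about 0.202, close to the true 20% share
```

In this toy model, sampling at 100 Hz reports that the operation takes all of the time, while sampling at 99 Hz drifts through the loop and recovers roughly the real 20%.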

To relate this to the above analogy, sampling every 23 hours or at random times each day will cause the samples to eventually encounter all times of the day; every 23-day cycle of samples will include all hours of the day. This produces a more complete picture of the traffic levels. And sampling every hour would provide a complete picture in just a few weeks.
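The arithmetic behind the 23-hour example can be checked with a few lines of Python. This is only an illustrative sketch; the helper name `hours_seen` is invented here. It records which hour of the day each sample falls on when samples are taken at a fixed interval, starting at hour 0.

```python
from math import gcd


def hours_seen(interval_hours, num_samples):
    """Set of hours of the day (0-23) hit when sampling every `interval_hours`."""
    return {(k * interval_hours) % 24 for k in range(num_samples)}


print(sorted(hours_seen(24, 24)))  # [0]: always the same hour of the day
print(len(hours_seen(23, 24)))     # 24: every hour of the day is covered

# The number of distinct hours reached is 24 // gcd(interval, 24),
# so an interval that shares no factor with 24 eventually covers the whole day.
for interval in (24, 23, 18):
    assert len(hours_seen(interval, 24)) == 24 // gcd(interval, 24)
```

With a 23-hour interval, 24 consecutive samples spaced 23 hours apart make up one 23-day cycle (24 × 23 hours = 23 days) and touch every hour once. The same reasoning is why 99 Hz drifts across a 100 Hz loop instead of locking onto one offset.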

I'm not sure why odd-numbered frequencies are likely to ensure this. It seems to be based on an assumption that there are natural frequencies for program operations, and these are even.


This concept is very helpful when doing performance analysis based on timed sampling.
