RDTSCP inaccuracy on Intel i7
In 2010, Gabriele Paoloni of Intel published the white paper "How to Benchmark Code Execution Times on Intel® IA-32 and IA-64 Instruction Set Architectures", which describes precise methods for measuring the clock cycles needed to execute a piece of C code in a Linux environment using RDTSC/RDTSCP. The paper addresses three problems that can harm measurement reliability: the first is the instruction cache, the second is CPU preemption (task scheduling, interrupts, and so on), and the third is out-of-order execution. He resolves them with a combination of kernel functions and CPU instructions, and finally demonstrates an extremely reliable measurement overhead (i.e. timing no instructions at all), with a minimum of 44 cycles and a variance of 2 to 3. That's awesome!
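For readers who have not seen the paper, the out-of-order part of the technique can be sketched roughly as follows. This is a minimal user-space approximation, assuming x86-64 and GCC/Clang inline assembly; the helper names `tsc_begin`, `tsc_end` and `measure_empty` are made up for illustration, and the sketch leaves out the kernel-side steps (disabling preemption and interrupts) that the paper relies on, so it should not be expected to reproduce the paper's numbers.

```c
#include <stdint.h>

/* Start of the timed region: CPUID serializes the pipeline so earlier
 * instructions cannot drift into the region, then RDTSC reads the counter. */
static inline uint64_t tsc_begin(void)
{
    uint32_t hi, lo;
    __asm__ volatile(
        "xor %%eax, %%eax\n\t"   /* CPUID leaf 0, chosen arbitrarily */
        "cpuid\n\t"
        "rdtsc\n\t"
        "mov %%edx, %0\n\t"
        "mov %%eax, %1\n\t"
        : "=r"(hi), "=r"(lo)
        :
        : "rax", "rbx", "rcx", "rdx", "cc", "memory");
    return ((uint64_t)hi << 32) | lo;
}

/* End of the timed region: RDTSCP waits for earlier instructions to
 * finish before reading the counter; the trailing CPUID keeps later
 * instructions from starting before the read has completed. */
static inline uint64_t tsc_end(void)
{
    uint32_t hi, lo;
    __asm__ volatile(
        "rdtscp\n\t"
        "mov %%edx, %0\n\t"
        "mov %%eax, %1\n\t"
        "xor %%eax, %%eax\n\t"
        "cpuid\n\t"
        : "=r"(hi), "=r"(lo)
        :
        : "rax", "rbx", "rcx", "rdx", "cc", "memory");
    return ((uint64_t)hi << 32) | lo;
}

/* Time an empty region: the result is the overhead of the measurement
 * itself, in TSC ticks. */
static inline uint64_t measure_empty(void)
{
    uint64_t start = tsc_begin();
    uint64_t end = tsc_end();
    return end - start;
}
```

Calling `measure_empty()` repeatedly and collecting the returned deltas gives the raw samples that the minimum and variance figures are computed from.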
Since Gabriele publishes his source code in the paper, it was straightforward for me to replicate his experiment, but I couldn't get comparable results. The experiment consists of two nested loops: the inner loop times an empty code sequence 100K times, and the outer loop repeats the inner loop 1K times. However, even with identical source code, I couldn't get a stable result set on my Intel i7 workstation. In one trial, the minimum cycle count varied from 40 to 122, and the variance ranged between 138 and 15326.
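To make the reported figures concrete, here is a small sketch of how the minimum and variance for one pass of the inner loop could be computed from the collected cycle deltas. The function name `summarize_pass` is hypothetical, and the use of the population variance (dividing by n) is an assumption on my part rather than something confirmed against the paper's code.

```c
#include <stddef.h>
#include <stdint.h>

/* Summarize one inner-loop pass: the smallest sample and the
 * population variance of all samples. Returns 0 on success and
 * -1 if there are no samples. */
static int summarize_pass(const uint64_t *samples, size_t n,
                          uint64_t *min_out, double *var_out)
{
    if (samples == NULL || n == 0)
        return -1;

    /* First pass: minimum and mean. */
    uint64_t min = samples[0];
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (samples[i] < min)
            min = samples[i];
        sum += (double)samples[i];
    }
    double mean = sum / (double)n;

    /* Second pass: mean of squared deviations from the mean. */
    double sq = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = (double)samples[i] - mean;
        sq += d * d;
    }

    *min_out = min;
    *var_out = sq / (double)n;
    return 0;
}
```

In the experiment described above, each of the 1K outer iterations would pass its 100K samples to a function like this, which produces 1K (minimum, variance) pairs. A few outliers in a pass, for example from an interrupt landing inside the timed region, are enough to push that pass's variance into the thousands while its minimum stays low.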
The CPU in my machine is an Intel Core i7-4779K @ 3.50GHz. Could the i7 have introduced new features that disturb RDTSCP? Does anybody have any ideas?