VTune: Using the amplxe-cl Command Line
Source: Internet  Editor: 程序博客网  Date: 2024/06/08 07:07
Reference
http://software.intel.com/sites/products/documentation/hpc/amplifierxe/en-us/2011Update/lin/ug_docs/index.htm
amplxe-cl -collect hotspots -- ./driver /home/zxx/work_autumn_2011/matrices/rma10.mtx
Reading sparse matrix from file (/home/zxx/work_autumn_2011/matrices/rma10.mtx): done
Using 46835-by-46835 matrix with 2374001 nonzero values
------------------------------------------
#### Testing COO Kernels ####
creating coo_matrix:coo transform time elapsed 0.013690
do coo spmv time elapsed 5.434732 seconds
orignal do coo spmv time elapsed 5.429192 seconds
Using result path `/home/zxx/work_autumn_2011/all_format/r001hs'
Executing actions 75 % Generating a report
Summary
-------
Elapsed Time: 11.312
CPU Time: 11.280
Executing actions 100 % done
amplxe-cl -report hotspots -result-dir r001hs
Using result path `/home/zxx/work_autumn_2011/all_format/r001hs'
Executing actions 75 % Generating a report
Function                                       Module  CPU Time
__spmv_coo_serial_host_sse                     driver  5.420
__spmv_coo_serial_host<unsigned int, double>   driver  5.410
read_coo_matrix<unsigned int, double>          driver  0.350
test_coo_matrix_kernels<unsigned int, double>  driver  0.060
coo_to_csr<unsigned int, double>               driver  0.020
csr_to_coo<unsigned int, double>               driver  0.020
Executing actions 100 % done
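As a quick sanity check, the per-function CPU times in the hotspots report can be summed and compared with the CPU Time in the summary. The following is a minimal sketch, assuming the report is plain text with the CPU time in the last column; `sum_cpu_time` is a hypothetical helper name, not part of amplxe-cl.

```shell
#!/bin/sh
# Hypothetical helper: sum the last column of a hotspots report read from stdin.
# Only lines whose last field is a plain number are counted, so the header
# line and the "Executing actions ..." / "Using result path ..." lines are skipped.
sum_cpu_time() {
    awk '$NF ~ /^[0-9]+(\.[0-9]+)?$/ { total += $NF }
         END { printf "%.3f\n", total }'
}
```

A possible usage is `amplxe-cl -report hotspots -result-dir r001hs | sum_cpu_time`. For the six functions listed above, the values add up to 11.280, which agrees with the CPU Time shown in the summary.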
amplxe-cl -report summary -result-dir r001hs
Using result path `/home/zxx/work_autumn_2011/all_format/r001hs'
Executing actions 75 % Generating a report
Summary
-------
Elapsed Time: 11.312
CPU Time: 11.280
Executing actions 100 % done
This is the same summary that is printed at the end of the collect run.
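The collect-then-report sequence shown above can be wrapped in a small script. This is only a sketch: `run`, `profile_hotspots` and the `DRY_RUN` variable are hypothetical names introduced here, and passing `-result-dir` to the collect step (instead of letting amplxe-cl pick a directory such as r001hs) is an assumption about how one might want to organize results.

```shell
#!/bin/sh
# Hypothetical wrapper around the three commands used above.
# With DRY_RUN=1 the commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: profile_hotspots RESULT_DIR PROGRAM [ARGS...]
profile_hotspots() {
    result_dir=$1
    shift
    # 1. collect hotspots data for the target program
    run amplxe-cl -collect hotspots -result-dir "$result_dir" -- "$@" || return 1
    # 2. per-function hotspots report
    run amplxe-cl -report hotspots -result-dir "$result_dir" || return 1
    # 3. elapsed/CPU time summary
    run amplxe-cl -report summary -result-dir "$result_dir"
}
```

For example, `DRY_RUN=1; profile_hotspots r001hs ./driver rma10.mtx` prints the three amplxe-cl command lines without running anything, which is a cheap way to check the arguments first.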
This example runs the hardware event-based sampling collector for the sample application and displays the default summary report.
$ amplxe-cl -collect-with runsa -knob event-config=CPU_CLK_UNHALTED.CORE,CPU_CLK_UNHALTED.REF,INST_RETIRED.ANY -- /home/test/sample
Commonly used options:
collect
event-config
knob
$ amplxe-cl -collect-with runsa -knob event-config=CPU_CLK_UNHALTED.CORE,CPU_CLK_UNHALTED.REF,INST_RETIRED.ANY -- /home/test/sample
Viewing the report for this kind of collection is a little different:
$ amplxe-cl -report sfdump -result-dir r000rs
Currently, the only way to view the sample-after values is to display the results of a run with the default values using the 'sfdump' report type, e.g.,
sudo amplxe-cl -collect-with runsa -knob event-config=UOPS_EXECUTED.PORT2_CORE:sa=1000,UOPS_EXECUTED.PORT3_CORE:sa=1000,UOPS_EXECUTED.PORT4_CORE:sa=1000 -- ./driver
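In my experience, sa should be at least 1000; otherwise the machine tends to hang.
I tried sa=100 and sa=1, and the machine hung both times.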
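Since small sample-after values hung the machine, it may be worth building the event-config string through a helper that refuses values below 1000. The sketch below is an illustration only: `build_event_config` is a hypothetical name, and the 1000 threshold comes from the experience described above, not from any documented limit.

```shell
#!/bin/sh
# Hypothetical helper: build "EVENT:sa=N,EVENT:sa=N,..." for -knob event-config.
# usage: build_event_config SA EVENT [EVENT...]
build_event_config() {
    sa=$1
    shift
    # Threshold taken from the author's experience, not from Intel documentation.
    if [ "$sa" -lt 1000 ]; then
        echo "sa=$sa is below 1000, refusing to build the config" >&2
        return 1
    fi
    config=""
    for ev in "$@"; do
        # Append a comma only when config is already non-empty.
        config="${config:+$config,}$ev:sa=$sa"
    done
    printf '%s\n' "$config"
}
```

One way to use it: `cfg=$(build_event_config 1000 UOPS_EXECUTED.PORT2_CORE UOPS_EXECUTED.PORT3_CORE UOPS_EXECUTED.PORT4_CORE) && sudo amplxe-cl -collect-with runsa -knob event-config="$cfg" -- ./driver`.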
$ amplxe-cl -report hw-events -r r010runsa/
This report type works well for viewing the results of raw (native) hardware events.
This option enables multiple runs to achieve more precise results for hardware event-based collections.
When disabled, the collector uses event multiplexing.
sudo amplxe-cl -collect-with runsa -knob event-config=UOPS_EXECUTED.PORT2_CORE,UOPS_EXECUTED.PORT3_CORE,UOPS_EXECUTED.PORT4_CORE -- ./driver
After using it, I could not run the collection a second time.
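The measured results are not very accurate, which is frustrating.
I do not know why yet. I really need to study computer architecture and operating systems properly
and track down the cause.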