Learning to Write Code from Google -- Chromium/base -- cpu Source Study and Application
Source: Internet  Editor: 程序博客网  Time: 2024/06/02 02:23
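Chromium is a great and enormous open-source project, and there is a lot in it worth learning from.
Earlier posts in this series:
"Learning to Write Code from Google -- Chromium/base -- stl_util Source Study and Application"
"Learning to Write Code from Google -- Chromium/base -- windows_version Source Study and Application"
Today's post covers the CPU-related code.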
Let's start with this enum:
    enum IntelMicroArchitecture {
      PENTIUM,
      SSE,
      SSE2,
      SSE3,
      SSSE3,
      SSE41,
      SSE42,
      AVX,
      MAX_INTEL_MICRO_ARCHITECTURE
    };
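What is SSE?
SSE (Streaming SIMD Extensions) is an instruction set that Intel introduced in the Pentium III, one year after AMD released 3DNow!. It is the extension that followed MMX, adding the 128-bit XMM registers and single-precision floating-point SIMD instructions.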
SSE2
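SSE2 was introduced by Intel in the first version of the Pentium 4; AMD later added SSE2 support to the Opteron and Athlon 64. SSE2 adds support for 64-bit double-precision floating-point numbers, as well as instructions for controlling the CPU cache. AMD's extension adds 8 more XMM registers, but they can only be used after switching to 64-bit mode (AMD64).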
SSE3
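SSE3 is the third-generation SIMD instruction set, introduced by Intel in the Prescott core of the Pentium 4. AMD added SSE3 support in the fifth revision of the Athlon 64, the Venice core. SSE3 also includes instructions (MONITOR and MWAIT) that support thread synchronization under Hyper-Threading.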
SSSE3
SSSE3 is a supplemental extension that Intel made to the SSE3 instruction set. It first shipped in Core 2 Duo processors.
SSE4
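SSE4 began as 47 new multimedia instructions that Intel added with the Penryn-core Core 2 Duo and Core 2 Solo processors (SSE4.1), and it was later extended to SSE4.2; these are the SSE41 and SSE42 values in the enum above. AMD developed its own SSE4a instruction set and built it into K10-architecture processors such as Phenom and Opteron, but SSE4a is not compatible with Intel's SSE4 family.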
SSE5
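SSE5 was proposed by AMD to break Intel's dominance over processor instruction sets. The initial plan called for more than 100 new instructions, the most notable being three-operand instructions (3-Operand Instructions) and fused multiply-accumulate (Fused Multiply Accumulate).
Three-operand instructions let the processor apply a mathematical or logical function to more operands at once. With the operand count raised from two to three, a single x86 instruction can work on two or three pieces of data, so SSE5 can fold several simple instructions into one and process instructions more efficiently. At the time, three-operand capability was found on only a few RISC architectures.
Fused multiply-accumulate combines a multiplication and an addition in a single instruction, so repeated multiply-add calculations need fewer instructions. By simplifying the code, it lets performance-intensive work such as graphics shading, fast photo rendering, spatial audio, and complex vector math run faster.
SSE5 was originally planned to ship with AMD's next-generation Bulldozer core. AMD later reworked the proposal into the XOP and FMA4 extensions, which are what Bulldozer actually shipped with.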
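To make the fused multiply-accumulate idea concrete, here is a minimal sketch in standard C++. It uses std::fma from <cmath>, which computes a * b + c with a single rounding; whether the compiler emits a hardware FMA instruction depends on the target and compiler flags. DotFma is a hypothetical helper name used only for this illustration, not anything from Chromium.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Dot product accumulated with std::fma.
// Each step computes acc = a[i] * b[i] + acc with one rounding.
// DotFma is a hypothetical name for this sketch.
double DotFma(const std::vector<double>& a, const std::vector<double>& b) {
  double acc = 0.0;
  // Only walk the elements both vectors have.
  const std::size_t n = a.size() < b.size() ? a.size() : b.size();
  for (std::size_t i = 0; i < n; ++i) {
    acc = std::fma(a[i], b[i], acc);
  }
  return acc;
}
```

Calling DotFma on {1, 2, 3} and {4, 5, 6} accumulates 4, then 14, then 32.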
AVX
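AVX is Intel's extension of the SSE architecture. Much as the move from IA-16 to IA-32 widened the general-purpose registers, AVX widens the 128-bit XMM registers to 256-bit YMM registers, which can double the amount of data processed per instruction. It also supports three-operand instructions (3-Operand Instructions), reducing the need to copy a register before operating on it. In the instruction encoding, the new prefix for these instructions reuses the opcode bytes of the rarely used LES and LDS instructions.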
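Because the wider YMM registers add state that the operating system must save and restore, a CPU feature bit alone does not mean AVX instructions are safe to execute. The sketch below shows one way to express that check on plain integers: it takes the ECX value from CPUID leaf 1 and the XCR0 value as arguments instead of reading them from hardware. AvxUsable and the constant names are hypothetical, chosen for this illustration.

```cpp
#include <cstdint>

// Bit positions in ECX returned by CPUID leaf 1.
constexpr std::uint32_t kXsaveBit = 1u << 26;    // CPU supports XSAVE
constexpr std::uint32_t kOsxsaveBit = 1u << 27;  // OS has enabled XSAVE
constexpr std::uint32_t kAvxBit = 1u << 28;      // CPU supports AVX

// XCR0 bits 1 and 2: the OS saves XMM state and YMM state.
constexpr std::uint64_t kXcr0SseAvxMask = 0x6;

// Returns true only when the CPU has AVX, XSAVE is supported and enabled,
// and the OS saves both XMM and YMM registers.
// AvxUsable is a hypothetical name for this sketch.
bool AvxUsable(std::uint32_t leaf1_ecx, std::uint64_t xcr0) {
  const std::uint32_t required = kXsaveBit | kOsxsaveBit | kAvxBit;
  return (leaf1_ecx & required) == required &&
         (xcr0 & kXcr0SseAvxMask) == kXcr0SseAvxMask;
}
```

Passing the values in as arguments keeps the logic testable on any machine; real code would obtain them from CPUID and XGETBV.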
cpu.h
Because this class is fairly short, the entire header file is reproduced here:
    #ifndef BASE_CPU_H_
    #define BASE_CPU_H_

    #include <string>

    #include "base/base_export.h"

    namespace base {

    // Query information about the processor.
    class BASE_EXPORT CPU {
     public:
      // Constructor
      CPU();

      enum IntelMicroArchitecture {
        PENTIUM,
        SSE,
        SSE2,
        SSE3,
        SSSE3,
        SSE41,
        SSE42,
        AVX,
        MAX_INTEL_MICRO_ARCHITECTURE
      };

      // Accessors for CPU information.
      const std::string& vendor_name() const { return cpu_vendor_; }
      int signature() const { return signature_; }
      int stepping() const { return stepping_; }
      int model() const { return model_; }
      int family() const { return family_; }
      int type() const { return type_; }
      int extended_model() const { return ext_model_; }
      int extended_family() const { return ext_family_; }
      bool has_mmx() const { return has_mmx_; }
      bool has_sse() const { return has_sse_; }
      bool has_sse2() const { return has_sse2_; }
      bool has_sse3() const { return has_sse3_; }
      bool has_ssse3() const { return has_ssse3_; }
      bool has_sse41() const { return has_sse41_; }
      bool has_sse42() const { return has_sse42_; }
      bool has_avx() const { return has_avx_; }
      // has_avx_hardware returns true when AVX is present in the CPU. This might
      // differ from the value of |has_avx()| because |has_avx()| also tests for
      // operating system support needed to actually call AVX instructions.
      // Note: you should never need to call this function. It was added in order
      // to workaround a bug in NSS but |has_avx()| is what you want.
      bool has_avx_hardware() const { return has_avx_hardware_; }
      bool has_aesni() const { return has_aesni_; }
      bool has_non_stop_time_stamp_counter() const {
        return has_non_stop_time_stamp_counter_;
      }
      // has_broken_neon is only valid on ARM chips. If true, it indicates that we
      // believe that the NEON unit on the current CPU is flawed and cannot execute
      // some code. See https://code.google.com/p/chromium/issues/detail?id=341598
      bool has_broken_neon() const { return has_broken_neon_; }

      IntelMicroArchitecture GetIntelMicroArchitecture() const;
      const std::string& cpu_brand() const { return cpu_brand_; }

     private:
      // Query the processor for CPUID information.
      void Initialize();

      int signature_;  // raw form of type, family, model, and stepping
      int type_;       // processor type
      int family_;     // family of the processor
      int model_;      // model of processor
      int stepping_;   // processor revision number
      int ext_model_;
      int ext_family_;
      bool has_mmx_;
      bool has_sse_;
      bool has_sse2_;
      bool has_sse3_;
      bool has_ssse3_;
      bool has_sse41_;
      bool has_sse42_;
      bool has_avx_;
      bool has_avx_hardware_;
      bool has_aesni_;
      bool has_non_stop_time_stamp_counter_;
      bool has_broken_neon_;
      std::string cpu_vendor_;
      std::string cpu_brand_;
    };

    }  // namespace base

    #endif  // BASE_CPU_H_
Implementation of Initialize
    void CPU::Initialize() {
    #if defined(ARCH_CPU_X86_FAMILY)
      int cpu_info[4] = {-1};
      char cpu_string[48];

      // __cpuid with an InfoType argument of 0 returns the number of
      // valid Ids in CPUInfo[0] and the CPU identification string in
      // the other three array elements. The CPU identification string is
      // not in linear order. The code below arranges the information
      // in a human readable form. The human readable order is CPUInfo[1] |
      // CPUInfo[3] | CPUInfo[2]. CPUInfo[2] and CPUInfo[3] are swapped
      // before using memcpy to copy these three array elements to cpu_string.
      __cpuid(cpu_info, 0);
      int num_ids = cpu_info[0];
      std::swap(cpu_info[2], cpu_info[3]);
      memcpy(cpu_string, &cpu_info[1], 3 * sizeof(cpu_info[1]));
      cpu_vendor_.assign(cpu_string, 3 * sizeof(cpu_info[1]));

      // Interpret CPU feature information.
      if (num_ids > 0) {
        __cpuid(cpu_info, 1);
        signature_ = cpu_info[0];
        stepping_ = cpu_info[0] & 0xf;
        model_ = ((cpu_info[0] >> 4) & 0xf) + ((cpu_info[0] >> 12) & 0xf0);
        family_ = (cpu_info[0] >> 8) & 0xf;
        type_ = (cpu_info[0] >> 12) & 0x3;
        ext_model_ = (cpu_info[0] >> 16) & 0xf;
        ext_family_ = (cpu_info[0] >> 20) & 0xff;
        has_mmx_ = (cpu_info[3] & 0x00800000) != 0;
        has_sse_ = (cpu_info[3] & 0x02000000) != 0;
        has_sse2_ = (cpu_info[3] & 0x04000000) != 0;
        has_sse3_ = (cpu_info[2] & 0x00000001) != 0;
        has_ssse3_ = (cpu_info[2] & 0x00000200) != 0;
        has_sse41_ = (cpu_info[2] & 0x00080000) != 0;
        has_sse42_ = (cpu_info[2] & 0x00100000) != 0;
        has_avx_hardware_ = (cpu_info[2] & 0x10000000) != 0;
        // AVX instructions will generate an illegal instruction exception unless
        //   a) they are supported by the CPU,
        //   b) XSAVE is supported by the CPU and
        //   c) XSAVE is enabled by the kernel.
        // See http://software.intel.com/en-us/blogs/2011/04/14/is-avx-enabled
        //
        // In addition, we have observed some crashes with the xgetbv instruction
        // even after following Intel's example code. (See crbug.com/375968.)
        // Because of that, we also test the XSAVE bit because its description in
        // the CPUID documentation suggests that it signals xgetbv support.
        has_avx_ =
            has_avx_hardware_ &&
            (cpu_info[2] & 0x04000000) != 0 /* XSAVE */ &&
            (cpu_info[2] & 0x08000000) != 0 /* OSXSAVE */ &&
            (_xgetbv(0) & 6) == 6 /* XSAVE enabled by kernel */;
        has_aesni_ = (cpu_info[2] & 0x02000000) != 0;
      }

      // Get the brand string of the cpu.
      __cpuid(cpu_info, 0x80000000);
      const int parameter_end = 0x80000004;
      int max_parameter = cpu_info[0];

      if (cpu_info[0] >= parameter_end) {
        char* cpu_string_ptr = cpu_string;

        for (int parameter = 0x80000002;
             parameter <= parameter_end &&
                 cpu_string_ptr < &cpu_string[sizeof(cpu_string)];
             parameter++) {
          __cpuid(cpu_info, parameter);
          memcpy(cpu_string_ptr, cpu_info, sizeof(cpu_info));
          cpu_string_ptr += sizeof(cpu_info);
        }
        cpu_brand_.assign(cpu_string, cpu_string_ptr - cpu_string);
      }

      const int parameter_containing_non_stop_time_stamp_counter = 0x80000007;
      if (max_parameter >= parameter_containing_non_stop_time_stamp_counter) {
        __cpuid(cpu_info, parameter_containing_non_stop_time_stamp_counter);
        has_non_stop_time_stamp_counter_ = (cpu_info[3] & (1 << 8)) != 0;
      }
    #elif defined(ARCH_CPU_ARM_FAMILY) && (defined(OS_ANDROID) || defined(OS_LINUX))
      cpu_brand_.assign(g_lazy_cpuinfo.Get().brand());
      has_broken_neon_ = g_lazy_cpuinfo.Get().has_broken_neon();
    #endif
    }

    CPU::IntelMicroArchitecture CPU::GetIntelMicroArchitecture() const {
      if (has_avx()) return AVX;
      if (has_sse42()) return SSE42;
      if (has_sse41()) return SSE41;
      if (has_ssse3()) return SSSE3;
      if (has_sse3()) return SSE3;
      if (has_sse2()) return SSE2;
      if (has_sse()) return SSE;
      return PENTIUM;
    }
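The bit manipulation in Initialize is easier to follow when it is applied to plain integers instead of live __cpuid output. The following is a minimal sketch under that assumption: it reuses the same masks and shifts as the code above, but takes register values as arguments. VendorFromRegisters, CpuSignature, and DecodeSignature are hypothetical names introduced for illustration and are not part of Chromium; the vendor helper also assumes a little-endian host, which holds on x86.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Builds the 12-character vendor string from CPUID leaf 0 registers.
// The readable byte order is EBX, EDX, ECX, which is why Initialize
// swaps two array elements before copying. Assumes a little-endian host.
std::string VendorFromRegisters(std::uint32_t ebx, std::uint32_t ecx,
                                std::uint32_t edx) {
  const std::uint32_t ordered[3] = {ebx, edx, ecx};
  char text[sizeof(ordered)];
  std::memcpy(text, ordered, sizeof(ordered));
  return std::string(text, sizeof(text));
}

// Fields packed into EAX of CPUID leaf 1 (hypothetical struct for this sketch).
struct CpuSignature {
  int stepping;
  int model;
  int family;
  int type;
  int ext_model;
  int ext_family;
};

// Applies the same masks and shifts as CPU::Initialize.
CpuSignature DecodeSignature(std::uint32_t eax) {
  CpuSignature s;
  s.stepping = static_cast<int>(eax & 0xf);
  // The extended model (bits 16..19) becomes the high nibble of the model.
  s.model = static_cast<int>(((eax >> 4) & 0xf) + ((eax >> 12) & 0xf0));
  s.family = static_cast<int>((eax >> 8) & 0xf);
  s.type = static_cast<int>((eax >> 12) & 0x3);
  s.ext_model = static_cast<int>((eax >> 16) & 0xf);
  s.ext_family = static_cast<int>((eax >> 20) & 0xff);
  return s;
}
```

For the sample value 0x000306D4, DecodeSignature yields stepping 4, family 6, extended model 3, and model 0x3D, because the extended model nibble is shifted into the upper half of the model number.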
The code above relies on __cpuid, so here is a brief introduction to it.
__cpuid
Purpose:
Generates the cpuid instruction available on x86 and x64, which queries the processor for information about the supported features and CPU type.
Prototype:
    void __cpuid(int CPUInfo[4], int InfoType);
For more details, see:
https://msdn.microsoft.com/en-us/library/hskdteyh(VS.80).aspx
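The related intrinsic __cpuidex takes one more argument. Its InfoType parameter is the value passed in eax to the CPUID instruction, that is, the function ID. Its ECXValue parameter is the value passed in ecx, that is, the sub-function ID. The CPUInfo parameter receives the four output registers eax, ebx, ecx, and edx.
Use conditional compilation on _MSC_VER to check whether the Visual C++ compiler supports these intrinsic functions.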
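As a rough sketch of that conditional compilation, the wrapper below picks a CPUID path at compile time: the MSVC __cpuidex intrinsic from <intrin.h> when _MSC_VER is defined on x86, the __cpuid_count macro from <cpuid.h> on GCC or Clang targeting x86, and a stub elsewhere. QueryCpuid and SKETCH_HAVE_CPUID are hypothetical names for this example. The sketch does not check for a minimum _MSC_VER value, so very old Visual C++ versions that lack __cpuidex would need an extra guard.

```cpp
#include <array>
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>  // __cpuidex
#define SKETCH_HAVE_CPUID 1
#elif (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>  // __cpuid_count
#define SKETCH_HAVE_CPUID 1
#else
#define SKETCH_HAVE_CPUID 0
#endif

// Fills regs with EAX, EBX, ECX, EDX for the given leaf and sub-leaf.
// Returns false (and zeroes regs) when no CPUID path is compiled in.
// QueryCpuid is a hypothetical name for this sketch.
bool QueryCpuid(std::uint32_t leaf, std::uint32_t subleaf,
                std::array<std::uint32_t, 4>* regs) {
#if SKETCH_HAVE_CPUID && defined(_MSC_VER)
  int info[4] = {0, 0, 0, 0};
  __cpuidex(info, static_cast<int>(leaf), static_cast<int>(subleaf));
  for (int i = 0; i < 4; ++i) {
    (*regs)[i] = static_cast<std::uint32_t>(info[i]);
  }
  return true;
#elif SKETCH_HAVE_CPUID
  unsigned int a = 0, b = 0, c = 0, d = 0;
  __cpuid_count(leaf, subleaf, a, b, c, d);
  (*regs)[0] = a;
  (*regs)[1] = b;
  (*regs)[2] = c;
  (*regs)[3] = d;
  return true;
#else
  (void)leaf;
  (void)subleaf;
  regs->fill(0);
  return false;
#endif
}
```

A caller would request leaf 0 first and read the highest supported leaf from the first register before asking for anything else, as Initialize does with num_ids.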
Usage
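    #include <cstdlib>
    #include <iostream>

    #include "base/cpu.h"

    int main(int argc, char* argv[]) {
      // A stack object avoids leaking the instance that `new` would allocate.
      base::CPU cpu;
      std::cout << cpu.cpu_brand() << std::endl;
      system("pause");  // Windows only: keeps the console window open.
      return 0;
    }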
Output:
Intel(R) Core(TM) i7-5500U CPU @ 2.40GHz