Optimizing std::map<std::string, X> used as a cache (my own C++ utilities series, part 1)



朱元

6 months ago

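A process often caches some function results locally. The usual approach is to define a std::map<std::string, X> cache, have the function store its result in the map just before returning, and then serve the cached result directly for the next few minutes. The data is not large; it is just expensive to compute (it may have to be fetched from a configuration database or an external service), so a shared in-memory cache such as memcached or Varnish would be overkill. Back when the standard library had no unordered_map, and for anyone who did not use hash_map, I saw piles and piles of code like this.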
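The caching pattern described above might look roughly like the following. This is only a minimal sketch: the names LocalCache, CacheEntry and expensive_lookup are hypothetical, the expiry is a fixed time-to-live, and there is no locking.

```cpp
#include <chrono>
#include <map>
#include <string>

// Hypothetical cached value: the payload plus the time it was computed.
struct CacheEntry {
    std::string value;
    std::chrono::steady_clock::time_point stored_at;
};

// Hypothetical stand-in for the expensive call (config database, remote service).
inline std::string expensive_lookup(const std::string& key) {
    return "value-for-" + key;
}

class LocalCache {
public:
    explicit LocalCache(std::chrono::seconds ttl) : ttl_(ttl), hits_(0) {}

    // Returns the cached value while it is fresh; otherwise recomputes and stores it.
    std::string get(const std::string& key) {
        const std::chrono::steady_clock::time_point now = std::chrono::steady_clock::now();
        std::map<std::string, CacheEntry>::iterator it = cache_.find(key);
        if (it != cache_.end() && now - it->second.stored_at < ttl_) {
            ++hits_;
            return it->second.value;
        }
        CacheEntry entry = { expensive_lookup(key), now };
        cache_[key] = entry;
        return entry.value;
    }

    // Number of calls answered from the map without recomputing.
    int hits() const { return hits_; }

private:
    std::chrono::seconds ttl_;
    std::map<std::string, CacheEntry> cache_;
    int hits_;
};
```

With LocalCache cache(std::chrono::seconds(300)), the first cache.get("db.host") computes the value and later calls within five minutes come straight from the map.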

But is the map here really "efficient"?

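One point first: "efficient" here does not mean compared with a hash table. It means whether the map's semantics, as implemented, are efficient for this business scenario.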

My answer is "no".

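1. Defining the total order with strcmp semantics is wasteful. The user needs neither sorting nor in-order traversal, only fast insertion, fast lookup, and the occasional deletion.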

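std::map has 4 template parameters, and the third type parameter defines the comparison predicate, so we can simply implement a new comparison there that is better suited to string keys.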
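As a small illustration of plugging a predicate into that third parameter, here is a sketch with a hypothetical comparator, length_first_less, that orders by length and falls back to the ordinary lexicographic comparison. It is not the comparator this article builds, only the mechanism.

```cpp
#include <map>
#include <string>

// Hypothetical comparator: shorter strings sort first; strings of equal
// length fall back to the ordinary lexicographic comparison.
struct length_first_less {
    bool operator()(const std::string& a, const std::string& b) const {
        if (a.size() != b.size()) return a.size() < b.size();
        return a < b;
    }
};

// The comparator is the third template argument of std::map.
typedef std::map<std::string, int, length_first_less> length_ordered_map;

// Builds a tiny map; iteration order follows the comparator, not strcmp order.
inline length_ordered_map make_example() {
    length_ordered_map m;
    m["banana"] = 1;
    m["fig"] = 2;
    m["apple"] = 3;
    return m;
}
```

Iterating the map returned by make_example() visits "fig", "apple", "banana", while find and operator[] behave exactly as they do with the default predicate.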

Implementation:

1. Compare lengths first: the longer string is greater and the shorter one is smaller. This is consistent with a total order.

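2. Then treat the contents of the std::string as a sequence of unsigned integers and compare those. This is safe because memory obtained from malloc/new is always aligned to the ABI's maximum alignment (do take note of the objections that have been raised about this). The string implementation that ships with GCC (libstdc++) keeps the length and capacity information in a header in front of the character data, which means the character data is still aligned to at least pointer size. The MSVC implementation is aligned even more strictly.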

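3. If the length (both strings have the same length at this point) is not a multiple of the unsigned integer size, handle the part that divides evenly and the leftover tail separately. Mask off the unneeded bits of the tail before comparing it. Taken together, steps 2 and 3 still form a total order among strings of any given length.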
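The arithmetic in step 3 can be sketched as follows, assuming an 8-byte comparison word and a little-endian machine. The helper names length_split, split_length and tail_mask are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Number of whole 8-byte words in a string and the leftover tail bytes.
struct length_split {
    std::size_t words;
    std::size_t tail;
};

inline length_split split_length(std::size_t length) {
    length_split s = { length / 8, length % 8 };
    return s;
}

// Little-endian mask that keeps the low `tail_bytes` bytes of a 64-bit word.
// Assumes 0 < tail_bytes < 8. The shift is done on a 64-bit value: shifting a
// 32-bit 1u by 32 bits or more would be undefined behaviour.
inline std::uint64_t tail_mask(std::size_t tail_bytes) {
    return (static_cast<std::uint64_t>(1) << (tail_bytes * 8)) - 1;
}
```

For a 19-byte key this gives two whole words plus a 3-byte tail, and the tail word is compared after masking it with 0xFFFFFF.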

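Here is code that compares from high addresses to low addresses (on big-endian systems one bit operation must be changed; see the comment. This is personal code, after all, and was not written to be portable across platforms without modification.)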

    #include <string>

    // MIT licensed.
    struct string_less {
        // __m128i could be used for 128-bit SSE compares on x86 and x64,
        // but it is not used here, for convenience.
    #ifdef __x86_64__
        typedef unsigned long long comp_type;
    #else
        typedef unsigned int comp_type;
    #endif

        bool operator()(const std::string& x, const std::string& y) const {
            if (x.length() != y.length()) {
                return x.length() < y.length();  // length() is inlined, so this is cheap
            }
            const std::string::size_type word_size = sizeof(comp_type);
            const std::string::size_type length = x.length();
            // Number of bytes covered by whole words, and the number of whole words.
            const std::string::size_type cmid = length & ~(word_size - 1);
            std::string::size_type mid = cmid / word_size;
            const comp_type* px = reinterpret_cast<const comp_type*>(x.data());
            const comp_type* py = reinterpret_cast<const comp_type*>(y.data());
            if (cmid != length) {
                // Compare the tail first, with the bytes past the end masked off.
                const unsigned int comp_length_bits = (length & (word_size - 1)) << 3;
                // For little endian. The 1 must have type comp_type: with 1u the
                // shift is undefined for tails of 4 to 7 bytes on x64.
                const comp_type mask_pattern =
                    (static_cast<comp_type>(1) << comp_length_bits) - 1;
                // For big endian:
                // const comp_type mask_pattern =
                //     static_cast<comp_type>(-1) << ((word_size << 3) - comp_length_bits);
                const comp_type r1 = px[mid] & mask_pattern;
                const comp_type r2 = py[mid] & mask_pattern;
                if (r1 != r2) {
                    return r1 < r2;
                }
            }
            // Then the whole words, from the highest address down to the lowest.
            while (mid-- != 0) {
                const comp_type r1 = px[mid];
                const comp_type r2 = py[mid];
                if (r1 != r2) {
                    return r1 < r2;
                }
            }
            return false;  // the strings are equal
        }
    };

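Most of the keys I have had to cache are rarely longer than 20 bytes (so this approach gives up almost nothing relative to SSE, rep-prefixed instructions, or loop unrolling), and they often share a common prefix (which hurts std::string's lexicographic comparison badly. If they share a common suffix instead? Then, as an exercise, reimplement the code above to compare from low addresses to high).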

At least in my workload (many keys, usually short, x64), std::map<std::string, std::string, string_less> is at least twice as fast as std::map<std::string, std::string>.
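A claim like this is easy to check on your own keys. The following is a rough timing harness, not the benchmark behind the figure above; make_keys, lookup_result and time_lookups are hypothetical names, and the "cfg.item." prefix is only an example of short keys with a shared prefix.

```cpp
#include <chrono>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical key generator: short keys that share a common prefix.
inline std::vector<std::string> make_keys(std::size_t count) {
    std::vector<std::string> keys;
    keys.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        keys.push_back("cfg.item." + std::to_string(i));
    }
    return keys;
}

struct lookup_result {
    std::size_t found;    // number of successful lookups
    double milliseconds;  // wall-clock time spent in the lookup loop only
};

// Fills a map of type Map with the keys, then times `rounds` passes of lookups.
template <typename Map>
lookup_result time_lookups(const std::vector<std::string>& keys, int rounds) {
    Map m;
    for (std::size_t i = 0; i < keys.size(); ++i) m[keys[i]] = keys[i];
    std::size_t found = 0;
    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    for (int r = 0; r < rounds; ++r) {
        for (std::size_t i = 0; i < keys.size(); ++i) found += m.count(keys[i]);
    }
    const std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();
    lookup_result res = {
        found, std::chrono::duration<double, std::milli>(stop - start).count()
    };
    return res;
}
```

Calling time_lookups once with std::map<std::string, std::string> and once with std::map<std::string, std::string, string_less> on the same keys gives two timings to compare. Build with optimization enabled and repeat the runs, since a single measurement is noisy.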
