Cache的设计和实现 LRU Cache

来源：互联网发布：网络骂人的歌曲最火的编辑：程序博客网时间：2024/05/21 10:45

Cache的设计和实现 LRU Cache

Cache的应用非常广泛，其实Cache是一种思想，一个广义的词汇，一种在性能和金钱的权衡上的思想。

Cache的思想用在很多地方，使用的载体也不同，都是位于相对高速设备和相对低速设备之间，起到缓存的作用。

1、最常用处：用在内存和CPU之间，以SRAM作为Cache的载体。（内存是DRAM，CPU是寄存器）

2、用在内存和硬盘之间。

3、用在硬盘和网络之间。

Cache的操作方法：

1、外部操作：数据访问，Get和Set。高速设备要找一个数据，先从Cache找，如果有就直接用；如果没有则去低速设备找，并调入到Cache。

2、内部操作：Cache更新。Cache的大小必定是有限的。所以需要一个淘汰算法（使得命中率尽可能高），最常用的是LRU。当Cache满了却又要加入新的数据，就需要淘汰旧的数据。

Cache的典型实现：

LRU的典型实现是 hash map + double linked list

为什么要用两种数据结构？

理由：hash表是用来访问数据的，因为hash的查找速度快，O(1)。双向链表是用来记录最近访问情况的，因为Hash做不到排序。

如果访问键值为key的数据已经在Cache中，那就访问hash中的该数据，同时要将存储该key的结点移到双向链表头部。

如果访问的数据不在Cache中：

若此时Cache还没满，那么我们将新结点插入到链表头部，同时在哈希表添加（键值，结点地址）。

若此时Cache已经满了，我们就将链表中的最后一个结点(注意不是尾结点)的内容替换为新内容，然后移动到头部，更新哈希表中对应的（键值，结点地址）。

查询操作的时间是O(1)，直接通过hash找到节点，而不是通过链表顺序查找的。

对链表的操作插入操作和移动操作也都是O(1)。

LRUCache简单实例:

实现get(key) 函数：如果key对应的value存在则返回value，否则返回-1。 Get the value (will always be positive) of the key if the key exists in the cache, otherwise return -1.

实现set(key, value)函数：设置或插入value值。 Set or insert the value if the key is not already present. When the cache reached its capacity, it should invalidate the least recently used item before inserting a new item.

具体实现方法：

哈希表用map实现，结点存储的是key和对应的双向链表结点指针。

双向链表用list实现，结点存储的是key和value。

这样，由key可以用哈希表快速找到链表结点，进行数据访问和结点位置更新。

class LRUCache{public:    LRUCache(int capacity) {        cap = capacity;        size = 0;        hash_table.clear();        double_list.clear();    }        int get(int key) {        map<int, list<pair<int, int> >::iterator>::iterator it = hash_table.find(key);        if(it != hash_table.end())  //update double_list        {            int value = (*( (*it).second ) ).second;             double_list.splice(double_list.begin(), double_list, (*it).second);            return value;        }        else //Not in Cache        {            return -1;            //In fact, we should go to another place to get it.        }    }        void set(int key, int value) {        map<int, list<pair<int, int> >::iterator>::iterator it = hash_table.find(key);        if(it != hash_table.end()) //update double_list and hash_table's value        {            double_list.splice(double_list.begin(), double_list, (*it).second);        (*( (*it).second ) ).second = value;        }        else        {            if(size == cap) //invalidate the lru one            {                int oldkey = double_list.back().first;        list<pair<int, int> >::iterator oldone = double_list.end();        oldone--;                double_list.splice(double_list.begin(), double_list, oldone);                                hash_table.erase(oldkey);        (*double_list.begin()).first = key;        (*double_list.begin()).second = value;                hash_table[key] = double_list.begin();            }            else // insert new one to cache            {                double_list.push_front(pair<int, int>(key, value));                hash_table[key] = double_list.begin();                size++;            }        }    }    private:        //注意：对双向链表的操作不要使迭代器失效        map<int, list<pair<int, int> >::iterator> hash_table; //存储：(关键词key , 双链表中的结点指针)        list<pair<int, int> > double_list; //结点存储：(key, value)        int size;        int cap;};

注意细节：

对双向链表list的修改绝对不可以使原有的迭代器失效，否则hash里存储的内容就失效了。

这里使用的是splice方法来将中间的某个结点移动到首位置。

list::splice实现list拼接的功能。将源list的内容部分或全部元素删除，拼插入到目的list。

函数有以下三种声明：

void splice ( iterator position, list<T,Allocator>& x );  // void splice ( iterator position, list<T,Allocator>& x, iterator i );void splice ( iterator position, list<T,Allocator>& x, iterator first, iterator last );

函数说明：在list间移动元素：

将x的元素移动到目的list的指定位置，高效的将他们插入到目的list并从x中删除。目的list的大小会增加，增加的大小为插入元素的大小。x的大小相应的会减少同样的大小。前两个函数不会涉及到元素的创建或销毁。第三个函数会。指向被删除元素的迭代器会失效。

参数：position目的list的位置，用来标明 插入位置

x源list、first,lastx里需要被移动的元素的迭代器。区间为[first, last).包含first指向的元素，不包含last指向的元素。

0 0