The Central Cache: CentralCache

Source: Internet. Published by: 华为无线优化工程师. Editor: 程序博客网. Time: 2024/05/16 15:21

1. Implementation of CentralCache

Definition: static CentralFreeListPadded central_cache_[kNumClasses];
Each array element serves the allocation requests of one size class.
Each element is in effect a CentralFreeList; CentralFreeListPadded is simply a CentralFreeList padded for alignment.
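To make the idea of a padded wrapper concrete, here is a minimal sketch. It assumes a 64-byte cache line and uses alignas instead of an explicit pad array, so it is not tcmalloc's actual definition; FreeListSketch, Padded, kCacheLine and central_cache_sketch are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache-line size for this sketch; real hardware may differ.
constexpr std::size_t kCacheLine = 64;

// Hypothetical stand-in for CentralFreeList: some per-size-class state.
struct FreeListSketch {
  void* head = nullptr;
  int length = 0;
};

// Inherit from the payload and force cache-line alignment. Because sizeof
// is always a multiple of the alignment, adjacent array elements never
// share a cache line.
template <typename T>
struct alignas(kCacheLine) Padded : T {};

// One padded element per size class (three classes here, for illustration).
static Padded<FreeListSketch> central_cache_sketch[3];
```

The point of the padding is that two threads working on neighbouring size classes do not contend for the same cache line.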

[Figure: structure diagram of CentralCache]

2. Implementation of CentralFreeList

Within CentralCache, the structure for each size class (CentralFreeList) contains three space-management structures: tc_slots_[kMaxNumTransferEntries], the nonempty_ list, and the empty_ list:

    struct TCEntry {
      void *head;  // Head of chain of objects.
      void *tail;  // Tail of chain of objects.
    };
    TCEntry tc_slots_[kMaxNumTransferEntries];
    int32_t used_slots_;
    int32_t cache_size_;
    int32_t max_cache_size_;
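tc_slots_ is an array in which every element is a linked list. Each node of a list is a free chunk of memory of the size that matches the size class, and every list holds a fixed number of nodes.
Why is the number of nodes per list fixed?
Because a ThreadCache usually requests or returns num_objects_to_move_[kNumClasses] objects at a time when it talks to the CentralCache (for a given size class, the number of objects a ThreadCache hands back to the CentralCache in one batch is set by num_objects_to_move_[kNumClasses]). Several ThreadCaches must take a lock to request space from or return space to the CentralCache, so moving one object at a time would be too inefficient.
used_slots_ reflects how many free lists are currently available in tc_slots_, in other words the number of valid elements in the tc_slots_ array.
cache_size_ limits how much this CentralFreeList may cache in tc_slots_, since the cache cannot grow without bound. Once used_slots_ reaches cache_size_, returned objects can no longer be parked in the array as a cache. They have to go back to their respective spans, and a span is returned to the PageHeap as a whole when appropriate.
max_cache_size_ is the upper bound on cache_size_, because cache_size_ is not fixed and changes dynamically. In other words, cache_size_ and max_cache_size_ both bound how many entries of the tc_slots_ array may be used.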
    Span empty_;  // Dummy header for list of empty spans
When every object in a span has been allocated, the span is put on the empty_ list.
    Span nonempty_;  // Dummy header for list of non-empty spans
When a span still has objects left to allocate, it is put on the nonempty_ list.
    size_t num_spans_;  // Number of spans in empty_ plus nonempty_
The number of spans this CentralFreeList has obtained from the PageHeap and currently holds.

3. When a ThreadCache requests space from the corresponding CentralFreeList of the CentralCache

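The request is first served from tc_slots_[kMaxNumTransferEntries]. Only if that fails are objects allocated from the nonempty_ list of the matching size class in the CentralCache. If that still cannot satisfy the request, the CentralCache asks the PageHeap for more.
A span that the CentralCache obtains from the PageHeap is added to the nonempty_ list of the corresponding size class.
At the same time the span is formatted, that is, divided into objects, with adjacent objects linked by pointers.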
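The formatting step can be illustrated with a small sketch. It assumes the object size is at least sizeof(void*) and that the buffer is suitably aligned; CarveSpan is a hypothetical helper written for this illustration, not a tcmalloc function.

```cpp
#include <cassert>
#include <cstddef>

// Carve the byte range [base, base + span_bytes) into objects of
// object_bytes each and thread them into a singly linked free list:
// the first word of every free object stores the address of the next one.
// Returns the list head; *count receives the number of objects carved.
inline void* CarveSpan(char* base, std::size_t span_bytes,
                       std::size_t object_bytes, std::size_t* count) {
  void* head = nullptr;
  void** tail = &head;  // location that must receive the next object's address
  std::size_t n = 0;
  char* ptr = base;
  char* limit = base + span_bytes;
  while (ptr + object_bytes <= limit) {
    *tail = ptr;                            // link previous -> this object
    tail = reinterpret_cast<void**>(ptr);   // this object's "next" field
    ptr += object_bytes;
    ++n;
  }
  *tail = nullptr;  // terminate the list
  *count = n;
  return head;
}
```

Storing the link inside the free object itself means the free list needs no memory beyond the span; any tail bytes too small for a whole object are simply left unused.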

4. When a ThreadCache returns space to the corresponding CentralFreeList of the CentralCache

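When a ThreadCache returns space to the CentralCache, the objects go back to the spans on the corresponding nonempty_ list if their number is not exactly the batch size prescribed by num_objects_to_move_[kNumClasses], or if tc_slots_ is already full. Otherwise they are placed in tc_slots_.
When going back to the spans, objects are returned one at a time, each to the span it belongs to.
When none of a span's objects is in use by any thread or held in tc_slots_, the span must be returned to the PageHeap.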
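One way to picture the per-object return path is a span that counts its outstanding objects. The sketch below is an assumption-laden simplification: SpanSketch and ReturnObject are hypothetical names, and the counter stands for objects that are held by threads or cached in tc_slots_.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified span: a free list plus a count of objects that
// are currently outside the span (in use by threads or cached elsewhere).
struct SpanSketch {
  void* objects = nullptr;  // free list of objects that are back in the span
  int refcount = 0;         // objects currently handed out
};

// Return one object to its span. The object must be at least pointer-sized.
// Returns true when the span has become completely free, i.e. when it could
// be handed back to the page heap.
inline bool ReturnObject(SpanSketch* span, void* object) {
  *static_cast<void**>(object) = span->objects;  // push onto the free list
  span->objects = object;
  --span->refcount;
  return span->refcount == 0;
}
```

A caller would release the span to the page heap only when ReturnObject reports true, which is the condition described in the last sentence above.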

5. Detailed analysis

    void CentralFreeList::Init(size_t cl) {
      ....
      max_cache_size_ = kMaxNumTransferEntries;
    #ifdef TCMALLOC_SMALL_BUT_SLOW
      // Disable the transfer cache for the small footprint case.
      cache_size_ = 0;
    #else
      cache_size_ = 16;
    #endif
      // bytes is the size, in bytes, of one object of this CentralFreeList
      int32_t bytes = Static::sizemap()->ByteSizeForClass(cl);
      // objs_to_move is the number of objects in each list on tc_slots_;
      // each object is bytes bytes long
      int32_t objs_to_move = Static::sizemap()->num_objects_to_move(cl);
      // Divide 1 MB by the space held by one tc_slots_ entry, then compare
      // with the initial value to obtain the new max_cache_size_
      max_cache_size_ = (min)(max_cache_size_,
                              (max)(1, (1024 * 1024) / (bytes * objs_to_move)));
      cache_size_ = (min)(cache_size_, max_cache_size_);
      ....
    }
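The arithmetic in Init() is easier to follow with numbers. The sketch below reproduces only that calculation; the value 64 for the slot count and the (bytes, objs_to_move) pairs used in the usage note are assumptions chosen for illustration, and ComputeLimits is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Assumed value for illustration; check your tcmalloc version for the
// real kMaxNumTransferEntries.
constexpr int32_t kMaxNumTransferEntriesSketch = 64;

struct CacheLimits {
  int32_t cache_size;
  int32_t max_cache_size;
};

// Same arithmetic as Init() in the non-SMALL_BUT_SLOW case: cap the
// transfer cache of one size class at roughly 1 MB worth of objects.
// bytes and objs_to_move must both be positive.
inline CacheLimits ComputeLimits(int32_t bytes, int32_t objs_to_move) {
  int32_t max_cache_size = kMaxNumTransferEntriesSketch;
  int32_t cache_size = 16;
  max_cache_size = std::min<int32_t>(
      max_cache_size,
      std::max<int32_t>(1, (1024 * 1024) / (bytes * objs_to_move)));
  cache_size = std::min<int32_t>(cache_size, max_cache_size);
  return {cache_size, max_cache_size};
}
```

With 8-byte objects moved 32 at a time, one slot holds 256 bytes, so 1 MB would allow 4096 slots and the cap of 64 wins. With 262144-byte objects moved 2 at a time, one slot holds 512 KB, so both limits drop to 2.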

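void CentralFreeList::InsertRange(void *start, void *end, int N);
This is the interface a ThreadCache calls whenever it returns objects to the CentralFreeList of the corresponding size class in the CentralCache. Its logic is simple:
1. If the number of objects returned is exactly num_objects_to_move(size_class_) and tc_slots_ still has a free slot, the list of N returned objects is hung directly in tc_slots_.
2. Otherwise it calls ReleaseListToSpans(void* start), which uses ReleaseToSpans(void* object) to give every object on the list back to its span.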
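The fast path of the slot array can be sketched as a small class. This is a simplification under stated assumptions: it leaves out locking, the capacity stealing done when the cache is full, and the slow path through the spans; TransferCacheSketch, TryInsert, TryRemove and kMaxSlotsSketch are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

constexpr int kMaxSlotsSketch = 64;  // assumed capacity for the sketch

struct TCEntrySketch {
  void* head = nullptr;
  void* tail = nullptr;
};

class TransferCacheSketch {
 public:
  TransferCacheSketch(int batch, int cache_size)
      : batch_(batch),
        cache_size_(cache_size < kMaxSlotsSketch ? cache_size
                                                 : kMaxSlotsSketch) {}

  // Accept a chain only if it is exactly one batch long and a slot is free.
  // On false the caller has to return the objects to their spans instead.
  bool TryInsert(void* head, void* tail, int n) {
    if (n != batch_ || used_slots_ >= cache_size_) return false;
    slots_[used_slots_].head = head;
    slots_[used_slots_].tail = tail;
    ++used_slots_;
    return true;
  }

  // Hand out one whole cached chain if the request is exactly one batch
  // and at least one slot is filled.
  bool TryRemove(void** head, void** tail, int n) {
    if (n != batch_ || used_slots_ == 0) return false;
    --used_slots_;
    *head = slots_[used_slots_].head;
    *tail = slots_[used_slots_].tail;
    return true;
  }

  int used_slots() const { return used_slots_; }

 private:
  TCEntrySketch slots_[kMaxSlotsSketch];
  int batch_;
  int cache_size_;
  int used_slots_ = 0;
};
```

Because each slot stores only a head and a tail pointer, moving a whole batch in or out costs a constant amount of work regardless of the batch size.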

    void CentralFreeList::ReleaseToSpans(void* object) {
      // How is the span for an object found? This is analyzed later; the
      // idea is that every allocated page is recorded in a radix tree.
      Span* span = MapObjectToSpan(object);
      ....
      // Release central list lock while operating on pageheap
      lock_.Unlock();
      {
        // Once every object of a span has been returned, the span is handed
        // back to the pageheap; the CentralFreeList lock can be released
        // temporarily while that happens.
        SpinLockHolder h(Static::pageheap_lock());
        Static::pageheap()->Delete(span);
      }
      lock_.Lock();
      ....
    }

int CentralFreeList::RemoveRange(void **start, void **end, int N);
This is the interface a ThreadCache calls whenever it requests objects from the CentralCache.
1. If the number of objects requested is exactly num_objects_to_move(size_class_) and tc_slots_ holds a list of free objects, one entry is simply taken from tc_slots_.
2. Otherwise:
    result = FetchFromOneSpansSafe(N, start, end);
    if (result != 0) {
      while (result < N) {
        int n;
        void* head = NULL;
        void* tail = NULL;
        n = FetchFromOneSpans(N - result, &head, &tail);
        if (!n) break;
        result += n;
        SLL_PushRange(start, head, tail);
      }
    }
As the analysis of FetchFromOneSpansSafe and FetchFromOneSpans shows, the number of objects returned to the ThreadCache may be fewer than N.

    int CentralFreeList::FetchFromOneSpansSafe(int N, void **start, void **end) {
      int result = FetchFromOneSpans(N, start, end);
      if (!result) {
        // No span has anything left to allocate
        Populate();
        result = FetchFromOneSpans(N, start, end);
      }
      return result;
    }

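int CentralFreeList::FetchFromOneSpans(int N, void **start, void **end);
The logic is simple:
1. If the nonempty_ list is empty, this CentralFreeList has no span to allocate objects from, and the function returns 0.
2. Otherwise it allocates N objects from the first span on the nonempty_ list. If that span has fewer than N objects left, it hands out all that remain and returns the number actually allocated.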
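Why a request can come back short is easiest to see in a toy model. The sketch below represents each non-empty span only by its count of free objects and omits the refill from the page heap; NonEmptyList, FetchFromFirstSpan and FetchUpTo are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Each entry is the number of free objects left in one span on the
// non-empty list (a stand-in for real span structures).
using NonEmptyList = std::deque<int>;

// Take up to n objects from the first span only. A span that runs dry
// leaves the list (in the real structure it would move to the empty list).
// Returns how many objects were actually taken.
inline int FetchFromFirstSpan(NonEmptyList& spans, int n) {
  if (spans.empty()) return 0;
  int taken = spans.front() < n ? spans.front() : n;
  spans.front() -= taken;
  if (spans.front() == 0) spans.pop_front();
  return taken;
}

// Keep fetching until n objects are gathered or no span has any left.
inline int FetchUpTo(NonEmptyList& spans, int n) {
  int result = 0;
  while (result < n) {
    int got = FetchFromFirstSpan(spans, n - result);
    if (got == 0) break;  // nothing left anywhere: return a short count
    result += got;
  }
  return result;
}
```

With spans holding 3 and 5 free objects, a request for 6 drains the first span and takes 3 from the second; a following request for 10 can only deliver the remaining 2.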

Populate is called when this CentralFreeList has no objects available: it fetches a new span from the pageheap to supply allocatable objects.

    // Fetch memory from the system and add to the central cache freelist.
    void CentralFreeList::Populate() {
      // Release central list lock while operating on pageheap
      lock_.Unlock();
      const size_t npages = Static::sizemap()->class_to_pages(size_class_);
      // The number of pages in a span differs per size class; broadly,
      // a larger size class gets a span with more pages.
      Span* span;
      {
        SpinLockHolder h(Static::pageheap_lock());
        span = Static::pageheap()->New(npages);
        if (span) Static::pageheap()->RegisterSizeClass(span, size_class_);
      }
      for (int i = 0; i < npages; i++) {
        Static::pageheap()->SetCachedSizeClass(span->start + i, size_class_);
      }
      ....
      // What remains is to format the new span: split its space into
      // objects and chain the objects into a linked list.
    }

6. tc_slots_ stealing space from other CentralFreeLists

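The number of usable slots in tc_slots_ is always bounded by cache_size_. When a CentralFreeList finds that all of its cache_size_ slots are in use, it tries the other CentralFreeLists in turn and steals from one of them: it decreases the other list's cache_size_ in order to increase its own.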

    bool CentralFreeList::MakeCacheSpace() {
      // Is there room in the cache
      if (used_slots_ < cache_size_) return true;
      // cache_size_ is already at its maximum, so nothing more can be
      // stolen; returning false means no slot is available
      if (cache_size_ == max_cache_size_) return false;
      // Steal from the CentralFreeList of another size class
      if (EvictRandomSizeClass(size_class_, false) ||
          EvictRandomSizeClass(size_class_, true)) {
        // After a successful steal, grow our own cache_size_
        if (cache_size_ < max_cache_size_) {
          cache_size_++;
          return true;
        }
      }
      return false;
    }

    bool CentralFreeList::EvictRandomSizeClass(
        int locked_size_class, bool force) {
      static int race_counter = 0;
      int t = race_counter++;  // Updated without a lock, but who cares.
      if (t >= Static::num_size_classes()) {
        while (t >= Static::num_size_classes()) {
          t -= Static::num_size_classes();
        }
        race_counter = t;
      }
      // The static race_counter walks through the CentralFreeLists in turn
      // to steal from their tc_slots_
      if (t == locked_size_class) return false;
      return Static::central_cache()[t].ShrinkCache(locked_size_class, force);
    }
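The victim selection is a plain round-robin counter rather than anything random. A single-threaded sketch of the same index arithmetic follows; NextVictim is a hypothetical helper, the counter is passed in explicitly instead of being a static, and -1 stands in for the early "return false" when the pick is the requester itself.

```cpp
#include <cassert>

// Pick the next victim index in round-robin order, wrapping at num_classes
// the same way the counter is reset in EvictRandomSizeClass.
// Returns -1 when the pick is the requester itself.
inline int NextVictim(int& counter, int num_classes, int requester) {
  int t = counter++;
  if (t >= num_classes) {
    t %= num_classes;  // same result as the repeated subtraction
    counter = t;       // note: reset to t, not t + 1
  }
  return t == requester ? -1 : t;
}
```

Tracing three size classes with requester 1 gives the picks 0, -1, 2, 0, 0: because the counter is reset to t rather than t + 1, the index reached at the wrap is picked again on the following call. In the real function the counter is shared by all threads without a lock, so the actual order can be less regular than this trace.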

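The victim CentralFreeList runs ShrinkCache to reduce its own cache_size_. The force argument says whether the steal is forced. Without force, the steal succeeds only when the victim has used_slots_ < cache_size_. With force, when used_slots_ == cache_size_ the victim gives up one tc_slots_ entry by returning it through ReleaseListToSpans.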

    bool CentralFreeList::ShrinkCache(int locked_size_class, bool force)
        NO_THREAD_SAFETY_ANALYSIS {
      // Start with a quick check without taking a lock.
      if (cache_size_ == 0) return false;
      // We don't evict from a full cache unless we are 'forcing'.
      if (force == false && used_slots_ == cache_size_) return false;
      // While stealing, release the lock of the requesting CentralFreeList
      // and take the lock of the victim CentralFreeList
      LockInverter li(&Static::central_cache()[locked_size_class].lock_, &lock_);
      if (cache_size_ == 0) return false;
      if (used_slots_ == cache_size_) {
        if (force == false) return false;
        cache_size_--;
        used_slots_--;
        // Return the objects cached in this tc_slots_ entry to their spans
        ReleaseListToSpans(tc_slots_[used_slots_].head);
        return true;
      }
      cache_size_--;
      return true;
      // When li goes out of scope, the victim's lock is released and the
      // requester's lock is taken again
    }
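The lock swap described in the comments can be expressed as a small RAII helper. The sketch below is an illustration of the pattern, not the library's LockInverter: LockInverterSketch and CountingLock are hypothetical names, and the helper is templated on the lock type so that it works with std::mutex as well as with the toy lock used here to observe the state.

```cpp
#include <cassert>

// RAII helper: on construction release `held` and acquire `temp`;
// on destruction release `temp` and acquire `held` again.
// Lock must provide lock() and unlock(), as std::mutex does.
template <typename Lock>
class LockInverterSketch {
 public:
  LockInverterSketch(Lock* held, Lock* temp) : held_(held), temp_(temp) {
    held_->unlock();
    temp_->lock();
  }
  ~LockInverterSketch() {
    temp_->unlock();
    held_->lock();
  }
  LockInverterSketch(const LockInverterSketch&) = delete;
  LockInverterSketch& operator=(const LockInverterSketch&) = delete;

 private:
  Lock* held_;
  Lock* temp_;
};

// Toy lock that only records its state; not thread-safe.
struct CountingLock {
  bool locked = false;
  void lock() { locked = true; }
  void unlock() { locked = false; }
};
```

Since the first lock is dropped before the second is taken, a thread using this helper never holds both at once, which presumably is what keeps two size classes that steal from each other from deadlocking.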

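Note:

"Steal" may not be quite the right word here. What is taken is not another CentralFreeList's space; we only lower another CentralFreeList's cache limit in order to raise our own. The benefit is that if the CentralFreeList for some size is used very frequently, its cache limit can grow, which makes allocation and release considerably faster. At the same time the limits of less frequently used CentralFreeLists shrink, so the total cache limit across central_cache_ stays roughly constant.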
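The claim that the total limit is conserved can be checked with a toy model of the non-forced case. The structures and functions below (ListLimits, Shrink, StealCapacity, TotalCapacity) are hypothetical simplifications that ignore locking, victim selection and the forced path.

```cpp
#include <cassert>
#include <vector>

struct ListLimits {
  int used_slots;
  int cache_size;
  int max_cache_size;
};

// Non-forced shrink: give up one unit of capacity only if a slot is spare.
inline bool Shrink(ListLimits& victim) {
  if (victim.cache_size == 0) return false;
  if (victim.used_slots == victim.cache_size) return false;
  --victim.cache_size;
  return true;
}

// Grow `self` by one slot at the expense of `victim`.
inline bool StealCapacity(ListLimits& self, ListLimits& victim) {
  if (self.used_slots < self.cache_size) return true;  // already has room
  if (self.cache_size == self.max_cache_size) return false;
  if (!Shrink(victim)) return false;
  ++self.cache_size;
  return true;
}

// Sum of the cache limits over all lists.
inline int TotalCapacity(const std::vector<ListLimits>& lists) {
  int total = 0;
  for (const ListLimits& l : lists) total += l.cache_size;
  return total;
}
```

A busy list with 16 of 16 slots in use that steals from a list using 2 of 16 ends up with limits 17 and 15, and the sum stays at 32.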
