memcached hash algorithm

Source: Internet  Editor: 程序博客网  Time: 2024/06/06 14:10

From: http://stackoverflow.com/questions/10434375/what-hashing-algorithm-does-memcached-use-to-hash-keys

Question:

Memcached uses distributed consistent hashing to choose which server to put a key on, but which hashing algorithm does it use to map a string key to the final hash on which the Ketama algorithm is applied for server selection? And how good is that algorithm at spreading similar keys across different servers?


Answer:

According to the source code in hash.c, memcached uses the following algorithm:

The hash function used here is by Bob Jenkins, 1996:

http://burtleburtle.net/bob/hash/doobs.html

"By Bob Jenkins, 1996. bob_jenkins@burtleburtle.net. You may use this code any way you wish, private, educational, or commercial. It's free."

From Bob Jenkins' website:

I offer you a new hash function for hash table lookup that is faster and more thorough than the one you are using now. I also give you a way to verify that it is more thorough.

Also, his requirements are:

  • The keys are unaligned variable-length byte arrays.
  • Sometimes keys are several such arrays.
  • Sometimes a set of independent hash functions were required.
  • Average key lengths ranged from 8 bytes to 200 bytes.
  • Keys might be character strings, numbers, bit-arrays, or weirder things.
  • Table sizes could be anything, including powers of 2.
  • The hash must be faster than the old one.
  • The hash must do a good job.

...

The real requirement then is that a good hash function should distribute hash values uniformly for the keys that users actually use.

To get back to your other question, he measured the ability of the algorithm to uniformly distribute hash values, so I would presume that the hash does a good job of spreading similar keys across different servers. If you have concerns, the code is isolated, so you should be able to run your own tests.
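
As a rough idea of what such a test could look like, here is a minimal Python sketch. It rests on two assumptions that are not from the memcached source: it uses Bob Jenkins' simpler one-at-a-time hash as a stand-in for the function in hash.c, and it picks a server with a plain modulo instead of Ketama. The names `one_at_a_time` and `spread` are made up for this illustration. To test the real thing, swap in the actual hash from hash.c and your client's server selection.

```python
from collections import Counter


def one_at_a_time(key: bytes) -> int:
    """Bob Jenkins' one-at-a-time hash, kept to 32 bits.

    Stand-in only: this is NOT the function memcached uses in hash.c.
    """
    h = 0
    for byte in key:
        h = (h + byte) & 0xFFFFFFFF
        h = (h + (h << 10)) & 0xFFFFFFFF
        h ^= h >> 6
    h = (h + (h << 3)) & 0xFFFFFFFF
    h ^= h >> 11
    h = (h + (h << 15)) & 0xFFFFFFFF
    return h


def spread(keys, num_servers):
    """Count keys per server index using hash % num_servers.

    Assumption: plain modulo placement, not Ketama consistent hashing.
    """
    counts = Counter(
        one_at_a_time(k.encode("utf-8")) % num_servers for k in keys
    )
    return [counts.get(i, 0) for i in range(num_servers)]


if __name__ == "__main__":
    # Similar keys that differ only in a numeric suffix.
    keys = [f"user:{i}" for i in range(10000)]
    for server, count in enumerate(spread(keys, 4)):
        print(f"server {server}: {count} keys")
```

If the per-server counts come out roughly equal for keys shaped like your own, the hash is spreading them well enough for that workload. A heavily skewed result would be a reason to look closer.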

