Keeping memcache consistent


As an afterthought, someone decided at the last minute that maybe the architect (me) should be on the architectural review of a product.

Normally for social networking web development, I allow for a little short-term inconsistency. This is because only one user has access to modify a thing and that user isn’t likely to do two things at the same time. Because of this, concurrency is almost never a problem and, even if the data gets clobbered, the database at least is consistent and your objects are quickly fixed.

The problem with this particular project is that, since a paid good is involved and many users will race to the same datastore, inconsistencies can occur and they’d be more harmful than a goto statement. The solution proposed was to build a Java service to keep these eight pieces of data consistent. There was also a release plan in order to estimate the resource allocation for the new service under live site load.

Though late to the meeting, I opened my mouth and said, “You don’t need a Java service to do this. You can do it all in PHP and memcache.”

 

Why they didn’t think it was possible

Long before I joined the company, there was a system to prevent stampeding by having a lock key in memcache. This didn’t work out so well.

Stampeding is what occurs when you have 100,000 concurrent users and a memcache key for a fairly popular piece of data (say the block list for your web application, or the ad unit for the banner ads) expires (because of a version increment or an expiration). Tons of concurrent processes will see the data is missing and will stampede the database with the same request. Databases are slow (which is why we have memcache in the first place) and your site experiences a very nasty hiccup every time this happens.
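To make the stampede concrete, here is a minimal single-process sketch. `FakeCache` and `$loadBlockList` are hypothetical stand-ins (an in-memory array and a counter, not the real memcache extension or a real query), and the two phases only imitate requests that all see the miss before any of them refills the key.

```php
<?php
// Hypothetical in-memory stand-in for a memcache client (illustration only).
class FakeCache
{
    private $store = array();

    public function get($key)
    {
        return array_key_exists($key, $this->store) ? $this->store[$key] : false;
    }

    public function set($key, $value)
    {
        $this->store[$key] = $value;
        return true;
    }
}

$dbQueries = 0;
// Hypothetical slow query; it only counts how often it gets called.
$loadBlockList = function () use (&$dbQueries) {
    ++$dbQueries;
    return array('spammer1', 'spammer2');
};

$cache = new FakeCache();
$concurrent = 5;

// Phase 1: the key has just expired, so every concurrent request sees a miss.
$missed = array();
for ($i = 0; $i < $concurrent; ++$i) {
    if ($cache->get('blocklist') === false) {
        $missed[] = $i;
    }
}

// Phase 2: every request that missed goes to the database and refills the key.
foreach ($missed as $request) {
    $cache->set('blocklist', $loadBlockList());
}

echo $dbQueries, " database queries for one expired key\n";
```

With five simulated requests the database is asked the same question five times; scale that to a live site and you get the hiccup described above.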

One problem was that it was buggy. It used a nonce/semaphore and looked a lot like this example. Well, like that, but as if written by someone who had just learned object-oriented programming, with the locking code then added by someone else who had just read a design patterns book, and the bugs then closed by someone else who was so lazy that they preferred to patch problems on the user interface layer. It looked that way because it was written that way.

I always say that the sites the founders worked on before this had a business plan that was the internet-equivalent of a drive-by shooting.

Our codebase reflected that attitude.

So the reason we got burned by concurrency issues in memcache wasn’t that the idea is fundamentally broken. It was the implementation, and it was simply easier to rewrite than to pretend that jenga-ing this stuff with a patch on a patch on a patch was going to magically make the code more stable.

And the reason I removed it in the rewrite wasn’t that it’s that difficult to write, but rather that we were using this locking for every single object that we stored in memcache, even the less busy ones. Anything built in would be abused similarly. Better to not give the developers any rope they can use to hang the site with.

(In the case of stampedes, I figured the key is very likely to be too “hot” anyway. Instead I built a system to allow easy storage of hot keys in the shared memory cache of the webserver, instead of using memcache and needing a network call.)

What was the bug?

None of this is typically understood by anyone else. The reason is simply that the average engineer has been working here 1/20th the time I have. And those that have been working here half the time (pretty much all the rest) only know that I pulled the locking code out.

The bug in our code (and in the example linked above) is that there is a race gap between the memcache::get() and the memcache::set(). This is perfectly fine if all you want to do is prevent a stampede, since only a few processes, even on a slow system like PHP, would be in the winner’s circle of that race. It is really bad in the case above, where the data has to stay consistent.
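The gap is easier to see when the interleaving is written out by hand. This is only a sketch: `FakeCache` is a hypothetical in-memory stand-in for the memcache client, and the two "workers" are simulated in one process by ordering their calls the way an unlucky race would.

```php
<?php
// Hypothetical in-memory stand-in for a memcache client (illustration only).
class FakeCache
{
    private $store = array();

    public function get($key)
    {
        return array_key_exists($key, $this->store) ? $this->store[$key] : false;
    }

    public function set($key, $value)
    {
        $this->store[$key] = $value;
        return true;
    }
}

$cache = new FakeCache();
$lock = 'lock:inventory';

// Both workers check for the lock before either one has set it.
$aSeesFree = ($cache->get($lock) === false);
$bSeesFree = ($cache->get($lock) === false);

// Each worker believes the lock is free, so each one "takes" it.
if ($aSeesFree) {
    $cache->set($lock, 1);
}
if ($bSeesFree) {
    $cache->set($lock, 1);
}

$holders = (int) $aSeesFree + (int) $bSeesFree;
echo $holders, " workers think they hold the lock\n";
```

Both workers end up inside the critical section, which is harmless for stampede protection but not for data that must stay consistent.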

So what is the solution? The solution is to use a memcache command that is atomic. The one that fits the bill is memcache::add() which returns a failure condition if the key already exists.
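As a rough sketch of why add() works as a lock: the check and the store happen in a single command, so only one caller can succeed. `FakeCache` here is a hypothetical in-memory stand-in that imitates that fail-if-exists behaviour; on a real memcached server the atomicity comes from the server handling the command as one step, not from anything in PHP.

```php
<?php
// Hypothetical in-memory stand-in for a memcache client (illustration only).
class FakeCache
{
    private $store = array();

    // Imitates add(): store the value only if the key does not exist yet.
    public function add($key, $value)
    {
        if (array_key_exists($key, $this->store)) {
            return false;
        }
        $this->store[$key] = $value;
        return true;
    }

    public function delete($key)
    {
        unset($this->store[$key]);
        return true;
    }
}

$cache = new FakeCache();
$lock = 'lock:inventory';

$aHasLock = $cache->add($lock, 1); // first caller creates the key and wins
$bHasLock = $cache->add($lock, 1); // second caller is refused: key exists

$cache->delete($lock);             // the winner releases the lock
$bRetry = $cache->add($lock, 1);   // now the second caller can take it

var_dump($aHasLock, $bHasLock, $bRetry);
```

There is no separate "is it free?" step for another process to slip into, which is the whole point.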

The codez

For you script kiddies, here is the code. I don’t know if it works since I wrote it without any unit testing. :-)

// {{{ locked_memcache_update($memcache,$key,$updateFunction,$expiryTime,$waitUtime,$maxTries)
/**
 * A function to ensure only one thing can update a memcache key at a time.
 *
 * Note that there are issues with the $expiryTime on memcache not being
 * fine enough, but this is the best I can do. The idea behind this form
 * of locking is that it takes advantage of the fact that
 * {@link memcache_add()}'s are atomic in nature.
 *
 * It would be possible to be a more interesting limiter (say that limits
 * updates to no more than 1/second) simply by storing a timestamp or
 * something of that nature with the lock key (currently stores "1") and
 * not deleting the memcache entry.
 *
 * @package TGIFramework
 * @subpackage functions
 * @copyright 2009 terry chay
 * @author terry chay <tychay@php.net>
 * @param $memcache memcache the memcache object
 * @param $key string the key to do the update on
 * @param $updateFunction mixed the function to call that accepts the data
 *  from memcache and modifies it (use pass by reference).
 * @param $expiryTime integer time in seconds to allow the lock key to last
 *  before it will expire. This should only happen if the process dies during
 *  update. Choose a number big enough so that $updateFunction will take much
 *  less time to execute.
 * @param $waitUtime integer the amount of time in microseconds to wait before
 *  checking for the lock to release
 * @param $maxTries integer maximum number of attempts before it gives up
 *  on the locks. Note that if $maxTries is 0, then it will RickRoll forever
 *  (never give up). The default number ensures that it will wait for three
 *  full lock cycles to crash before it gives up also.
 * @return boolean success or failure
 */
function locked_memcache_update($memcache, $key, $updateFunction, $expiryTime=3, $waitUtime=101, $maxTries=100000)
{
    $lock = 'lock:'.$key;

    // get the lock {{{
    if ($maxTries>0) {
        for ($tries=0; $tries< $maxTries; ++$tries) {
            if ($memcache->add($lock,1,0,$expiryTime)) { break; }
            usleep($waitUtime);
        }
        if ($tries == $maxTries) {
            // handle failure case (use exceptions and try-catch if you need to be nice)
            trigger_error(sprintf('Lock failed for key: %s',$key), E_USER_NOTICE);
            return false;
        }
    } else {
        while (!$memcache->add($lock,1,0,$expiryTime)) {
            usleep($waitUtime);
        }
    }
    // }}}
    // modify data in cache {{{
    $data = $memcache->get($key, $flag);
    // call_user_func() passes arguments by value, so pass $data by reference explicitly
    call_user_func_array($updateFunction, array(&$data)); // update data
    $memcache->set($key, $data, $flag);
    // }}}
    // clear the lock
    $memcache->delete($lock);
    return true;
}
// }}}

(Yes, this commenting is typical when I code; I hope I could say the same for you.) The reason it’s a function is so that an engineer has to do work to use it. If they got it for free, they’d abuse it, or at least that was the worry.

If you need to do locking on the database, then you would have the $updateFunction nest something that will handle a database update. You might want to up the $expiryTime too, but you probably won’t need to. I just chose 3 because Touge did in his original post. :-)
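Here is one way that nesting might look, as a sketch rather than production code. Everything named here is an assumption for illustration: `FakeCache` is an in-memory stand-in for the memcache client, `$db` is an array pretending to be a database row, `locked_update_sketch()` is a condensed copy of the locking function so the example runs on its own (it does not support the retry-forever mode), and the stock count is made up.

```php
<?php
// Hypothetical in-memory stand-in for the memcache client (illustration only).
class FakeCache
{
    private $store = array();

    public function add($key, $value, $flag = 0, $expire = 0)
    {
        if (array_key_exists($key, $this->store)) {
            return false;
        }
        $this->store[$key] = $value;
        return true;
    }

    public function get($key)
    {
        return array_key_exists($key, $this->store) ? $this->store[$key] : false;
    }

    public function set($key, $value, $flag = 0, $expire = 0)
    {
        $this->store[$key] = $value;
        return true;
    }

    public function delete($key)
    {
        unset($this->store[$key]);
        return true;
    }
}

// Condensed copy of the locking function so this sketch is self-contained.
function locked_update_sketch($memcache, $key, $updateFunction, $expiryTime = 3, $waitUtime = 101, $maxTries = 100000)
{
    $lock = 'lock:' . $key;
    $tries = 0;
    while (!$memcache->add($lock, 1, 0, $expiryTime)) {
        if (++$tries >= $maxTries) {
            return false; // gave up waiting for the lock
        }
        usleep($waitUtime);
    }
    $data = $memcache->get($key);
    call_user_func_array($updateFunction, array(&$data)); // pass by reference
    $memcache->set($key, $data);
    $memcache->delete($lock);
    return true;
}

$db = array('stock' => 8); // hypothetical stand-in for a database row
$cache = new FakeCache();
$cache->set('stock', $db['stock']);

// The callback receives the cached value by reference, writes the "database"
// while the lock is held, and leaves the new value for the caller to cache.
$sellOne = function (&$stock) use (&$db) {
    if ($stock === false) {
        $stock = $db['stock']; // cache miss: reload from the database
    }
    if ($stock > 0) {
        --$stock;
        $db['stock'] = $stock; // a real version would run an UPDATE here
    }
};

$ok = locked_update_sketch($cache, 'stock', $sellOne);
var_dump($ok, $db['stock'], $cache->get('stock'));
```

Because the database write happens inside the callback, it is covered by the same lock as the cache update. If that write can be slow, this is where a larger $expiryTime starts to matter.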

 

 

from: http://terrychay.com/article/keeping-memcache-consistent.shtml
