The Curious Case of The Missing Memcached::casMulti()


I understand I’m a little late to the game, but Memcached is friggin’ awesome.

(No, I’m not talking about Memcache, I’ve known about that for a while. I’m talking about Memcached, the younger, beefier, stupidly-named cousin of Memcache. Both are PHP extensions for talking to a memcached server. Memcached has crazy new features, but it’s named in a way that’s sure to guarantee dozens of Google search mismatches. Nice.)

Previously, if you wanted to be sure that your cache writes weren’t clobbering each other, you had to implement some form of locking system, which introduces a very healthy chance of deadlocks and lock timeouts. That kills concurrency.

But with Memcached, you can do this really sweet thing called “cas()” which stands for “compare and set”. (Maybe. Depending on who you’re talking to.)

Simple usage:

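$cas = null;
$m = new Memcached();
$m->addServer('localhost', 11211); // assumed host and port; adjust for your setup
do {
    // Get user, get cas token byref (php-memcached 2.x signature;
    // 3.x returns the token via the Memcached::GET_EXTENDED flag instead)
    $user = $m->get('user::4099', null, $cas);
    
    // Do something with your user
    $m->cas($cas, 'user::4099', $user);
} while ($m->getResultCode() !== Memcached::RES_SUCCESS);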

So when you’re getting your $user, you’re also getting a $cas token. This token is updated every time the cache entry is changed. When you go to write your user later, instead of calling set(), you call cas() and pass in your $cas token. Memcached will only update the user if $cas is equal to the cas token in memory for that cache entry. If it fails, it sets Memcached’s internal result code to RES_DATA_EXISTS.
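
If the token dance is hard to picture, here’s a toy in-memory model of it. To be clear, ToyCasStore is something I made up for illustration: it is not the Memcached API and says nothing about how the server does it internally. It only shows the rule that a write with a stale token gets refused.

```php
<?php
// Toy in-memory model of compare-and-set; NOT the real Memcached API.
final class ToyCasStore
{
    /** @var array<string, array{value: mixed, cas: int}> */
    private array $entries = [];
    private int $nextToken = 1;

    // Unconditional write; every write gives the entry a fresh token.
    public function set(string $key, $value): void
    {
        $this->entries[$key] = ['value' => $value, 'cas' => $this->nextToken++];
    }

    // Returns ['value' => ..., 'cas' => ...] or null on a miss.
    public function get(string $key): ?array
    {
        return $this->entries[$key] ?? null;
    }

    // Writes only if the caller's token still matches the stored one.
    public function cas(int $token, string $key, $value): bool
    {
        if (!isset($this->entries[$key]) || $this->entries[$key]['cas'] !== $token) {
            return false;
        }
        $this->set($key, $value);
        return true;
    }
}

$store = new ToyCasStore();
$store->set('user::4099', 'v1');

$mine = $store->get('user::4099');  // we read the entry and its token
$store->set('user::4099', 'v2');    // someone else writes in the meantime

$staleWrite = $store->cas($mine['cas'], 'user::4099', 'v3');  // refused: token is stale
$fresh      = $store->get('user::4099');                      // re-read to get the current token
$freshWrite = $store->cas($fresh['cas'], 'user::4099', 'v3'); // accepted: token is current
```

The refused write is the situation where the real thing reports RES_DATA_EXISTS, and the re-read is what the do-while loop above does on its next pass.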

Now obviously you’ll want to limit that do-while loop there to prevent infinite looping, and if you wanted to get fancy, you could change that “null” in there to a read-through callback, which Memcached calls to populate the entry on a cache miss. But still, it’s pretty goddamn quick, and there’s no need to wait for locks.
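
Limiting the loop could look something like this. It’s a minimal sketch: retryWithLimit() is a hypothetical helper of my own naming, not part of Memcached, and the closures below are stand-ins for a real get-modify-cas cycle so the snippet runs without a cache server.

```php
<?php
/**
 * Hypothetical helper: run $attempt until it returns true
 * or the attempts run out.
 */
function retryWithLimit(callable $attempt, int $maxAttempts = 5): bool
{
    for ($i = 1; $i <= $maxAttempts; $i++) {
        if ($attempt($i) === true) {
            return true;
        }
    }
    return false;
}

// Stand-in for a get-modify-cas cycle that loses the race twice, then wins.
$calls = 0;
$ok = retryWithLimit(function (int $n) use (&$calls): bool {
    $calls++;
    return $n >= 3;
});

// A cycle that never wins gives up instead of spinning forever.
$gaveUp = !retryWithLimit(function (): bool {
    return false;
}, 4);
```

In real code the closure body would be the get() and cas() pair from the loop above, returning whether getResultCode() came back as RES_SUCCESS.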

There’s just one problem. What if you need to work with MULTIPLE objects?

See, there’s a getMulti() and a setMulti(), but no casMulti(). This is even weirder, because getMulti() even passes back an array of $cas tokens. Googling for it produces complaints about its lack, or worse, people confused about the difference between Memcache and Memcached. Apparently it’s not even IN the Memcached server, which sort of blows my mind. I can think of all kinds of situations where you’d want two or more objects to be updated in-step.

So anyway, since this thing doesn’t exist natively, I wrote it.

Memcached::casMulti()

/**
 * Set an array of Memcached keys with cas tokens, fail on all if any don't match.
 *
 * @param array   $items      An array of items to store.  All items require "key", "value", and "cas" entries.
 * @param integer $expiration The expiration time, defaults to 0.
 */
function casMulti(array $items, int $expiration = 0) {
    $newCasToken = null;
    $successfullySet = array();
    
    // First load old versions of data, ensure cas values remain sound.
    // Iterate by reference so 'old_value' is actually stored on each item.
    foreach ($items as &$item) {
        $item['old_value'] = $this->get($item['key'], null, $newCasToken);
        
        // If any tokens have changed since we loaded them, WHOOPS
        if ($newCasToken !== $item['cas']) {
            throw new DataChangedException("Data changed while we were messing about.");
        }
    }
    unset($item); // break the reference before $item is reused below
    
    foreach ($items as $item) {
        // On successful set
        if ($this->cas($item['cas'], $item['key'], $item['value'], $expiration)) {
            // Get new cas token
            $this->get($item['key'], null, $item['cas']);
            
            // Keep track of our successes
            $successfullySet[] = $item;
        
        // Failed set?
        } else {
            // Data has been modified, crap
            if ($this->getResultCode() === Memcached::RES_DATA_EXISTS) {
                // Loop through "successful" sets
                foreach ($successfullySet as $setItem) {
                    // Get current cas token
                    $this->get($setItem['key'], null, $newCasToken);
                    
                    // If this item hasn't changed on us since our write
                    if ($newCasToken === $setItem['cas']) {
                        // Undo our changes
                        $this->cas($newCasToken, $setItem['key'], $setItem['old_value'], $expiration);
                    }
                }
                
                throw new DataChangedException("Data changed while we were messing about.");
                
            // Some other reason?
            } else {
                // Do something appropriate
                throw new Exception("Something else messed up.");
            }
        }
    }
}

Some caveats:

  • This function assumes it’s part of a Memcached library, ideally one that extends PHP’s Memcached object, and thus $this is equal to a valid Memcached instance.
  • Since I don’t have access to Memcached’s Result Object (there’s no setResultObject), it relies on Exceptions to report failures, and those pretty obviously haven’t been fleshed out.
  • This is the biggie: I haven’t tested it yet. Like, not at all. I’m busy trying to set up a Virtual Machine to replace the shitty XAMPP-on-Windows thing I’ve been fighting with for years, but it’s a slow process. This will of course be updated once I get a chance to test it out, but I had an itch one evening and just had to scratch it and write this code. (I’ll also be updating this blog with the process used to create said Virtual Machine, should you choose to try it out yourself.)

With that said, let’s walk through it quickly.

You’re going to be passing in an array of items to compare-and-set, and that array should look something like:

array(
    array(
        "key" => "user::4099",
        "value" => "json-user-data-goes-here",
        "cas" => "big-long-cas-token-taken-from-get-or-getMulti",
    ),
    array(
        "key" => "user::4105",
        "value" => "json-user-data-goes-here",
        "cas" => "big-long-cas-token-taken-from-get-or-getMulti",
    ),
)
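
Building that array by hand gets old fast, so here’s a rough sketch of a helper that pairs the new values with the tokens you read earlier. buildCasItems() is a hypothetical name, and I’m assuming both arrays are keyed by cache key; the tokens below are made up, since real ones come from get() or getMulti().

```php
<?php
/**
 * Hypothetical helper: pair new values with the cas tokens read earlier.
 *
 * @param array $newValues cache key => value to write
 * @param array $casTokens cache key => cas token read alongside the old value
 */
function buildCasItems(array $newValues, array $casTokens): array
{
    $items = [];
    foreach ($newValues as $key => $value) {
        // Refuse to build an item we could never compare-and-set.
        if (!array_key_exists($key, $casTokens)) {
            throw new InvalidArgumentException("No cas token for key '$key'.");
        }
        $items[] = ['key' => $key, 'value' => $value, 'cas' => $casTokens[$key]];
    }
    return $items;
}

// Made-up tokens for illustration only.
$casTokens = ['user::4099' => 101, 'user::4105' => 102];
$newValues = ['user::4099' => '{"name":"a"}', 'user::4105' => '{"name":"b"}'];
$items = buildCasItems($newValues, $casTokens);
```

The result is in the shape casMulti() expects, one entry per key, in the order the new values were given.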

I could have put $expiration in each of the items too, and maybe that’s a good idea? I figured staying as close as possible to Memcached’s existing method signatures was worthwhile, and how often are you going to need different expirations for everything?

Next, we get the old values of everything we’re changing, just in case we need to roll back. If any values have changed since before this function was called, SHUT DOWN EVERYTHING.

Then we try to cas each key individually. If a key is successfully set, we save it for later. If they’re all successfully set, no biggie. If any fail though, we go into recovery mode.

In recovery mode, we examine every previously-set item. If it hasn’t changed, we use cas() again to change it back to the way we found it before we tried anything. If it HAS changed, we just leave it – no sense in messing with anyone else’s work.

Yeah, so…

Now like I said, this is absolutely untested. And not in an “untested under load” kind of way – I mean I haven’t even run it yet. It might not even friggin’ compile. I’ll update this blog post once I have tested it for sure, but for now, exercise caution.

And even if it does compile and seems to work, exercise caution. I wrote this late at night after a day of frustratedly poking at the Memcached docs and Google. There could be all kinds of situations where this Just Doesn’t Work.

And even if it does work in all situations as expected? It may not even be a good idea. There are a LOT of cache reads going on here, and I really don’t know what that does to concurrency. There may very well be much more efficient ways of doing this.

So be warned. Have fun playing around with this, but try not to use it in production code.
