6

Dec

Filed in Code, Curse, Django |

In preparation for a possible change in Curse’s caching strategy, I took the time today to do some benchmarks of memcached. The benchmarks were taken using the standard python-memcached library as well as 3rd party library for PHP (I couldn’t get Leopard working with the PECL package). It turns out, Python is actually fairly fast in its memcached client.

The idea behind our new strategy, is something that I’ve seen talked about quite a bit (once I started researching), row-level cache objects.

Typically people store a group of objects in a cache, as we are still. You end up with many different groups of many objects. In most cases, you’re going to have overlapping, where you have the same objects in many different caches.

The major problem comes in when you need to do invalidation. How do you invalidate all caches containing X object? There are several solutions around, none of which are very concrete in how they can solve every problem you can possibly think of. Many languages such as .NET also employ their own solutions for caching.

Using Django, with memcached, offers an interesting twist to the row-level object caching. The plan would be to design a “CachedModel” which would simply say “all requests to this model need to hit the cache”. Now that we have a model that is going to cache everything, we need to know how we’re going to cache it, and more importantly, how we’re going to (automatically) invalidate it.

Invalidation has been one of the most difficult tasks we’ve faced in the last year. The original approach was to manually delete keys (yes, not a good idea) that we define in some routine. What the new system would accomplish is automatically updating any cache immediately when an object is changed, or removed.

There are two systems which we must take into consideration:

  • Lists of objects (such as a list of articles)
  • Individual objects (a single article)

Handling invalidation on the single object caches is easy, so why not leave at that? The plan is simply, just that. Invalidation will simply be invalidating the single object in the cache.

To solve the issue of lists of objects needing invalidation we transparently handle this by simply caching a list of references (say its a list of 10 cache keys, which you then do a bulk get on). When you update the title of your article, you simply update its object cache, and any cache that references is going to be pulling in that object cache so it is immediately updated as well.

Removing objects is also fairly simply. When we are polling for our list of references, we’re going to need to see what’s missing. When we find missing objects, we try to get them (query the database). If they fail, our cache is no longer valid. Easy, ya?

All in all, this would potentially solve all of our current caching needs. It doesn’t handle every case (such as invalidating a list of new objects when a new one is added), but those could be handled with dependencies or signals.

If you have your own caching stories, or solutions, please share them! I’d be glad to hear from anyone else who has similar problems, or even larger problems.

  • http://ifdebug.com Al

    It might be worth reading through the documentation for the SqlCacheDependency framework for .NET to see if you can gleam any other useful information for this as well.

  • David

    I was actually looking at .NET’s caching framework — and the SqlCacheDependency is pretty cool.

    We are also going to come up with a solid (hopefully) solution for dependencies. But it seems the best solution is going to be store another object in memory that gives reverse mappings to keys which are dependent on keys.

    You would then simply register a dependency to gain invalidation on it, but you’d automatically gain invalidation when objects were removed from the cache still using this system described above.

  • Ivan Jovanovic

    I’m now exploring around the problem of caching and invalidation of 1-n relations between the items.
    What do you think about some general mechanism of that kind of invalidation that could be provided by memcached for example.
    Could the relations between items and their invalidation upon change be implemented into caching mechanisms, better than into application logic.

    As I see this is a general problem so maybe to solve it in more general place could be more appropriate????

  • http://www.air-jordan-10.com/ air jordan 10

    Well , the view of the passage is totally correct ,your details is really reasonable and you guy give us valuable informative post, I totally agree the standpoint of upstairs. I often surfing on this forum when I m free and I find there are so much good information we can learn in this forum! http://spoon8.net/

blog comments powered by Disqus