I’ve finally completed what I’ll call “phase 1” of the caching layer. It handles the easiest and, for my use cases, the most useful level of cache invalidation: removing objects.
“Phase 1” Features:
- Automatic caching of querysets.
- Invalidating querysets when an object is removed.
- Caching queryset objects on a key-by-key basis (per-object caching).
- Automatic invalidation of per-object caches.
Grab the code from Google and look it over if you’re interested. This is still very much in a “It probably doesn’t work correctly” phase, and the code will have a lot of cleanup and structure changes before it’s done.
Here’s a quick rundown of the intended functionality and SQL usage:
    class MyModel(CachedModel):
        [...]

    MyModel.objects.create(name="Test")  [1 query]
    MyModel.objects.all()  [1 query]
    [Test,]
    MyModel.objects.create(name="Test2")  [1 query]
    MyModel.objects.all()  [no queries]
    [Test,]
    MyModel.objects.create(name="Test3")  [1 query]
    MyModel.objects.all().reset()  [1 query]
    [Test, Test2, Test3]
    MyModel.objects.get(name="Test2").delete()  [1 query]
    MyModel.objects.all()  [2 queries]
    [Test, Test3]
    z = MyModel.objects.get(name="Test")
    z.name = "Test Changed"
    z.save()  [1 query]
    MyModel.objects.all()  [no queries]
    [Test Changed, Test3]
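To make the rundown a bit more concrete, here is a minimal, Django-free sketch of the general idea: cache the result list as primary keys, cache each object under its own key, and treat a missing per-object key as a signal to go back to the database. This is not the actual CachedModel code. DictCache, ToyTable and CachedManager are hypothetical names made up for illustration, the dict stands in for memcached, and the query counts in this toy will not necessarily match the numbers above.

```python
class DictCache:
    """In-memory stand-in for a memcached client (hypothetical)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


class ToyTable:
    """Fake database table that counts every query it serves."""

    def __init__(self):
        self.rows = {}
        self.queries = 0
        self._next_pk = 1

    def insert(self, name):
        self.queries += 1
        pk = self._next_pk
        self._next_pk += 1
        self.rows[pk] = name
        return pk

    def select_all(self):
        self.queries += 1
        return sorted(self.rows.items())

    def update(self, pk, name):
        self.queries += 1
        self.rows[pk] = name

    def delete(self, pk):
        self.queries += 1
        del self.rows[pk]


class CachedManager:
    """Caches a result list as primary keys plus one cache entry per object."""

    LIST_KEY = "mymodel:all"

    def __init__(self, table, cache):
        self.table = table
        self.cache = cache

    def _key(self, pk):
        return "mymodel:%d" % pk

    def all(self):
        pks = self.cache.get(self.LIST_KEY)
        if pks is not None:
            names = [self.cache.get(self._key(pk)) for pk in pks]
            if None not in names:
                return names  # served entirely from the cache
        # Cold cache, or a member object was invalidated: query the table.
        rows = self.table.select_all()
        for pk, name in rows:
            self.cache.set(self._key(pk), name)
        self.cache.set(self.LIST_KEY, [pk for pk, _ in rows])
        return [name for _, name in rows]

    def create(self, name):
        # Assumption: inserts do not invalidate, so a cached list goes stale.
        return self.table.insert(name)

    def save(self, pk, name):
        self.table.update(pk, name)
        self.cache.set(self._key(pk), name)  # refresh the per-object entry

    def delete(self, pk):
        self.table.delete(pk)
        self.cache.delete(self._key(pk))  # cached lists holding pk now miss

    def reset(self):
        # Drop the cached list and rebuild it from the table.
        self.cache.delete(self.LIST_KEY)
        return self.all()
```

Because the cached list only holds primary keys, a save() shows up in cached results without touching the list, while a delete() forces the next all() back to the table.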
5 Responses to "Caching Layer for Django ORM"
I like this a lot! When I started developing my Django app, most pages made around 50 separate foreign key requests to the database, each for a single ID. I even had repeated queries for the same ID.
Because I really have to focus on DB speed, I can’t use Django’s select_related function. That’s why I’m now caching foreign key lookups with memcache get_many calls (for now I’ve just hacked this in by directly editing the Django sources).
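For anyone curious what batching foreign key lookups through get_many might look like, here is a rough sketch, not the actual patch. It assumes a cache object whose get_many takes a list of keys and returns a dict of the hits (as Django's cache API does). DictCache, fetch_related, load_from_db and the "author" key prefix are all hypothetical names for illustration.

```python
class DictCache:
    """In-memory stand-in for a cache client with get_many (hypothetical)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get_many(self, keys):
        # Return only the keys that are present, like Django's cache.get_many.
        return {k: self._data[k] for k in keys if k in self._data}


def fetch_related(ids, cache, load_from_db, prefix="author"):
    """Resolve foreign key IDs with one cache round trip and at most one query.

    load_from_db is a hypothetical callable that takes a list of IDs and
    returns a dict {id: object}, ideally via a single IN (...) query.
    """
    unique_ids = list(dict.fromkeys(ids))  # drop duplicate IDs, keep order
    key_to_id = {"%s:%s" % (prefix, i): i for i in unique_ids}

    cached = cache.get_many(list(key_to_id))
    result = {key_to_id[key]: obj for key, obj in cached.items()}

    missing = [i for i in unique_ids if i not in result]
    if missing:
        loaded = load_from_db(missing)
        for i, obj in loaded.items():
            cache.set("%s:%s" % (prefix, i), obj)  # warm the cache for next time
        result.update(loaded)
    return result
```

The duplicate IDs on a page collapse into a single key, and only the IDs the cache does not know about reach the database.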
I had most problems with foreign key lookups and obviously count() queries - and it’s amazing to have Django’s ORM to manage all that caching transparently behind the scenes. Great job!
One quick note is that I haven’t actually put much thought into optimizing counts. And it currently doesn’t invalidate them.
David,
While I expect you haven’t done any in-depth performance testing, have you given it any sort of workout to indicate the possible benefits this is going to yield for you?
Al.
We did tests showing a 2x-5x increase in time for a standard cache call of 10-50 objects. It will be a little bit higher of course with the overhead of iterating through loops, but all in all it’s going to provide a means of invalidation and still be far more efficient than SQL.
I had some difficulty getting this to work out of the box with the OneToOneRelation (I know it’s not exactly in wide use, but until I have model-inheritance, it’s the best fit for several aspects of my data model), and with Generic Relations.
After a couple tweaks to the code to get it to run (which could very well have broken it), I was finding on average the number of queries required to generate my pages went down by 1-3, and the effect was negligible. For this test, I simply changed all of my models to inherit from CachedModel instead of models.Model.