Jan 21

Filed in Code, Django, Python with 5 comments

I’ve finally completed what I’ll call “phase 1” of the caching layer. It handles the easiest and, for my use cases, the most useful level of cache invalidation: removing objects.

“Phase 1” Features:

  • Automatic caching of querysets.
  • Invalidating querysets when an object is removed.
  • Caching queryset objects on a key-by-key basis (per-object caching).
  • Automatic invalidation of per-object caches.
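
To make the per-object idea concrete, here is a minimal sketch of one way such a layout could work. It is not the code from the repository: the plain dict standing in for memcached, the key formats, and the helper names (object_key, cache_queryset, get_queryset, invalidate_object) are all assumptions made up for illustration. The cached queryset stores only a list of primary keys, each object lives under its own key, and a queryset counts as a miss as soon as one of its members has been dropped.

```python
# Toy in-memory stand-in for memcached; every name here is hypothetical.
cache = {}


def object_key(pk):
    """Build the per-object cache key (the key format is an assumption)."""
    return "obj:%s" % pk


def cache_queryset(qs_key, rows):
    """Store each row under its own key and the queryset as a list of pks."""
    for row in rows:
        cache[object_key(row["pk"])] = row
    cache[qs_key] = [row["pk"] for row in rows]


def get_queryset(qs_key):
    """Rebuild the result from per-object entries; None means a cache miss."""
    pks = cache.get(qs_key)
    if pks is None:
        return None
    rows = [cache.get(object_key(pk)) for pk in pks]
    if any(row is None for row in rows):
        # A member object was removed, so the cached queryset is unusable.
        del cache[qs_key]
        return None
    return rows


def invalidate_object(pk):
    """Called when an object is removed: drop its per-object entry."""
    cache.pop(object_key(pk), None)


rows = [{"pk": 1, "name": "Test"}, {"pk": 2, "name": "Test2"}]
cache_queryset("qs:all", rows)
hit = get_queryset("qs:all")    # served entirely from the cache
invalidate_object(2)            # simulate deleting the object with pk 2
miss = get_queryset("qs:all")   # None: the caller would fall back to SQL
```

Checking members lazily on read is only one possible design; a real implementation might instead keep a reverse index from each object to the querysets that contain it.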

Grab the code from Google and look it over if you’re interested. This is still very much in an “it probably doesn’t work correctly” phase, and the code will go through a lot of cleanup and structural changes before it’s done.

Here’s a quick rundown of the intended functionality and SQL usage:

class MyModel(CachedModel): [...]
 
MyModel.objects.create(name="Test") [1 query]
 
MyModel.objects.all() [1 query]
[Test,]
 
MyModel.objects.create(name="Test2") [1 query]
 
MyModel.objects.all() [no queries]
[Test,]
 
MyModel.objects.create(name="Test3") [1 query]
 
MyModel.objects.all().reset() [1 query]
[Test, Test2, Test3]
 
MyModel.objects.get(name="Test2").delete() [1 query]
 
MyModel.objects.all() [2 queries]
[Test, Test3]
 
z = MyModel.objects.get(name="Test")
z.name = "Test Changed"
z.save() [1 query]
 
MyModel.objects.all() [no queries]
[Test Changed, Test3]
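
The rundown above has two results that can look surprising: all() still returns only [Test] after a second object is created, yet it reflects a renamed object without running a query. The toy class below is a hedged sketch of why that could happen, assuming that creation leaves the cached queryset alone while save() refreshes the per-object entry. ToyCachedTable and its attributes are hypothetical and do not come from the real code, and its hit counter only shows when the simulated database is touched, so it is not meant to reproduce the exact query counts listed above.

```python
class ToyCachedTable:
    """Hypothetical stand-in for a cached manager; not the post's actual code."""

    def __init__(self):
        self.db = {}          # pk -> name, plays the role of the SQL table
        self.db_hits = 0      # counts simulated queries
        self.objects = {}     # per-object cache: pk -> name
        self.all_pks = None   # cached result of all(), stored as a list of pks
        self.next_pk = 1

    def create(self, name):
        self.db_hits += 1
        pk = self.next_pk
        self.next_pk += 1
        self.db[pk] = name
        # Assumption: creating a row does not touch the cached all() result.
        return pk

    def all(self, reset=False):
        if reset or self.all_pks is None:
            # Cache miss or explicit reset: query the table and re-cache.
            self.db_hits += 1
            self.all_pks = sorted(self.db)
            self.objects.update(self.db)
        # The result is always assembled from the per-object entries.
        return [self.objects[pk] for pk in self.all_pks]

    def save(self, pk, name):
        self.db_hits += 1
        self.db[pk] = name
        self.objects[pk] = name  # refresh the per-object entry in place


table = ToyCachedTable()
first_pk = table.create("Test")
first = table.all()                  # queries the table, caches [Test]
table.create("Test2")
stale = table.all()                  # served from cache, Test2 is missing
table.create("Test3")
fresh = table.all(reset=True)        # forced refresh sees all three rows
table.save(first_pk, "Test Changed")
hits_before = table.db_hits
after_save = table.all()             # no query, but the new name shows up
```

Because the cached queryset holds only primary keys, refreshing a single per-object entry is enough for every cached queryset that contains that object to show the new value.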

5 Responses to "Caching Layer for Django ORM"

Jan Oberst (Jan 21st):

I like this a lot! When I started developing my Django app, most pages made something like 50 separate foreign key requests to the database, each for a single ID. I even had repeated queries for the same ID.

Because I really have to focus on DB speed I can’t use Django’s select_related function. That’s why I’m now caching foreign key lookups with memcache get_many calls (I just hacked this in by directly editing the Django sources for now).

I had most problems with foreign key lookups and obviously count() queries - and it’s amazing to have Django’s ORM to manage all that caching transparently behind the scenes. Great job!

David (Jan 21st):

One quick note is that I haven’t actually put much thought into optimizing counts. And it currently doesn’t invalidate them.

Al (Jan 22nd):

David,

While I expect you haven’t done any in-depth performance testing, have you given it any sort of workout that would indicate the benefits this is likely to yield for you?

Al.

David (Jan 23rd):

We did tests showing a 2x-5x increase in time for a standard cache call of 10-50 objects. It will be a little bit higher of course with the overhead of iterating through loops, but all in all it’s going to provide a means of invalidation and still be far more efficient than SQL.

John (Feb 28th):

I had some difficulty getting this to work out of the box with the OneToOneRelation (I know it’s not exactly in wide use, but until I have model-inheritance, it’s the best fit for several aspects of my data model), and with Generic Relations.

After a couple tweaks to the code to get it to run (which could very well have broken it), I was finding on average the number of queries required to generate my pages went down by 1-3, and the effect was negligible. For this test, I simply changed all of my models to inherit from CachedModel instead of models.Model.
