24
May
What Powers Curse
Since we’re having some issues at Curse (hardware now) and because our technologies page doesn’t explain everything, below are more technical details.
Software:
- Apache2 + mod_python for the web servers
- lighttpd + fcgi (mostly for static, we had SEVERE issues using fcgi for the entire Django site, weakref stuff, never found a solution)
- memcached — every web server runs a copy and I believe we have 2gb allocated per-server
- Squid + lighttpd for managing requests between anonymous and logged in users
- Sphinx for our full-text search; the code may not reflect the current version (I’ll update it soon) but check out djangosnippets
- MySQL — until someone can prove us wrong with facts saying PostgreSQL can scale better
Hardware:
- 5 web servers (8 yesterday): 4-8GB of memory, 2x intel core duos, memcached + apache2
- 2 static servers (one runs a small adserver script, another runs our sphinx search daemon): lighttpd + fcgi
- 2 media (download) servers (running small python apps to generate images/similar): lighttpd + fcgi
- 2 cache servers: squid + lighttpd
- 2 sql servers (one is inactive but replicated): 2x intel core duos, MySQL, 6GB of memory (being upgraded)
- 1 dev/deployment server: the stats aren’t anything out of the ordinary; it runs various daemons as well as powering cursebeta.com
What happens under the hood:
- The cache servers run both lighttpd and squid, forwarding requests from logged-in users past the squid caches.
- The squid caches then round robin to the web servers (which has caused extra stress due to the hardware not matching, we’re working on changing this now).
- The web servers rely heavily on memcached for key components. We cache every django.contrib.contenttypes request, every django.contrib.sites request, etc.
- We have modified various components of Django, with patches similar to what’s linked above, and with changes to select_related and some smaller systems (nothing major).
- Most accessors to common are cached — the method we use is messy so I’ll post more when we change it.
- Anonymous sessions are disabled — we’re still trying to manage a way to enable them and not destroy the website.
- We use a lot of custom middleware, including a custom internationalization backend to handle our URL schemas + handle translations in the database (vs compiled files — yes, we know it’s slower)
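The memcached usage described above is, in general terms, the cache-aside pattern: look in the cache first and only hit the database on a miss. Below is a minimal sketch of that idea, not Curse's actual code. `LocalCache` is a dict-backed stand-in for a real memcached client, and `cached_lookup`, `load_site` and the `sites:1` key are hypothetical names chosen for illustration.

```python
import time


class LocalCache:
    """Dict-backed stand-in for a memcached client (an assumption for this sketch)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            # Treat expired entries as misses, as a real cache would.
            del self._store[key]
            return None
        return value

    def set(self, key, value, timeout=None):
        expires_at = None if timeout is None else time.monotonic() + timeout
        self._store[key] = (value, expires_at)


def cached_lookup(cache, key, loader, timeout=300):
    """Return the cached value for key, calling loader() only on a miss."""
    value = cache.get(key)
    if value is None:
        value = loader()
        cache.set(key, value, timeout)
    return value


# Hypothetical usage: record how often the "database" is actually queried.
db_hits = []


def load_site():
    db_hits.append(1)
    return {"id": 1, "domain": "example.com"}


cache = LocalCache()
first = cached_lookup(cache, "sites:1", load_site)
second = cached_lookup(cache, "sites:1", load_site)
```

The second call is served from the cache, so `load_site` runs only once. Data that rarely changes, such as the rows behind django.contrib.sites or django.contrib.contenttypes, is a natural fit for this kind of lookup.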

20 Responses to "What Powers Curse"
Excellent. I finally managed to get through to the page on the curse site and was bummed not to see the hardware specs.
It’s great that you published this info. You’ve got the highest volume Django installation I’ve heard of.
Nice, real nice!
> we had SEVERE issues using fcgi for the entire Django site, weakref stuff, never found a solution
Hi David,
Thanks for the info! Could you please describe in more detail what kind of issues you ran into with Django FastCGI?
Thanks,
Peter
Very cool to see your results. I always thought it was more logical and less overhead to run memcache servers locally vs. on dedicated memcache machines as shown in some of the Django performance example diagrams. Everyone’s needs are different obviously but your post is very insightful.
How do you handle / process your web log files, use a 3rd party to process them, or does some other hardware collect them? With all of the extra caching that would seem more difficult to track. I’ve used both Apache flatfiles / mysql logging but I’ve had my doubts about both ways.
For Statistics: We actually rely on adserver statistics and Google analytics
In response to fcgi we ran into one issue with it throwing random server errors due to some weakref references somewhere in Django’s core. I believe this was related to the MySQLdb library but even when we had pushed a new version of it out we still were having the problems. There were several other problems we had that we had fixed but it was one recurring problem after another.
After many many attempts to find some solution other than mod_python (due to its memory overhead) we finally decided it wasn’t worth the time or trouble to continue to look for solutions to problems that kept popping up.
What OS/Distro are you running?
I often see the complaint by people about mod_python’s memory overhead, but when you query them about it, they more often than not have no basis for the claim and are usually just repeating what someone else has said. Since you have a large site using it and have made this comment, I would be quite interested to hear from you directly what basis you have for pointing out the memory overhead of mod_python. As much as we would like to address memory overhead issues in mod_python, it seems no one running real sites ever comes to the mod_python mailing list to share their experiences.
As a reference point for discussion a correctly compiled Python/mod_python should only result in an Apache mod_python.so loadable module of at most about 400 kilobytes. When this gets loaded, the actual memory for the module itself should be shared and so one shouldn’t see a hit on each actual Apache child process, just once for the whole system.
A problem with mod_python.so though is that a lot of Linux distributions don’t provide a shared library for Python in the distribution. This means that when mod_python.so is built, the Python static library objects have to be embedded within mod_python.so. This can add an extra 1.5MB to the size of mod_python.so. Worse is that when this is loaded into memory, it is necessary for the loader to perform address relocations on some platforms and thus rather than mod_python.so being shared, it becomes private memory to every process.
So, first thing you should check is whether your Python provides a shared library and whether mod_python.so is actually using it. If it isn’t, you have already used up about 1.5MB of memory per process more than you need to.
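As a rough illustration of the first half of that check, the sketch below asks the Python interpreter itself whether it was built with a shared library. It uses the standard `sysconfig` module from current Python versions, which is an assumption about the environment (a Python of this era would have used `distutils.sysconfig` instead), and `python_shared_library_info` is a hypothetical helper name. It only describes the Python build; whether mod_python.so actually links against that library still has to be checked on the module itself, for example with `ldd`.

```python
import sys
import sysconfig


def python_shared_library_info():
    """Report whether this interpreter was built with a shared libpython."""
    # Py_ENABLE_SHARED is 1 on Unix builds configured with --enable-shared,
    # 0 otherwise, and may be None on platforms that do not define it.
    enabled = sysconfig.get_config_var("Py_ENABLE_SHARED")
    return {
        "shared_build": bool(enabled),
        "library": sysconfig.get_config_var("LDLIBRARY"),
        "version": "%d.%d" % sys.version_info[:2],
    }


info = python_shared_library_info()
print(info)
```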
Now when mod_python is loaded and initialised, there is some minor memory overhead in relation to its retaining of configuration information and again some minor overhead from the creation of the initial Python interpreter instance. Both of these should only be a few hundred kilobytes at most.
Where mod_python appears though to chew up a lot of memory initially is that it preloads various Python modules that it requires in order to perform URL dispatch. These include modules such as ‘cgi’, ‘httplib’ and ‘urllib2′ as well as others. In the main though, these are actually modules which would typically be used in a web application anyway and so it isn’t actually overhead specific to mod_python.
Now it has been identified and discussed on the mod_python mailing list that a number of these modules need not actually be loaded at all, as they are loaded for one specific function which could just as easily be duplicated in mod_python. Also, some modules such as ‘pdb’ shouldn’t be loaded unless debugging is being done. In all, it was found that up to 1MB of memory could be saved at startup by eliminating the modules that didn’t need to be loaded.
Although this sounds great, various of these modules would end up getting loaded by the web application anyway, so memory saved might only amount to that taken up by ‘pdb’ which is a few hundred kilobytes. Once mod_python 3.3.1 is cleaned up and the old module importer, which is currently available in parallel with the new one, is removed, it should also be possible to save some more memory.
A problem though with these modules being preloaded is that if you don’t run your application in the mod_python ‘main_interpreter’, you will double the amount of memory consumed by these modules. In other words, those loaded into the ‘main_interpreter’ will sit idle and not be used, thus wasting memory.
Thus, if you aren’t specifically needing to run multiple applications separated into distinct Python interpreters, make sure you run your application in the ‘main_interpreter’ by setting the PythonInterpreter directive. This will avoid the memory overhead of an extra Python interpreter and the separate copies of these modules.
When a request actually comes in, this is where memory use starts to climb again. First off, if using Apache worker MPM the whole overhead of Apache creating all the threads themselves does take a noticeable amount of memory.
More importantly, this is where your actual application will get loaded. For Django, the core takes up to 7MB of memory. On top of that, as requests come into specific parts of your application, that will keep growing as potentially more and more Python modules get loaded for your customisations.
In summary, the bulk of memory used when using mod_python is still really the Python application itself that is being hosted. If you are not using a Python shared library with mod_python.so, you will waste about 1.5MB. If you don’t run your web application in the ‘main_interpreter’ you can waste up to another 1MB. Since these figures are per process, it will all add up for the system as a whole.
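To make the "it will all add up" point concrete, here is a small back-of-the-envelope sketch. The 1.5MB and 1MB figures come from the comment above; the count of 50 Apache child processes is purely an assumed number for illustration, and `wasted_memory_mb` is a hypothetical helper.

```python
# Per-process figures quoted in the comment above.
STATIC_LIBPYTHON_MB = 1.5   # Python static library embedded in mod_python.so
EXTRA_INTERPRETER_MB = 1.0  # modules duplicated outside 'main_interpreter'


def wasted_memory_mb(processes, shared_library, main_interpreter):
    """Estimate avoidable memory use across all Apache child processes."""
    per_process = 0.0
    if not shared_library:
        per_process += STATIC_LIBPYTHON_MB
    if not main_interpreter:
        per_process += EXTRA_INTERPRETER_MB
    return per_process * processes


# Assumed: 50 child processes on one web server.
worst = wasted_memory_mb(50, shared_library=False, main_interpreter=False)
best = wasted_memory_mb(50, shared_library=True, main_interpreter=True)
print(worst, best)
```

Under those assumptions a misconfigured server gives away 125MB that a well-configured one keeps, before the application itself has loaded anything.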
So, mod_python could itself be trimmed by eliminating the need to load certain modules for each interpreter created, but this is going to be at most 0.5MB, and some deprecated code could be removed. The bigger problems come about though through a poorly configured and installed version of Python and not understanding the consequences of using multiple Python interpreters.
Having read all that and perhaps now having a better understanding of where memory gets used with mod_python, maybe you can relate your own experience. Would be particularly interested in hearing whether you do have a Python shared library being used by mod_python.so and whether you are running your application in the ‘main_interpreter’.
I know this is a long post for a blog comment, so if you want to email me direct about it, or perhaps join the mod_python mailing list and share your experiences there instead then please do so.
Graham
Thanks for posting your hardware setup — I’m always interested to hear what high-traffic sites are running.
@Graham: Next big issue I have with mod_python, I’m contacting you:)
Be careful coming to me for mod_python help these days. If all you really need is a way of hosting a WSGI capable application such as Django, I’ll probably try and get you to try mod_wsgi instead even if an official initial version hasn’t been released yet.
Because mod_wsgi is tailored specifically to hosting WSGI applications and is not a general purpose way of writing Python applications in conjunction with Apache, it has less memory overhead than mod_python and also has less run time overhead than mod_python as well.
Maybe one day I might even be able to convince the Curse site maintainers to experiment with mod_wsgi and see if it better meets their needs.
Graham
So… how much does that cost?
Hey… just wanted to throw some links out there re: PostgreSQL scaling.
http://www.computerworld.com.au/index.php?id=760310963
http://tweakers.net/reviews/661/7
http://forums.mysql.com/read.php?25,93181,93181
http://feedlounge.com/blog/2005/11/20/switched-to-postgresql/
http://www.postgresql.org/about/press/presskit82.html.en
The last link has a couple of nice highlights of the most recent release:
“Performance improvements: version 8.2 improves performance around 20% overall in high-end OLTP (online transaction processing) system tests. Users can gain even more in data warehousing efficiency. The changes include faster in-memory and on-disk sorting, better multi-processor scaling, better planning of partitioned data queries, faster bulk loads and vastly accelerated outer joins.
Warm Standby Databases: through an extension to our Point in Time Recovery feature (introduced in version 8.0), administrators now can easily create a failover copy of your database cluster.
Online Index Builds: index builds can now occur while applications write to database tables, allowing performance tuning without downtime.”
Very interesting.. Why did you go with mysql replication instead of mysql cluster with 2 active nodes?
Is there an automatic failover between both database nodes in case of a failure? Are you using DRBD with heartbeat?
Very interesting… maybe I should use something like this for http://pixelspotlight.com/
One of many examples of data comparing PostgreSQL vs MySQL performance. This was from the end of 2006, so it’s recent enough compared to outdated info from 2000-ish showing how fast MySQL is (e.g. especially without transaction support back then). Now that MySQL has been adding features, it’s closer to comparing apples to apples. If you’re interested, since you said you wanted data showing how PostgreSQL scales vs MySQL.
http://spyced.blogspot.com/2006/12/benchmark-postgresql-beats-stuffing.html
My experience with mysql vs postgres is that these days, depending on what you are doing postgres can easily beat mysql on a single box system, but when it comes to replication mysql wins hands down. Slony is a complete dog. Once you move past a single box postgres has severe problems.
[...] drawback is memory; mod-python has a bad rep for memory (although if you read Grahams comments on this post, you’ll see there are two sides to this [...]