1 Dec

Filed in Code, Django, Python

One of the optimizations (if you want to call it that) that can reduce a web page's load time is removing excess white space. On many of our pages at iBegin this saved as much as 100kb. While not every site has 300kb pages like some of ours, it can still add up very quickly.

Django by default provides a {% spaceless %} tag in your templates which will allow you to achieve this effect, but we don’t use Django’s template engine (Hooray, Jinja!). The tag approach also seemed inappropriate, as we want this across the entire site, no matter what. Instead, we moved it into a middleware, which simply strips the whitespace between tags (using the same function as the built-in tag) from any HTML page.
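To make "whitespace between tags" concrete, here is a minimal standalone sketch. It assumes Django's helper boils down to a single regex substitution; `strip_spaces_between_tags_sketch` is a made-up name for illustration, not Django's own code.

```python
import re


def strip_spaces_between_tags_sketch(value):
    """Rough stand-in for django.utils.html.strip_spaces_between_tags (assumed behaviour)."""
    # Remove any run of whitespace that sits between a '>' and the next '<'.
    return re.sub(r">\s+<", "><", value)


html = "<ul>\n    <li>one two</li>\n    <li>three</li>\n</ul>"
stripped = strip_spaces_between_tags_sketch(html)
print(stripped)  # <ul><li>one two</li><li>three</li></ul>
```

Note that the space inside "one two" survives: only the gaps between adjacent tags are removed, which is why indentation is where most of the saving comes from.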

While this makes the source of some pages unreadable, the time it saves the average user downloading the page (see Steve Souders’ tips) is well worth the trouble it causes a few people.
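How much it saves depends entirely on the markup, so it is worth measuring on your own pages, both raw and gzipped. The following is a rough sketch using only the standard library. The regex is an assumed stand-in for Django's helper, and the sample markup is synthetic, so the numbers it prints say nothing about any real site.

```python
import gzip
import re


def strip_between_tags(html):
    # Assumed stand-in for Django's strip_spaces_between_tags.
    return re.sub(r">\s+<", "><", html)


# Synthetic, heavily indented markup; substitute a real page from your own site.
row = "            <tr>\n                <td>cell</td>\n            </tr>\n"
raw = "<table>\n" + row * 500 + "</table>\n"
stripped = strip_between_tags(raw)

for label, text in (("raw", raw), ("stripped", stripped)):
    data = text.encode("utf-8")
    print(label, len(data), "bytes;", len(gzip.compress(data)), "bytes gzipped")
```

Comparing the two gzipped figures shows what stripping buys on top of compression, which is usually the number that matters.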

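# Aliasing it for the sake of page size.
from django.utils.html import strip_spaces_between_tags as short

class SpacelessMiddleware(object):
    def process_response(self, request, response):
        # .get() avoids a KeyError on responses that carry no Content-Type header.
        # The substring test also matches 'text/html; charset=utf-8'.
        if 'text/html' in response.get('Content-Type', ''):
            response.content = short(response.content)
        return response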
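One caveat with stripping every page: whitespace inside pre and textarea elements is significant, and a plain between-tags substitution will also collapse markup nested in them (syntax-highlighted code, for example). Below is a hedged sketch of one way around that, using only the standard library. `strip_spaces_outside_pre` is a hypothetical helper, not part of Django, and the regexes are an approximation rather than a real HTML parser.

```python
import re

# Whole <pre>...</pre> and <textarea>...</textarea> blocks. The single capturing
# group makes re.split() keep the matched blocks in its result.
_PROTECTED = re.compile(
    r"(<pre\b.*?</pre\s*>|<textarea\b.*?</textarea\s*>)",
    re.IGNORECASE | re.DOTALL,
)


def strip_spaces_outside_pre(html):
    parts = _PROTECTED.split(html)
    # With one capturing group the list alternates: outside, protected, outside, ...
    # so only the even-indexed chunks are stripped.
    for i in range(0, len(parts), 2):
        parts[i] = re.sub(r">\s+<", "><", parts[i])
    return "".join(parts)


sample = "<div>\n  <p>hi</p>\n  <pre>\n  <b>x</b>\n    <i>y</i>\n</pre>\n</div>"
result = strip_spaces_outside_pre(sample)
naive = re.sub(r">\s+<", "><", sample)
print(repr(result))
print(repr(naive))
```

This version is deliberately conservative: whitespace directly touching a protected block is left alone, and unusual markup (a pre inside an HTML comment, say) can still fool the pattern. To use it in the middleware above you would call it in place of `short`.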

Note that if you use page caching middleware, placing this in the appropriate place will also allow you to save space in your cache.
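Here is one reading of "the appropriate place", sketched as a settings fragment. Django runs response middleware in reverse order of the list, so for the cache to store the stripped page, the spaceless middleware has to be listed below the middleware that writes to the cache. The dotted path `myproject.middleware.SpacelessMiddleware` is a hypothetical location, and `MIDDLEWARE_CLASSES` is the old-style setting name that matches the old-style class above.

```python
# settings.py (fragment)
MIDDLEWARE_CLASSES = (
    # Response phase runs bottom-up, so this one runs last and stores the final page.
    'django.middleware.cache.UpdateCacheMiddleware',
    # Hypothetical path; its process_response runs before the cache update.
    'myproject.middleware.SpacelessMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.cache.FetchFromCacheMiddleware',
)
```

If Django's GZipMiddleware is also in the list, the same logic suggests keeping the spaceless middleware below it, so that stripping happens before the content is compressed.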

Enjoy!

  • lericson
    There are bigger fish to fry.
  • Martyn Clement
    I suggest this should only take effect when DEBUG=False.
    Debugging HTML by viewing the page source is quite common.
    That's what I did ;)
  • Shame it messes up the pre and textarea tags :( was gonna use it otherwise.

    Interesting idea though!
  • Hello,

    I'm curious: why outputting whitespace-less HTML when you could just use gzip compression on the server for HTML, XML, CSS and JavaScript, and get better results (like from 300 kB -- not kb ;) -- to, say, 120 kB)? Did you do some performance testing regarding that?

    Unless gzipping your text output has a clear performance cost, I would keep my HTML code with whitespace (if not perfect, then at least decently readable code). Makes it easier to debug when you need to check the real thing (Firebug only shows what Firefox understands, not what it gets from the server).

    Do you happen to use this method, then gzip (through mod_deflate for instance)? If so, how much do you save compared to a gzipped version with all whitespace intact?
  • We GZIP as well. Sadly, not everyone is on broadband yet, so even shaving 5 or 10k off the request can be quite useful (especially when the time it takes to do that is too small to measure).
  • Very nice and simple trick. Thanks
  • I wrote this snippet:
    http://www.djangosnippets.org/snippets/1055/
    to (very) efficiently whitespace optimize inline CSS which might be interesting for you and your Jinja too.
  • Roger
    The appropriate place for cached content to be effected would be at the start of the end? :-/
  • Roger
    or the end*
  • bartTC
    The drawback of this is, that it also steals the whitespace in pre and textarea tags.
  • bartTC
    Ups, I wasn't aware that there is a real "strip_spaces_between_tags" function. Sorry for the noise :D
  • Have you confirmed this? I believe the whitespace tag is designed to avoid that, but I haven't fully tested it.
  • The whitespace tag is really simple, it just finds the end of one tag and the beginning of another with whitespace in between, and removes said whitespace. It looks like it doesn't have any special cases, so it unfortunately affects textarea and pre tag content.

    see: http://code.djangoproject.com/browser/django/tr...
  • bartTC
    I had problems with highlighted (pygments) text in pre-tags which results in unindented code *cry*. But that's a very special condition.
  • Note that Content-Type will usually be 'text/html; charset=utf-8', not just 'text/html' (since Content-Type without a charset is unsafe).
  • Andreas
    Clever trick! Debuggers could always use firebug if they want it indented nice. Everyone should do this + gzip + expires header if plausible
  • Tom
    So... every third byte was whitespace? That's a bit hard to believe, even as a worst-case. Still, if you saw improvements, then my congratulations.

    I would suggest, though (as I see others have), that people facing the same problem have a look at simply gzipping their text-based output at the webserver level. Since the action's probably happening on the webhead either way, it's unlikely to be much more costly in CPU terms, and it's even simpler to implement than your solution (it typically just takes a line or two in .htaccess if you're using Apache). Gzipping will certainly remove any overhead that whitespace represents.
  • It's mostly indentation that actually contributed to the whitespace :)
  • Very interesting thoughts. I'll have to play around with it sometime. Thanks!
  • Dan
    I've considered this before, but wondered if there was much benefit over gzip (which I would recommend using even with whitespace stripping). Does a whitespace-removed, gzipped response save enough space to justify the (admittedly reasonable) CPU cost of whitespace-removal?
  • Is that really worth it?

    taking a file:
    index.html - 91565 bytes

    code:
    >>> f = open('index.short', 'w')
    >>> f.write( short( file( 'index.html' ).read() ).encode('utf-8') )
    >>> f.close()

    produces:
    index.short - 69915

    and my favorite - gzip:
    index.gz - 14709
    index.short.gz - 13319

    examples from zena.centrum.cz where we have some nasty whitespace in our HTML...

    Knowing this, I would take mod_deflate in (insert your favorite web server) over any python space stripping anytime...
  • Does mod_deflate allow removing the extra whitespace? In your example you are correct, that it's not a huge savings for that individual request, and gzip I would highly recommend, but the CPU time is negligible.
  • I like it a lot. I have been wondering if there was an easy way to do this, and there is. I will definitely have use for this soon.
  • Subsume
    Pretty hot yet simple. Thanks Mistar Cramar.
  • miracle2k
    I've considered this before, but was always unsure about the CPU cost involved.