1 Dec

Filed in Code, Django, Python

One optimization (if you want to call it that) for reducing a web page's load time is removing excess whitespace. On many of our pages at iBegin this saved as much as 100kb. Not every site has 300kb pages like some of ours, but the savings can still add up very quickly.

Django by default provides a {% spaceless %} tag in your templates which will allow you to achieve this effect, but we don’t use Django’s template engine (Hooray, Jinja!). The tag approach also seemed fairly inappropriate, as we just want to do it across the entire site, no matter what. Instead, we moved it into a middleware, which simply strips the whitespace between tags (using the same function as the built-in tag) from any HTML response.

While this makes the source of some pages hard to read, the time it saves the average user downloading the page (see Steve Souders’ tips) is well worth the trouble it causes a few people.
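How much this saves depends on how much indentation your markup carries, and on whether the response is gzipped as well, so it is worth measuring on your own pages. The following is a rough sketch of such a measurement, not our production code: `strip_between_tags` is an assumed stand-in that approximates Django's `strip_spaces_between_tags` so the script runs without Django, and the sample markup is made up.

```python
import gzip
import re


def strip_between_tags(html):
    # Assumed stand-in for django.utils.html.strip_spaces_between_tags,
    # so the sketch runs without Django: drop whitespace between tags.
    return re.sub(r">\s+<", "><", html)


def sizes(html):
    """Return byte counts for the page: raw, stripped, and both gzipped."""
    raw = html.encode("utf-8")
    stripped = strip_between_tags(html).encode("utf-8")
    return {
        "raw": len(raw),
        "stripped": len(stripped),
        "raw_gz": len(gzip.compress(raw)),
        "stripped_gz": len(gzip.compress(stripped)),
    }


# Hypothetical sample: a heavily indented table, repeated to mimic a big page.
row = "<tr>\n            <td>cell</td>\n            <td>cell</td>\n        </tr>\n"
sample = "<table>\n        " + row * 500 + "</table>\n"

report = sizes(sample)
for name, size in report.items():
    print(name, size)
```

To test a real page, read its HTML from disk and pass it to `sizes()` in place of the generated sample.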

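# Aliasing it for the sake of page size.
from django.utils.html import strip_spaces_between_tags as short


class SpacelessMiddleware(object):
    """Strip whitespace between tags from every HTML response."""

    def process_response(self, request, response):
        # The header usually carries a charset ('text/html; charset=utf-8'),
        # so test for a substring; .get() avoids a KeyError when it is absent.
        if 'text/html' in response.get('Content-Type', ''):
            response.content = short(response.content)
        return response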
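To make clear what the middleware does to a page, here is a small standalone sketch. It assumes the Django helper amounts to a single regular-expression substitution over `>whitespace<`; the `strip_between_tags` function below is a stand-in written for illustration, not the Django source, and the sample markup is invented.

```python
import re


def strip_between_tags(html):
    # Assumed stand-in for django.utils.html.strip_spaces_between_tags:
    # remove runs of whitespace that sit directly between two tags.
    return re.sub(r">\s+<", "><", html)


page = """<ul>
    <li>one</li>
    <li>two  words</li>
</ul>
<pre>
    <span>a</span>
    <span>b</span>
</pre>"""

stripped = strip_between_tags(page)
print(stripped)
```

Whitespace inside text nodes ("two  words") is left alone, but the substitution has no special case for any element, so tag-to-tag whitespace inside a `pre` block (syntax-highlighted code, for example) is removed as well.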

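Note that if you use page-caching middleware, the order matters. Django runs response middleware in reverse order, so listing this middleware after the cache-updating middleware means the page is stripped before it is stored, which also saves space in your cache.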
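As a sketch of that ordering, a settings fragment might look like the following. It assumes the old-style `MIDDLEWARE_CLASSES` setting that matches the middleware class above and Django's split cache middleware; `myproject.middleware` is a hypothetical module path, so substitute wherever you keep the class.

```python
# settings.py fragment (a sketch, not a complete configuration)
MIDDLEWARE_CLASSES = (
    # Response hooks run bottom-to-top, so the cache-updating middleware
    # stores whatever the entries listed below it have already produced.
    'django.middleware.cache.UpdateCacheMiddleware',
    'myproject.middleware.SpacelessMiddleware',  # hypothetical module path
    'django.middleware.common.CommonMiddleware',
    'django.middleware.cache.FetchFromCacheMiddleware',
)
```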

Enjoy!

  • miracle2k

    I've considered this before, but was always unsure about the CPU cost involved.

  • http://yeago.net/ Yeago

    Pretty hot yet simple. Thanks Mistar Cramar.

  • http://sammcd.com Sam McDonald

    I like it a lot. I have been wondering if there was an easy way to do this, and there is. I will definitely have use for this soon.

  • http://zena.centrum.cz/ Honza

    Is that really worth it?

    taking a file:
    index.html – 91565 bytes

    code:
    >>> f = open('index.short', 'w')
    >>> f.write( short( file( 'index.html' ).read() ).encode('utf-8') )
    >>> f.close()

    produces:
    index.short – 69915

    and my favorite – gzip:
    index.gz – 14709
    index.short.gz – 13319

    examples from zena.centrum.cz where we have some nasty whitespace in our HTML…

    Knowing this, I would take mod_deflate in (insert your favorite web server) over any python space stripping anytime…

  • Dan

    I've considered this before, but wondered if there was much benefit over gzip (which I would recommend using even with whitespace stripping). Does a whitespace-removed, gzipped response save enough space to justify the (admittedly reasonable) CPU cost of whitespace-removal?

  • http://www.codekoala.com/ wheaties

    Very interesting thoughts. I'll have to play around with it sometime. Thanks!

  • http://www.davidcramer.net David Cramer

    Does mod_deflate allow removing the extra whitespace? In your example you are correct, that it's not a huge savings for that individual request, and gzip I would highly recommend, but the CPU time is negligible.

  • http://www.manifestdensity.net Tom

    So… every third byte was whitespace? That's a bit hard to believe, even as a worst-case. Still, if you saw improvements, then my congratulations.

    I would suggest, though (as I see others have), that people facing the same problem have a look at simply gzipping their text-based output at the webserver level. Since the action's probably happening on the webhead either way, it's unlikely to be much more costly in CPU terms, and it's even simpler to implement than your solution (it typically just takes a line or two in .htaccess if you're using Apache). Gzipping will certainly remove any overhead that whitespace represents.

  • http://www.davidcramer.net David Cramer

    It's mostly indentation that actually contributed to the whitespace :)

  • Andreas

    Clever trick! Debuggers could always use firebug if they want it indented nice. Everyone should do this + gzip + expires header if plausible

  • http://mikkel.hoegh.org Mikkel Høgh

    Note that Content-Type will usually be 'text/html; charset=utf-8', not just 'text/html' (since Content-Type without a charset is unsafe).

  • http://www.mahner.org/ Martin

    The drawback of this is, that it also steals the whitespace in pre and textarea tags.

  • http://www.davidcramer.net David Cramer

    Have you confirmed this? I believe the whitespace tag is designed to avoid that, but I haven't fully tested it.

  • http://www.mahner.org/ Martin

    Oops, I wasn't aware that there is a real “strip_spaces_between_tags” function. Sorry for the noise :D

  • http://www.mahner.org/ Martin

    I had problems with highlighted (pygments) text in pre-text which results in unindented code *cry*. But that's a very special condition.

  • Roger

    The appropriate place for cached content to be effected would be at the start of the end? :-/

  • Roger

    or the end*

  • http://www.peterbe.com Peter Bengtsson

    I wrote this snippet:
    http://www.djangosnippets.org/snippets/1055/
    to (very) efficiently whitespace optimize inline CSS which might be interesting for you and your Jinja too.

  • truebosko

    Very nice and simple trick. Thanks

  • http://www.covertprestige.net Florent V.

    Hello,

    I'm curious: why outputting whitespace-less HTML when you could just use gzip compression on the server for HTML, XML, CSS and JavaScript, and get better results (like from 300 kB — not kb ;) — to, say, 120 kB)? Did you do some performance testing regarding that?

    Unless gzipping your text output has a clear performance cost, I would keep my HTML code with whitespace (if not perfect, then at least decently readable code). Makes it easier to debug when you need to check the real thing (Firebug only shows what Firefox understands, not what it gets from the server).

    Do you happen to use this method, then gzip (through mod_deflate for instance)? If so, how much do you save compared to a gzipped version with all whitespace intact?

  • Martin

    solving a non-issue with a sub-optimal technique and posting a blag about it… this feels like the type of publicity the rails community is so famous for…

  • http://www.davidcramer.net David Cramer

    Uneducated trolls.. sound like something that every community is famous for.

  • http://www.davidcramer.net David Cramer

    We GZIP as well. Sadly, everyone's still not on broadband today, so even shaving off 5 or 10k from the request can be quite useful (especially when the amount of time it takes to do that is immeasurable).

  • http://ckelly.net/ Chris Kelly

    The whitespace tag is really simple, it just finds the end of one tag and the beginning of another with whitespace in between, and removes said whitespace. It looks like it doesn't have any special cases, so it unfortunately affects textarea and pre tag content.

    see: http://code.djangoproject.com/browser/django/tr...

  • http://www.dougalmatthews.com Dougal Matthews

    Shame it messes up the pre and textarea tags :( was gonna use it otherwise.

    Interesting idea though!

  • Martyn Clement

    I suggest this should only take effect when DEBUG=False.
    Debugging HTML by viewing the page source is quite common.
    That’s what I did ;)

  • lericson

    There are bigger fish to fry.
