User:Brooke Vibber/Wikipedia data size test

From mediawiki.org

Just for kicks, I cleared my cookies & caches and loaded up en:Frog fresh to see what the breakdown of network bandwidth would look like...

Totals

645,947 bytes of data content are transferred, not counting any HTTP headers:

  • 72.5% content images
  • 10% JavaScript code
  • 7.5% style sheets
  • 5.5% HTML web page <- the important stuff
  • 4.5% UI images
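The percentages above can be reproduced from the byte counts given in the sections below. This is only a sketch: the HTML size is not stated anywhere on this page, so it is assumed here to be whatever remains of the total after CSS, JS, and images. The results are close to the listed figures, which look rounded to the nearest half percent.

```python
# Byte counts as reported on this page.
TOTAL = 645_947
CSS = 48_168
JS = 66_274
IMAGES = 495_884      # all images, including UI images
UI_IMAGES = 28_959

# Assumption: HTML is the remainder of the total (not stated on the page).
html = TOTAL - CSS - JS - IMAGES

breakdown = {
    "content images": IMAGES - UI_IMAGES,
    "JavaScript code": JS,
    "style sheets": CSS,
    "HTML web page": html,
    "UI images": UI_IMAGES,
}

def percent(part, whole=TOTAL):
    """Share of the whole, in percent."""
    return 100 * part / whole

for name, size in breakdown.items():
    print(f"{name:16} {size:>8,} bytes  {percent(size):5.1f}%")
```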

Connections must be made to three distinct hosts (en.wikipedia.org, meta.wikimedia.org, and upload.wikimedia.org).

This would take about 90 seconds to download on a 56 kbit/s connection. It's easy for those of us with broadband to forget what low bandwidth feels like, but people outside cities may not have good broadband, and mobile devices are often stuck on pretty slow networks too. Compare regular Wikipedia against our mobile gateway on your mobile phone sometime; even a fancy browser like the iPhone's will feel like molasses trying to load the full site, while the more minimal mobile gateway loads lickety-split.

Fairly simple compression improvements could save 128 KB of that:

  • 64 KB by gzipping JS and CSS files that are currently served uncompressed
  • another 64 KB through smarter compression of thumbnails (animated GIF optimization, use of JPEG for some PNG thumbs)

That would save approximately 18 seconds of download time for our hypothetical low-bandwidth user.
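Both download-time estimates on this page are simple arithmetic, sketched below. The assumptions are mine: 56 kbit/s is taken as 56,000 bits per second, the 128 KB saving is taken as 128 × 1024 bytes, and the link is treated as ideal (no latency, protocol overhead, or modem compression).

```python
def transfer_seconds(num_bytes, bits_per_second=56_000):
    """Ideal transfer time for a payload over a link of the given speed."""
    return num_bytes * 8 / bits_per_second

full_page = transfer_seconds(645_947)     # the whole page load
saved = transfer_seconds(128 * 1024)      # the estimated compression savings

print(f"full page: {full_page:.1f} s, time saved: {saved:.1f} s")
```

Under these assumptions the full page comes out a little over 92 seconds and the saving a little under 19 seconds, in line with the rounded figures in the text.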

HTML


CSS

48,168 bytes of CSS.

GZIP would save 26,715 bytes.


JS

66,274 bytes of JS.

GZIP would save 36,867 bytes.
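The CSS and JS savings above both come to roughly 55% of the uncompressed size. A comparable measurement can be made with Python's standard `gzip` module, as in the sketch below. The sample payload is a hypothetical stand-in for a stylesheet, not the actual Wikipedia CSS, and the compression level is an assumption, so the numbers it prints will not match the figures on this page.

```python
import gzip

def gzip_savings(data: bytes, level: int = 9):
    """Return (original size, gzipped size, bytes saved) for a payload."""
    compressed = gzip.compress(data, compresslevel=level)
    return len(data), len(compressed), len(data) - len(compressed)

# Hypothetical stand-in for a stylesheet; real files would be read from disk.
sample_css = b"".join(
    b"#p-%d .body li { margin: 0; padding: 0 0.5em; font-size: 95%%; }\n" % i
    for i in range(200)
)

original, compressed, saved = gzip_savings(sample_css)
print(f"{original:,} bytes -> {compressed:,} bytes gzipped, saving {saved:,}")
```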


Images

495,884 bytes of images, 28,959 bytes of which consist of global UI components (stylesheet icons, site logo).

Nearly half that total is made up of two inefficiently stored images: a poorly-optimized animated GIF, and a PNG of a scanned drawing.

The animated GIF could be about 20 KB smaller if generated with an upgraded ImageMagick, but animated GIF is simply not a very efficient format. More could be saved by dropping information from the animation, so that frames share more data.

The PNG is a grayscale scan of a textbook drawing, but it is saved as an RGB PNG. Resaving the thumbnail as a grayscale PNG saves 4 KB; saving it as a JPEG instead saves 44 KB.
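One way to spot thumbnails stored like this is to read the color type from the PNG header. The sketch below uses only the standard library and the fixed layout of the IHDR chunk. It reports how the file is stored, not whether the pixels are actually gray, so an "RGB" result on a scanned drawing is only a hint that resaving may help. The function name is my own.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Color type codes defined by the PNG specification.
COLOR_TYPES = {
    0: "grayscale",
    2: "RGB",
    3: "palette",
    4: "grayscale+alpha",
    6: "RGBA",
}

def png_color_type(data: bytes):
    """Return (color type name, bit depth) read from a PNG's IHDR chunk."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    # IHDR must be the first chunk: 4-byte length, then the 4-byte type.
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("missing or malformed IHDR chunk")
    # Width and height are read only to step past them.
    _width, _height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    return COLOR_TYPES.get(color_type, "unknown"), bit_depth
```

Only the first 26 bytes of a file are needed, so a batch of thumbnails can be checked without decoding any image data.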

More efficient compression of the top two images could save about 64 KB.