Reading/Web/Projects/Performance/Lazy loading images

Hypothesis
Articles contain many images which, when loaded, can open multiple HTTP connections that compete with one another, slowing down the downloading of CSS and HTML. Many of these images are never shown to the user. If we lazy load images on demand, we will cut the size in bytes of the page served to the end user and speed up the downloading and rendering of content.

Progress
Lazy loading images was enabled on the production beta channel on the 24th March. As an example, this led to the overall bytes served to users for images on the Barack Obama article dropping from 657kb to 169kb.

On May 9th 2016 (4pm PST) we rolled out lazy loading of images to all users of Bengali Wikipedia. The purpose was to collect data on page views and global NavigationTiming results from a single project to guide a future rollout.

On June 23rd 2016 (4pm PST) we deployed lazy loading of images to all users of Farsi and Ukrainian Wikipedias (two medium sized wikis) with the hope of benefiting from the larger audience of those sites and to give more opportunities for feedback.

Predicting impact on stable with static files
Static files emulating the lazy loaded images experience vs the current experience were uploaded to the cluster.

First view load was 38.541s for the current experience of the Barack Obama page on a 2G browser. With images lazily loaded under the same conditions first view load was 25.210s - a saving of around 13s and a 35% speed improvement.

Interestingly, our own webpagetest jobs simulating a 2G connection record fully loaded time as already being around the 25s mark. When run here the first load time was 22.787s for the current experience and 18.942s with lazy loaded images - a saving of around 4s - a 17% speed improvement.
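
The savings quoted for the two test setups can be re-derived from the raw timings. The following is a minimal sketch of that arithmetic; the function name and the run labels are illustrative and not part of the original analysis.

```python
def saving(before_s: float, after_s: float) -> tuple[float, float]:
    """Return (seconds saved, percentage saved relative to the 'before' time)."""
    saved = before_s - after_s
    return saved, 100.0 * saved / before_s


# First view load times quoted above, in seconds: (current, lazy loaded).
runs = {
    "2G browser profile": (38.541, 25.210),
    "own webpagetest 2G job": (22.787, 18.942),
}

for label, (before_s, after_s) in runs.items():
    saved, pct = saving(before_s, after_s)
    print(f"{label}: {saved:.3f}s saved ({pct:.1f}%)")
```

This gives about 13.3s (34.6%) for the first setup and about 3.8s (16.9%) for the second, consistent with the rounded figures above.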

The simulated speeds are supposed to be the same, so it is not yet clear which result more accurately indicates how this change will benefit 2G users. This is being investigated.

Over the last month, the daily 95th percentile for fully loaded time across the site ranged from 11.5s to 15s. This is somewhere between a 3G Fast and a 3G Slow connection. If our visitor speed profile does not change when we push to stable, we would hope to see it range between 7s and 11s. However, given the large difference on 2G before and after, it is possible that we might instead see it double if the change leads to us obtaining more samples from users on slower profiles.

Over the same period, the daily 95th percentile for first paint across the site ranged from 4.4s to 5.8s. Again this seems to match a 3G profile. When pushing to production we would hope to see this shift to around 3s, although it could leap to around 13s if new samples are collected.

Lazy loading images in production beta
There was a small positive impact on the 95th percentile of beta anonymous and authenticated users in our global metrics.

Impact was less clear in our controlled tests on Barack Obama.

There are many problems with using beta to predict performance impact in production. It should not be used for this purpose:
 * Beta has a split cache compared to production so is generally not a reliable environment to predict performance changes - pages often load from an unpopulated cache.
 * In beta there is a banner experiment running which greatly increases bytes loaded by introducing an additional image at the top of the page. This does not run in production.
 * The 95th percentile of beta users has a fully loaded time of 4s; in production it is closer to 11s. This suggests beta users are more likely to use a much faster connection than the types of connection we are targeting in production.

Lazy loaded images on Ukrainian, Farsi, and Bengali mobile web Wikipedia
Rollout of lazy loaded images on fa.m.wikipedia.org and uk.m.wikipedia.org suggests a nontrivial speed improvement in page fully loaded time (initial lag excluded) and a significant reduction in image bytes shipped per pageview, leading to lighter weight pages.

Rollout of lazy loaded images on bn.m.wikipedia.org suggests a nontrivial speed improvement in page fully loaded time (initial lag excluded) on HTTP1, likely no speed improvement on HTTP2, and again a significant reduction in image bytes shipped per pageview, leading to lighter weight pages.

Note that the x-axis in the speed graphs is presented in log2(n) increments.

To reduce noise, navigation timing events are filtered to ensure key fields are present, with an emphasis on anonymous, non-redirected, plain article pageviews.
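
As a rough illustration of this filtering, followed by the per-protocol percentile comparison used in the sections below, here is a minimal sketch. All field names (`protocol`, `load_time_ms`, `is_anon`, `is_redirect`, `is_article`) are hypothetical; the real event schema and the actual queries are not reproduced in this document.

```python
from statistics import quantiles

# Hypothetical field names; the real event schema is not described here.
REQUIRED_FIELDS = ("protocol", "load_time_ms")


def is_eligible(event: dict) -> bool:
    """Keep anonymous, non-redirected, plain article pageviews with key fields present."""
    if any(event.get(field) is None for field in REQUIRED_FIELDS):
        return False
    return (
        event.get("is_anon") is True
        and not event.get("is_redirect", False)
        and event.get("is_article") is True
    )


def percentiles_by_protocol(events: list[dict]) -> dict[str, tuple[float, float, float]]:
    """Return the 10th, 50th and 90th percentile load time for each protocol."""
    grouped: dict[str, list[float]] = {}
    for event in filter(is_eligible, events):
        grouped.setdefault(event["protocol"], []).append(event["load_time_ms"])

    result = {}
    for protocol, times in grouped.items():
        if len(times) < 2:
            continue  # statistics.quantiles needs at least two data points
        deciles = quantiles(times, n=10, method="inclusive")  # nine cut points
        result[protocol] = (deciles[0], deciles[4], deciles[8])
    return result
```

Running `percentiles_by_protocol` once on the events from the week before the change and once on the week after would give the two sets of percentiles to compare for each protocol.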

To simplify analysis for the bytes shipped calculations, eligible source HTML pages are constrained to a relative path  (on language variant wikis this path would be different, but that is out of scope here) for a given wiki, without a colon ":" character in the remainder of the path. Responses must also be HTTP 200s, to avoid overcounting 30x, 40x, or other such spurious responses. This helps narrow the analysis to requests likely to be plain article pageviews. Image bytes are constrained to those served from upload.wikimedia.org with a Referer that meets the same restrictions as the page paths.

Changes introduced to support lazy loaded images required modified (increased) JavaScript/CSS/HTML.

ukwiki

Based on back-to-back weeks just before and just after lazy loaded image implementation, pages with lazy loaded images loaded faster at the 10th, 50th (median), and 90th percentiles on both HTTP1 and HTTP2.

Based on examination of two 3-day periods (3-5 May 2016 vs 28-30 May 2016), image bytes per pageview were reduced by about 44.82%. This contributed to a decrease of about 16.05% in bytes shipped for the modified JavaScript/CSS/HTML plus images as compared to the baseline JavaScript/CSS/HTML and images.
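
The combined decrease is smaller than the image decrease because images are only part of the bytes shipped and the JavaScript/CSS/HTML grew to support lazy loading. The sketch below shows how the two percentages relate. The byte counts are made-up round numbers for illustration only, not measured values.

```python
def pct_reduction(before: float, after: float) -> float:
    """Percentage by which 'after' is smaller than 'before'."""
    return 100.0 * (before - after) / before


def combined_reduction(image_before: float, image_after: float,
                       other_before: float, other_after: float) -> float:
    """Percentage change in images plus JavaScript/CSS/HTML bytes per pageview."""
    return pct_reduction(image_before + other_before, image_after + other_after)


# Made-up per-pageview byte counts, for illustration only.
image_before, image_after = 100_000, 60_000   # images shrink by 40%
other_before, other_after = 100_000, 110_000  # lazy loading support adds JS/CSS/HTML

print(pct_reduction(image_before, image_after))
print(combined_reduction(image_before, image_after, other_before, other_after))
```

With these illustrative inputs, a 40% image reduction becomes a 15% reduction overall, the same pattern as the measured figures above.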

fawiki

Based on back-to-back weeks just before and just after lazy loaded image implementation, pages with lazy loaded images loaded faster at the 10th, 50th (median), and 90th percentiles on both HTTP1 and HTTP2.

Based on examination of two 3-day periods (3-5 May 2016 vs 28-30 May 2016), image bytes per pageview were reduced by about 39.55%. This contributed to a decrease of about 16.79% in bytes shipped for the modified JavaScript/CSS/HTML plus images as compared to the baseline JavaScript/CSS/HTML and images.

bnwiki

Navigation timing data for bnwiki were sparse, making analysis difficult.

Based on data from 5-11 May 2016, slightly before the initial lazy loaded images went into force, and from 23-29 June 2016, the latest Thursday to Wednesday week with the most up-to-date loading technique (the same week used as the latter week for ukwiki and fawiki in the analysis above), pages with lazy loaded images loaded faster at the 10th, 50th (median), and 90th percentiles on HTTP1. On HTTP2, however, pages with non-lazy loaded images loaded faster at the 10th, 50th (median), and 90th percentiles.

A certain level of chaos in the events is evident in the line chart.

Comparison of 14 April - 11 May 2016 (prior to lazy loaded images) versus 2-29 June 2016 (well after lazy loaded images were implemented) paints a slightly more complete picture, at the expense of more general time-based trends potentially complicating the data.

Taking the data at face value, it again appears that HTTP1 lazy loaded pages loaded faster at the 10th, 50th (median), and 90th percentiles. On HTTP2 lazy loaded pages loaded faster at the 90th percentile, but slower at the 10th and 50th (median) percentiles.

Something interesting occurred with bnwiki. The share of HTTP1 traffic was considerably greater with lazy loaded images (72.94%) than without them (61.13%) in the 4 week window comparison. The same trend was observed with comparison windows of varying lengths closer to the switchover. This suggests that lazy loaded images may have had a larger impact on the relatively slower connections for bnwiki (twice as slow at the median prior to the change), many originating from Bangladesh.

Data transfer comparisons were more straightforward. Based on examination of two 3-day periods (3-5 May 2016 vs 28-30 May 2016), image bytes per pageview were reduced by about 40.24%. This contributed to a decrease of about 22.73% in bytes shipped for the modified JavaScript/CSS/HTML plus images as compared to the baseline JavaScript/CSS/HTML and images.

Data Transfer

Queries

The following query was used to derive the lag-excluded load time, roughLoadTimeInitialLagExcluded.

The following query was used to derive image bytes transferred using the constraints described above.

The following query was used to derive page and JavaScript/CSS bytes (pre- and post-modification for lazy loading) transferred using the constraints described above.

The following query was used to derive pageviews using the constraints above. Practically all matching records qualified as pageviews, largely ruling out the possibility of image byte transfer counts with proper Referer values being derived from anything other than qualified pageviews.

Caveats
As with any data spanning time series and the myriad complexities of different devices and environments, the data are subject to fluctuation. However, the data transfer savings are unambiguous, and the larger event sampling pool for ukwiki and fawiki lends a degree of confidence that pages are actually loading faster.

Next steps
The next step would be to deploy to Japanese Wikipedia.

The goal is to do an A/B test to more confidently show the impact of this change.