Reading/Web/Projects/Performance/Removal of secondary content in production

Hypothesis
Certain content doesn't necessarily need to be shipped to the user upfront and sometimes not at all. A good example is the navbox content (this content also is not optimised for mobile but that's a secondary concern and out of scope for this test). We can remove this HTML from the initial page load and lazy load it if and when needed. Before introducing the necessary APIs for lazy loading such content, we wanted to gauge how impactful the removal was.

Previous experiments showed that this made little impact on performance, but it is unclear whether those results reflect our global audience. Currently our 2G tests run from Dulles (near Washington, D.C., on the East Coast of the USA), which is much closer to our data centers than, for example, a country like Indonesia. It is thus not clear whether the WebPageTest data we are collecting is a good indication of our global traffic. To understand whether reducing HTML size can make any impact on performance, we'd need to view global traffic, specifically the navigation timing reports we collect from real end users.

Prediction
Based on previous experiments, removing navboxes for a quality page such as Barack Obama should:
 * reduce the number of bytes we ship to users
 * make little to no difference to the fully loaded time
 * increase the time to first byte (TTFB) from a clear cache, due to the time needed for the MobileFormatter to transform the parser output
 * make no difference to first render

There is potential for:
 * an impact on global total page load time
 * an impact on bytes out in the cluster summary for text cache eqiad
 * an impact on global page view traffic due to more engaged visitors

Method
A config change was made on Wednesday 16th March at around 00:11 PST (week 11 of the year) to strip navboxes and content not designed for display on mobile (the nomobile class).
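To illustrate what stripping this content means for the HTML payload, here is a minimal sketch in Python. It is not the MobileFormatter implementation; the class name SecondaryContentStripper, the function strip_secondary_content and the sample markup are hypothetical, and the sketch assumes well-formed HTML fragments in which every non-void element is explicitly closed.

```python
from html.parser import HTMLParser

# Classes treated as secondary content (taken from the description above).
STRIP_CLASSES = {"navbox", "nomobile"}
# Void elements have no closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class SecondaryContentStripper(HTMLParser):
    """Hypothetical helper: re-emits HTML while dropping stripped elements."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # > 0 while inside an element being stripped

    @staticmethod
    def _is_secondary(attrs):
        classes = set((dict(attrs).get("class") or "").split())
        return bool(classes & STRIP_CLASSES)

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth += 1
        elif self._is_secondary(attrs):
            if tag not in VOID_TAGS:
                self.skip_depth = 1
        else:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never open a nesting level.
        if not self.skip_depth and not self._is_secondary(attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")


def strip_secondary_content(html: str) -> str:
    parser = SecondaryContentStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


# Made-up sample fragment, for illustration only.
sample = ('<p>Intro</p><div class="navbox"><div><a href="#">Link</a></div></div>'
          '<span class="nomobile">x</span><p>End &amp; done</p>')
stripped = strip_secondary_content(sample)
bytes_saved = len(sample.encode("utf-8")) - len(stripped.encode("utf-8"))
print(stripped, bytes_saved)
```

Comparing the encoded length before and after gives a rough, uncompressed estimate of the bytes saved for a single page; the figure on the wire will differ once compression is applied.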

A period of waiting time was left to account for cached pages being updated to respect the new setting and allow data to be collected.

Given the 30 day cache on Wikipedia, it was possible that results would not be visible until at least a month had passed. With that in mind, these results are live and will update as more data becomes available.

Using the webpage test reporter tool and data in Graphite, we were able to quickly get an idea of the impact on fully loaded and first render time for the Barack Obama page, comparing the 13 days prior to the change with the 11 days after it.

We looked at the 95th percentile of global total page load time before and after the change for anonymous users using the command:

We didn't look at beta, given that other experiments were running there that would impact results.

To be more confident of the data we were seeing we also analyzed the raw data in the NavigationTiming tables as collected by EventLogging.
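The before/after comparison of the 95th percentile can be sketched as follows. This is an illustration only, not the query we ran: the function names, the nearest-rank percentile definition, and the sample values are all assumptions, and real NavigationTiming rows would first need to be filtered to anonymous mobile traffic.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct% of all samples are less than or equal to it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def p95_before_after(samples, cutoff):
    """samples: iterable of (timestamp, load_time_ms) pairs.
    cutoff: timestamp of the config change, in the same units."""
    samples = list(samples)
    before = [ms for ts, ms in samples if ts < cutoff]
    after = [ms for ts, ms in samples if ts >= cutoff]
    return percentile(before, 95), percentile(after, 95)


# Made-up data: timestamps are day offsets relative to the change (0).
hypothetical = [(-1, 100 * i) for i in range(1, 21)] + \
               [(1, 90 * i) for i in range(1, 21)]
p95_before, p95_after = p95_before_after(hypothetical, cutoff=0)
print(p95_before, p95_after)
```

The nearest-rank definition is chosen here because it always returns an observed sample; other tools may interpolate between samples and report slightly different values.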

The impact on the bytes sent to users was monitored, but because the graphs combine desktop and mobile data, and desktop traffic accounts for 50% of our page views, we expected it to be difficult to isolate any impact there.

Analysis
As expected, after 17 days of analysis a positive impact could be seen on fully loaded time for both the Barack Obama article and global traffic, but it was not substantial. That said, the upper value of fully loaded time dropped considerably, which suggests that there is traffic on connections far slower than our simulated 2G connection that is hopefully benefiting from this change.

The impact on bytes was clear to see on the Barack Obama article but we were unable to get any sense of impact on the global text cache eqiad.

No unusual spikes in page view traffic were observed, which is to be expected given the low impact on fully loaded time at the 95th percentile.