Reading/Web/Projects/Performance/Stripping references from page in beta cluster

Hypothesis
Certain content doesn't necessarily need to be shipped to the user upfront, and sometimes not at all. A good example is the list of references. If a mobile user never clicks on a superscript reference link or loads the references section, then they make no use of the HTML required to generate it. We can thus remove this HTML from the initial page load and lazy load it if and when needed.

Prediction
Previous experiments on the Barack Obama article had shown that removing references had a significant impact on fully loaded time, at the cost of a small increase in TTFB. First render was unlikely to be impacted by such a change.

Method
MobileFrontend has a library called MobileFormatter, which extends the HtmlFormatter in core. We used this to strip any elements in the HTML with the class references. In various good-quality articles the size of this HTML is significant, e.g. it accounts for 50% of all HTML in the Barack Obama article.
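To illustrate the stripping step, here is a minimal Python sketch. It is not the MobileFormatter code (which is PHP) and it does not use the HtmlFormatter API; the names `ReferenceStripper`, `strip_references` and `reference_share` are hypothetical. It assumes well-formed HTML in which every non-void element is explicitly closed, and it drops comments and declarations for brevity.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ReferenceStripper(HTMLParser):
    """Re-emits tags, text and entities, skipping elements with a given class."""

    def __init__(self, class_name="references"):
        super().__init__(convert_charrefs=False)
        self.class_name = class_name
        self.kept = []   # pieces of markup that survive
        self.depth = 0   # > 0 while inside a stripped element

    def _matches(self, attrs):
        return self.class_name in (dict(attrs).get("class") or "").split()

    def handle_starttag(self, tag, attrs):
        if self.depth or self._matches(attrs):
            if tag not in VOID_TAGS:
                self.depth += 1
            return
        self.kept.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never change the nesting depth.
        if not self.depth and not self._matches(attrs):
            self.kept.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.depth:
            if tag not in VOID_TAGS:
                self.depth -= 1
            return
        self.kept.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.depth:
            self.kept.append(data)

    def handle_entityref(self, name):
        if not self.depth:
            self.kept.append(f"&{name};")

    def handle_charref(self, name):
        if not self.depth:
            self.kept.append(f"&#{name};")


def strip_references(html, class_name="references"):
    """Return html without any element carrying class_name."""
    parser = ReferenceStripper(class_name)
    parser.feed(html)
    parser.close()
    return "".join(parser.kept)


def reference_share(html, class_name="references"):
    """Fraction of the page's bytes taken up by the stripped elements."""
    total = len(html.encode("utf-8"))
    remaining = len(strip_references(html, class_name).encode("utf-8"))
    return (total - remaining) / total
```

Calling `reference_share` on a saved copy of an article's HTML gives a rough estimate of how much of the payload the references list accounts for.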

Due to a performance-related change that went out the same day, which stripped srcset attributes from image tags in the page, we had to establish a new baseline. The configuration on the beta cluster was first updated to remove references. Later the change was reverted to retain the references list.

A script was used to calculate the median and average of the measured values for a specified article (Barack Obama) on an emulated 2G connection, once over a 5-day period before the revert and once over a period of the same length after the revert.
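The comparison described above can be sketched in Python as follows. This is not the script used in the experiment; the function names and the sample values in the usage note are hypothetical and serve only to show the calculation and its sign convention.

```python
from statistics import mean, median


def summarize(samples):
    """Median and average of one metric (e.g. fully loaded time in seconds)."""
    return {"median": median(samples), "average": mean(samples)}


def percentage_decrease(before, after):
    """Percentage by which a metric fell from `before` to `after`.

    A negative result means the metric increased.
    """
    return (before - after) / before * 100


def compare(before_samples, after_samples):
    """Percentage decrease of the median and the average across the revert."""
    before = summarize(before_samples)
    after = summarize(after_samples)
    return {name: percentage_decrease(before[name], after[name])
            for name in before}
```

For example, with made-up fully loaded times of `[20.0, 22.0, 21.0]` seconds before the revert (references stripped) and `[24.0, 25.0, 23.0]` seconds after it (references restored), `compare` reports a negative percentage decrease for both the median and the average, meaning the page became slower once the references were added back.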

The commands used to measure the impact of the change were:

Results
The re-addition of the HTML for references seemed to worsen performance. Note that, given the change we are measuring is the re-addition of references, a negative percentage decrease is a positive result. Fully loaded time was better without references, as you might expect, but TTFB and render time were not impacted. Savings in bytes were high. The impact in beta was much more noticeable but followed the same trend. During the experiment, the best fully loaded time we saw in stable was 18.91s.

Analysis
The improvements in fully loaded time were rather small, but for users who do not view references at all, the change still provides a big saving in bytes.

Conclusions
Removing references from the HTML yields unmistakable byte savings but does not look like it will improve time to first render.

The impact on fully loaded time is positive and, although not large, gets us closer to the 15-second mark for fully loading Barack Obama and other large pages.

Beta, even when run alongside stable, does not seem to correlate with stable with regard to fully loaded time.

Next steps
We should aim to lazy load references from stable.

First, we need to verify that this does not impact anything else in the cluster, given the additional storage it requires.