Wikimedia Performance Team/Page load performance

This document describes which aspects of a web page impact page load performance, and how changes can influence those aspects. This was originally drafted in February 2016 (T127328) and was revised in October 2018.

Metrics
We primarily focus on the following three metrics:


 * Visual rendering: Most of the page should render to completion as quickly as possible. Observed through the Paint Timing API (first-contentful-paint).
 * Total page load time: The load indicator in web browsers. This waits for the download and rendering of the HTML document, and the download and initial execution of all subresources (JavaScript, CSS, images). Observed through the Navigation Timing API (loadEventEnd).
 * Responsiveness: The page should be able to respond to user interaction at all times. There should not be background or periodic code execution that causes lag or freezing of the main thread for prolonged periods of time. Observed through the Long Tasks API.

Principles
Performance principles, in order of their importance:


 * 1) User. (Perceived performance and overall user experience. This includes backend latency response times.)
 * 2) Developer. (Developer productivity; ease of learning, maintaining, and debugging.)
 * 3) Server efficiency. (Such as disk space, memory usage, CPU load, number of servers, etc.)

Shipping frontend assets
Deliver CSS and JavaScript fast (bundled, minified, and avoiding duplication) while retaining the benefits of caching. This is all done for you by ResourceLoader.

ResourceLoader is the delivery system for bundling and loading CSS/JavaScript files in MediaWiki.


 * Learn how to register module bundles with ResourceLoader.
 * Learn how it works (high-level architecture).
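
For illustration, module bundles for an extension are registered under ResourceModules in extension.json. The module name and file paths below are hypothetical; see the ResourceLoader documentation for the full set of options:

```json
{
	"ResourceModules": {
		"ext.myFeature": {
			"localBasePath": "resources/ext.myFeature",
			"remoteExtPath": "MyFeature/resources/ext.myFeature",
			"scripts": [ "init.js" ],
			"styles": [ "styles.less" ],
			"dependencies": [ "mediawiki.util" ]
		}
	}
}
```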

General principles
When loading a module, it must not affect the initial rendering of the page, particularly "above the fold" (the top portion of the page that's initially visible on the user's screen). Load little to no JavaScript code upfront. Make the most of page and user metadata on the server side through conditional loading. Anticipate whether a specific user on a specific page needs a given module to do something. See loading modules for more information.

Users should have a smooth experience; interface components should render progressively. Preserve positioning of elements (e.g. avoid pushing down content in a reflow).

Familiarise yourself with Compatibility policy and the Architecture Principles. Our architecture is modelled after the web itself. Every page load starts in "Basic" mode, where only the HTML is rendered. Assistive technology needs to understand the information and structure based on the semantics alone. CSS can be assumed to succeed for visual design and should be used for presentation and to convey visual meaning only (though keep in mind that stylesheets degrade well in Grade C browsers). We aim to be fairly aggressive in raising Grade A requirements to modern browsers, which reduces maintenance cost and payload overhead. This aim is only achievable when components start out with a solid and functional base experience, with server-rendered access to information, and traditional request-response cycles for contributing to the wiki.

Embrace that every page load starts in Basic, and that the Grade A JavaScript layer is an optional one that may or may not arrive, depending on numerous factors: device capability, network stability (JS times out or doesn't arrive before the user leaves the page), browser choice, interference from browser extensions, and user preferences (e.g. noscript); offline re-use of our content, such as Kiwix, Archive.org, archive.today, and IPFS; and external re-use without skin or scripts, such as Safari Reader mode, Tor browser, Apple Dictionary/Lookup, Apple Siri, and third-party mobile apps for Wikipedia. If you render or visualise information client-side only, it is de facto inaccessible to these environments (as learned in 2018 from the Graph extension). Also consider how information and graphs are crawled by search engines. If it doesn't have a URL, it's probably not crawlable, searchable, or sharable.

HTTP caching
Improving the cacheability of responses to web requests used in the critical path is expected to have the following impact:


 * First views: None.
 * Repeat views:
  * For any resource:
   * Consume less bandwidth (reduces mobile data costs).
   * Consume less power (fewer cell-radio activations required).
  * For CSS files:
   * Reduce time for visual rendering. Cached stylesheets load faster without a network roundtrip, allowing rendering to start and/or complete sooner.
   * Reduce time to domComplete and loadEventEnd metrics. Stylesheets are subresources required for DOM completion.
  * For JavaScript files:
   * Reduce "Time to Interactive". Cached scripts take less time to load, parse, and compile. The browser may store the compiled bytecode from previous page views. As of writing (Dec 2019), the bottleneck in loading JavaScript code is often not the download or execution, but the parsing/compilation. Allowing this to be cached, or reducing the amount of code, can benefit the user more than optimising how fast it can execute.

Latency
Reduce the time it takes for the browser to receive the response to a request it makes.

How:


 * Making the response complete sooner. For example:
  * … by improving backend response times on the server,
  * … by allowing the response to be cached by the CDN in a nearby datacenter,
  * … by reducing the size of the response.
 * Making the request start earlier. For example:
  * … by making requests in parallel instead of serially, one after the other.
   * If the same code is in charge of multiple requests, a combinator such as Promise.all (or jQuery's $.when) can be used to make parallel requests.
   * If multiple pieces of code are in charge of their own requests, consider returning a Promise and letting the other code proceed immediately to make its own request. Then, only once you truly need the data from the first Promise (or to wait for its response), call its then method. You can also use Promise.all to return a Promise that resolves once multiple other Promises have settled.
  * … by hinting the browser directly about your intentions. This has the benefit of not needing any changes to your code! By the time your HTML, CSS, or JS makes a related request, the result of these hints will automatically be used.
   * dns-prefetch (MDN, web.dev)
   * preconnect (web.dev 1, web.dev 2, andydavies)
   * preload (MDN, web.dev 1, web.dev 2) – Beware of "When not to preload"!
  * … by making the request invisible with stale-while-revalidate. If you let the CDN and browsers cache something for 24 hours, the first request after 24 hours will delay the client while we wait for this cache-miss response. By setting stale-while-revalidate you can allow the browser to make use of the cache one last time, while in the background the browser fetches the new value to use next time. For example, you could allow one stale response for up to 7 days. Or, if it must be within 24 hours, you could shorten the regular cache period to 12 hours and then allow 12 hours of stale responses, adding up to 24 hours.
   * web.dev: stale-while-revalidate
   * MDN: Cache-Control#stale-while-revalidate
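
A sketch of the parallel-request pattern described above, using plain Promises. The fetchPageData and fetchUserPrefs functions are hypothetical stand-ins for two independent API requests:

```javascript
// Hypothetical stand-ins for two independent API requests.
function fetchPageData() {
  return Promise.resolve({ title: 'Example' });
}
function fetchUserPrefs() {
  return Promise.resolve({ skin: 'vector' });
}

// Serial (slower): the second request only starts after the first resolves.
// fetchPageData().then(function () { return fetchUserPrefs(); });

// Parallel (faster): both requests start immediately; await both results.
function loadEverything() {
  return Promise.all([ fetchPageData(), fetchUserPrefs() ]).then(function (results) {
    return { page: results[0], prefs: results[1] };
  });
}
```

The same shape works with jQuery Deferreds via $.when in older MediaWiki code.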

Impact on first views and repeat views:


 * Reduce time overall (for the feature to load, or for the action to be performed).
 * Consume less power (making better use of the network means less idle time and thus fewer cell-radio activations required).
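
The stale-while-revalidate strategy described above can be expressed as a Cache-Control response header. The values here are illustrative:

```
Cache-Control: public, max-age=43200, stale-while-revalidate=43200
```

With this header, a response is fresh for 12 hours and may then be served stale for another 12 hours while the browser revalidates in the background, adding up to the 24-hour window.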

Preloading
Avoid using preload links or headers on the main HTML navigation response, as this can cause congestion and competition with the HTML resource itself, as well as with the critical CSS resources needed for initial rendering. In MediaWiki these resources are already linked from the HTML head, and discovered and pre-fetched as soon as possible by the browser's lookahead parser.

To start the download of late-discovered resources earlier than usual, consider preloading closer to the time they are needed to avoid this competition.

How to:


 * Dynamically create a preload link element in JavaScript, based on a user interaction toward the feature/resource.
 * Emit Link: rel=preload headers on a resource that sits between the HTML and the resource in question.
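
The first option corresponds to markup along these lines (the file path is hypothetical):

```html
<link rel="preload" href="/w/resources/widget-icons.svg" as="image">
```

Creating this element from JavaScript on, say, a hover or focus event gives the browser a head start on the download before the feature actually needs the resource.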

For example, in MediaWiki we generally display the logo as a CSS background image on a particular element. The image is specified by a stylesheet. When the browser parses and applies this stylesheet, it ignores the URL because the CSS rule in question does not apply yet. (Rules are only applied if the rule matches the state of an element in the DOM, and when the browser first renders an article it typically has not yet reached the HTML of the sidebar.)

The "natural" point where the browser would start the download of the logo file is when that sidebar HTML has been reached and rendered to the user (first without the logo). To optimise this, the skin stylesheet emits a Link: rel=preload header that hints the browser to start the download immediately, because we know we are going to need it soon, even if the browser hasn't reached that part of the HTML stream yet.
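
That hint is an HTTP response header on the stylesheet response, along these lines (the logo path is illustrative):

```
Link: </static/images/project-logo.svg>;rel=preload;as=image
```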

Size of HTML payload
The first 14 KB (per TCP slow-start) should ideally contain everything needed to render the skin layout and a little bit of the article text. It should also allow the browser to render that initial layout in a way that isn't later moved around or otherwise invalidated (additional components may appear later, but existing components should not move).

Examples of how to improve this:


 * Reduce the amount of per-page header bloat, e.g. link and meta tags, RLQ, mw.config, mw.loader.
 * Better minification for inline CSS, inline JavaScript, and HTML within the head.
 * Arranging the HTML to ensure layout and start of content render first.

See also T231168 for on-going work in this area.

Size of stylesheets
Reducing the size of the main (render-blocking) stylesheet loaded by the HTML is expected to have the following impact:


 * All views (first view and repeat views):
  * Improvement of all paint metrics. Smaller stylesheets load and parse faster, allowing rendering to start and complete sooner.
  * Reduce time to domComplete and loadEventEnd metrics. Stylesheets are subresources required for DOM completion.
  * Consume less bandwidth (reduces mobile data costs).
  * Consume less power (reduce CPU time for stylesheet parsing, especially data URIs; see T121730 for on-going work).

Size of scripts
Reducing the size of scripts is expected to have the following impact on all views (first view and repeat views). This does not apply to scripts that are lazy-loaded from a user interaction after document-ready.
 * Reduce "Time to Interactive". Less code to download, parse, and execute.
 * Reduce time to domComplete and loadEventEnd metrics. These scripts are subresources that are part of DOM completion.
 * Consume less bandwidth (reduces mobile data costs).

Scripts run in one of three overall phases. Each phase depends on the outcome of the previous one (they run serially). Earlier phases have a bigger impact when reduced in cost, compared to later phases, because they allow subsequent phases to start their work sooner.
 * 1) Inline scripts in the HTML head (including page configuration, the page module queue, and the async request to the Startup module).
 * 2) The Startup manifest (including module manifest, dependency tree, and the mw.loader client).
 * 3) The page modules (the source code of modules loaded on the current page, and their dependencies).

Size of startup manifest
Reduce the amount of code contained in the Startup module by keeping the number of distinct module bundles low. In general, additional scripts should be added to an existing module bundle instead of creating new ones (see blog post and Grafana).

To help quantify the cost of modules, and to help with code maintenance more generally, we organize frontend assets in subdirectories by module bundle. See also code conventions, Best practices for extensions, and T193826.

Processing cost of scripts
Reduce the amount of time spent executing JavaScript code during the critical path. This includes:


 * Deferring work that does not need to happen before rendering.


 * Splitting up work in smaller idle-time chunks to avoid blocking the main thread for too long (non-interactive "jank").


 * Re-arranging code so that there are no style reads after style writes in the same event loop. The browser naturally alternates between a cycle of JS execution and a cycle of style computation and rendering. If styles are changed in the main JS cycle and then read back within that same cycle, the browser is forced to pause the script and perform an ad-hoc render cycle before being able to resume the script. See also:
 * wilsonpage/fastdom (github.com)
 * Use requestAnimationFrame for visual changes (Web Fundamentals, Google Developers)
 * Avoid forced synchronous layouts (Web Fundamentals, Google Developers)
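
A minimal sketch of the read/write batching idea, in the spirit of fastdom but not its actual API: queue all style reads ("measures") and style writes ("mutates"), then flush reads before writes so that no read ever follows a write within the same cycle.

```javascript
// Minimal read/write scheduler: batches "measure" (read) callbacks
// before "mutate" (write) callbacks, so the browser never has to
// recompute layout mid-script for a read that follows a write.
var reads = [];
var writes = [];

function measure(fn) { reads.push(fn); }
function mutate(fn) { writes.push(fn); }

// In a browser this flush would be scheduled via requestAnimationFrame;
// here it is called manually to keep the sketch self-contained.
function flush() {
  var fn;
  while ((fn = reads.shift())) fn();   // all reads first...
  while ((fn = writes.shift())) fn();  // ...then all writes
}

// Usage: interleaved calls still execute all reads before any write.
var order = [];
measure(function () { order.push('read A'); });
mutate(function () { order.push('write A'); });
measure(function () { order.push('read B'); });
flush();
```

After flush(), order is ['read A', 'read B', 'write A']: the second read was moved ahead of the write, avoiding a forced synchronous layout between them.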

Expected impact on all views:


 * Reduce "Time to Interactive". Less execution before the page is ready; less uninterrupted execution which blocks interactions; fewer "forced synchronous layouts", which significantly slow down code execution.
 * Reduce time to loadEventEnd metrics. The finishing of the initial scripts' execution holds back the load event.