ResourceLoader/Architecture

This page describes the features of ResourceLoader in depth. Together they create the environment that is ResourceLoader, which aims to make developers, servers and users alike happy.

Check Presentations for videos and slides about these features.

Modules
ResourceLoader works with a concept of modules. A module is a group of resources identified by a symbolic name. A module can contain any of the following types of resources:
 * Scripts
 * Styles
 * Messages

Aside from that, a module may have several properties:
 * Dependencies
 * Group

All in all, this makes it possible to get a module by just using its name (instead of listing out all the resources and/or dependencies etc.).

Modules are delivered to the client as a bundle, with all the resources they contain in a single request. More about this follows in the resource sections. The package is unboxed by the Client.

Wrapping
When one or more module packages are sent to the client, the script components are wrapped in a special closure, so they are not executed immediately when the browser parses the response. Instead, the closure is passed to the ResourceLoader client. This allows the client to control the order in which scripts execute according to the dependency tree (which the client has in memory), independently of the order in which they arrive from the server. See also the Client section for more about the loading procedure and a walkthrough of a simple and a more advanced example scenario.
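
The deferred execution described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual ResourceLoader client: the names `createClient` and `implement` and the shape of the dependency map are hypothetical and chosen only for this example.

```javascript
// Minimal sketch of deferred execution. All names here are hypothetical.
// `dependencyMap` maps a module name to the names of the modules it depends on.
function createClient(dependencyMap) {
  const pending = new Map(); // closures that have arrived but have not run yet
  const executed = new Set(); // names of modules that have already run

  function isReady(name) {
    return (dependencyMap[name] || []).every((dep) => executed.has(dep));
  }

  // Run every pending closure whose dependencies have executed, and repeat,
  // because running one module may unblock others.
  function runReady() {
    let progressed = true;
    while (progressed) {
      progressed = false;
      for (const [name, closure] of pending) {
        if (isReady(name)) {
          pending.delete(name);
          closure();
          executed.add(name);
          progressed = true;
        }
      }
    }
  }

  return {
    // The wrapped response calls this instead of running the script directly.
    implement(name, closure) {
      pending.set(name, closure);
      runReady();
    },
  };
}

// "b" depends on "a" but arrives first; it still executes second.
const order = [];
const client = createClient({ a: [], b: ['a'] });
client.implement('b', () => order.push('b'));
client.implement('a', () => order.push('a'));
console.log(order); // [ 'a', 'b' ]
```

Because the closure is only stored on arrival, the arrival order and the execution order are decoupled, which is the point of the wrapping.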

Minification
All scripts are minified before being put in the package. For this we use the JavaScriptMinifier library. In case of a cache miss, the minification is done on-the-fly on the web server. See also the Caching section for more about the performance of the packaging and the caching infrastructure around this.

Conditions
Scripts can be conditionally included in a module based on the environment of the requesting client. This keeps responses as small as possible by only including relevant components while also allowing progressive enhancement and conditional inclusion of features based on the environment.

Examples:
 * A grammar parser that includes a script file based on the user language.
 * The "mediawiki.log" module is only shipped with the framework in debug mode. It overwrites the empty "mw.log" function with one that passes the message on to the native console (if there is one), or builds one in the DOM for older browsers and inserts the messages there.
 * The Vector skin in MediaWiki has a custom jQuery UI theme. When a jQuery UI module is loaded, it loads either the default jquery-ui theme or the Vector theme, depending on the environment.
 * jQuery UI Datepicker has regional definitions for 62 different languages. Only one of the regional definition files will be included. Fallback is handled by MediaWiki's localization framework.
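
As a rough sketch of how such conditional composition could work, the function below picks the script files for one request from the environment (language, skin, debug mode). The function name and the property names `languageScripts`, `skinScripts` and `debugScripts` are assumptions made for this illustration; language fallback is not modeled here.

```javascript
// Sketch only: select the script files of a module for one environment.
// `definition` and its property names are hypothetical for this example.
function composeScripts(definition, env) {
  const byLanguage = (definition.languageScripts || {})[env.lang] || [];
  const bySkin = (definition.skinScripts || {})[env.skin] || [];
  const forDebug = env.debug ? (definition.debugScripts || []) : [];
  // Unconditional scripts first, then the environment-specific ones.
  return [...definition.scripts, ...byLanguage, ...bySkin, ...forDebug];
}

const definition = {
  scripts: ['core.js'],
  languageScripts: { de: ['grammar.de.js'] },
  skinScripts: { vector: ['theme.vector.js'] },
  debugScripts: ['log.js'],
};

console.log(composeScripts(definition, { lang: 'de', skin: 'vector', debug: false }));
// [ 'core.js', 'grammar.de.js', 'theme.vector.js' ]
```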

Embedding


In order to reduce the number of HTTP requests for images used in the interface, ResourceLoader makes use of Data URI embedding. When enabled, images will be automatically base64-encoded and embedded into the stylesheet in-place. While it will make the stylesheet larger, it improves performance by removing the overhead associated with requesting all those additional files. The actual server response (which contains the minified result of all concatenated stylesheets and all embedded images in a single request) uses gzip compression. This enables the response to function a bit like a "super sprite" (more on this below). Regardless of the expansion caused by base64 encoding, the ResourceLoader response is smaller in size than the sum of the individual binary files.
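
The core of the technique can be sketched in a few lines. This is an illustration, not the CSSMin implementation: the function name, its parameters and the way the size limit is passed in are assumptions for the example (Node.js `Buffer` is used for the base64 step).

```javascript
// Sketch only: turn an image reference into an embedded data URI when the
// image is small enough, otherwise keep the original URL reference.
function embedImage(bytes, mimeType, originalUrl, limitBytes) {
  if (bytes.length > limitBytes) {
    return 'url(' + originalUrl + ')';
  }
  const encoded = Buffer.from(bytes).toString('base64');
  return 'url(data:' + mimeType + ';base64,' + encoded + ')';
}

// Two placeholder bytes stand in for real image data.
const tiny = Buffer.from('hi');
console.log(embedImage(tiny, 'image/png', 'images/icon.png', 1024));
// url(data:image/png;base64,aGk=)
```

Base64 output is roughly a third larger than the binary input, which is why the gzip compression of the combined response matters for the overall size.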

To trigger the embedding, an annotation is used in a CSS comment.

Another interesting point is that using this technique makes traditional sprites obsolete. The motivation behind sprites is sound. A sprite:
 * Reduces HTTP requests by combining multiple images into one.
 * Increases power of compression by providing more sample data in a single file.

But sprites come with a few caveats:
 * Maintenance (if an image needs to be updated, one has to regenerate the entire sprite, update the background positions, etc.)
 * Produces overly complex CSS (unrelated stylesheets reference the same image with arbitrary background positions, so it isn't obvious which image is being referred to).
 * Imposes restrictions on usage of the image (no freedom in background-repeat, -size or -position as that would "leak" other parts of the sprite). Therefore in various layouts, images cannot be converted into a sprite.

These caveats aren't the end of the world (sprites are in wide use, so clearly they do work), and many front-end systems generate sprites in a semi-automated fashion, which makes some of the above caveats less painful. With the automated embedding technique, however, we get the best of both worlds.

The above caveats don't apply:
 * No maintenance
 * Clean CSS
 * No restrictions

The advantages of sprites also hold up:
 * Reduced number of HTTP requests
 * Increased power of compression

And, in addition to that, it gets even more out of it:
 * No download delay: Once the stylesheet is there, all the images are there as well. When an image is only used in an interactive state, browsers normally start the download only once that state is active. When the image is embedded, there is no "blink" during the download, because the image is instantly available from the embedded data URI.
 * The number of requests is even smaller. The CSS and the images are now in the same request.
 * The gzip compression rate is even higher. Completely unrelated modules can be combined and compressed together without having to worry about anything: a "super sprite", if you will. Also, PNG headers are compressed even more strongly.

Remapping
For images that are not embedded (images that exceed the embed limit, or for browsers that don't support embedding), ResourceLoader performs a remapping process. When the stylesheets are combined in the loader response, the image URLs lose their relative file path reference. Remapping auto-corrects these URLs to be relative to the loader location instead.
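
A simplified sketch of such a remapping step is shown below. It is not the CSSMin implementation: the function name is hypothetical, the rewritten references are root-relative paths (an assumption made to keep the example short), and the placeholder host exists only so that the standard `URL` class can resolve `..` segments.

```javascript
// Sketch only: rewrite relative url(...) references in a stylesheet so they
// still resolve after the stylesheet is served from a different location.
// `stylesheetDir` is the directory path of the original file and must end in "/".
function remapCssUrls(css, stylesheetDir) {
  return css.replace(/url\(\s*(['"]?)([^'")]+)\1\s*\)/g, (match, quote, ref) => {
    // Leave absolute URLs, data URIs, root-relative paths and fragments alone.
    if (/^([a-z][a-z0-9+.-]*:|\/|#)/i.test(ref)) {
      return match;
    }
    const base = 'http://placeholder.invalid' + stylesheetDir;
    const resolved = new URL(ref, base).pathname;
    return 'url(' + quote + resolved + quote + ')';
  });
}

const css = 'a{background:url(images/a.png)} b{background:url("../common/b.png")}';
console.log(remapCssUrls(css, '/skins/vector/'));
// a{background:url(/skins/vector/images/a.png)} b{background:url("/skins/common/b.png")}
```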

Flipping

 * For more information about directionality support in MediaWiki, see Directionality support.



With the Flipping functionality it is no longer necessary to manually maintain a copy of the stylesheet for right-to-left languages. Instead ResourceLoader automatically changes the direction of direction-dependent CSS rules (and more). Internally, ResourceLoader uses CSSJanus to accomplish this. ResourceLoader comes with a PHP port of CSSJanus (a Python library maintained by Google Inc.). This library provides that smart "flipping" logic.

Aside from flipping direction-oriented values, it also converts property names and shorthand values. Swapping direction-specific iconography is automated when the image filenames end in the corresponding direction-specific suffixes.
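
To give a feel for the kind of transformation involved, here is a deliberately tiny sketch. It is a toy and not CSSJanus: the function name is hypothetical, it only swaps the words "left" and "right" in property names and values and reorders four-value margin/padding shorthands, and the real library covers many more cases.

```javascript
// Toy sketch only: flip a single CSS declaration from LTR to RTL.
function flipDeclaration(property, value) {
  const swapSides = (text) =>
    text.replace(/\b(left|right)\b/g, (word) => (word === 'left' ? 'right' : 'left'));

  let flippedValue = swapSides(value);

  // Four-value shorthand is "top right bottom left"; swap the 2nd and 4th values.
  if (property === 'margin' || property === 'padding') {
    const parts = value.trim().split(/\s+/);
    if (parts.length === 4) {
      flippedValue = [parts[0], parts[3], parts[2], parts[1]].join(' ');
    }
  }

  return [swapSides(property), flippedValue];
}

console.log(flipDeclaration('float', 'left')); // [ 'float', 'right' ]
console.log(flipDeclaration('margin-left', '1em')); // [ 'margin-right', '1em' ]
console.log(flipDeclaration('padding', '1px 2px 3px 4px')); // [ 'padding', '1px 4px 3px 2px' ]
```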

Consider the following example:

When put in a file that is loaded by ResourceLoader, without any additional changes or configuration, it will be automatically turned into the following for users having their interface in a right-to-left language:

Sometimes you may wish to exclude a rule from being flipped. For that, a dedicated annotation is provided. It instructs CSSJanus not to flip the following CSS declaration or, when used in the selector part, the entire following CSS ruleset.

For example:

Output will be:

Bundling
As mentioned, all resources are bundled in a single package. The loader response from the server contains both the scripts and the styles from the requested module(s) in the same request. The Client receives this and loads the stylesheets into the DOM at the right time, so that they are in memory when the scripts that use these CSS classes execute.

Minification
All stylesheets are minified before being put in the package. For this we created the CSSMin library.

Conditions

 * See the Conditions section under Scripts for more information

Similar to scripts, the styles property also features the ability to compose the module dynamically based on the environment.

Resource: Messages
The messages are exported as a JSON blob, mapping the message keys to the correct translation. Messages are fetched on the server from MediaWiki's localization framework (including proper fallback to other languages). Only the message keys used in the module are included.
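
The export step can be sketched as below. This is only an illustration: the function name, the shape of the message catalog and the explicit fallback chain are assumptions for the example, whereas in MediaWiki the fallback is handled by the localization framework itself.

```javascript
// Sketch only: build the JSON blob for one module and one user language.
// `catalog` maps language code -> { messageKey: text }; names are hypothetical.
function exportMessages(moduleKeys, fallbackChain, catalog) {
  const blob = {};
  for (const key of moduleKeys) {
    // Take the first language in the chain that has this message.
    for (const lang of fallbackChain) {
      const text = catalog[lang] ? catalog[lang][key] : undefined;
      if (text !== undefined) {
        blob[key] = text;
        break;
      }
    }
  }
  return JSON.stringify(blob);
}

const catalog = {
  en: { save: 'Save', cancel: 'Cancel', unused: 'Not used by this module' },
  nl: { save: 'Opslaan' },
};

console.log(exportMessages(['save', 'cancel'], ['nl', 'en'], catalog));
// {"save":"Opslaan","cancel":"Cancel"}
```

Only the keys listed for the module end up in the blob, so the "unused" message is never sent to the client.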

Bundling
Again, all resources are bundled in the same request. The Client introduces these messages into the client-side localization system when it executes the module.

Conditions
As with the other two resource types, the messages component is also optimized to load only what is necessary for the requesting environment. This is especially important considering that MediaWiki is localized in 300+ languages. Only one set of messages is delivered to the client.

Front-end
So how does all this play out in the front-end? Let's walk through a typical page view in MediaWiki, focussing on the ResourceLoader Client.

Startup Module


The startup module is the first and only hardlinked script, loaded on every page. It is a lightweight module that does three things:


 * Sanity check: It starts by performing a quick sanity check that bails out if the current browser is not supported. This saves bandwidth and avoids a potentially broken interface, leaving the user with an untouched page with the natural non-JavaScript fallback behavior. Browsers such as Internet Explorer 5 and early versions of Mozilla fall in this category. For those, the startup module is the first and last script to be loaded.
 * Module manifest: It exports the module manifest, which contains the dependency information of all modules, the cache group of each module (if any) and the last-modified timestamp of each module.
 * Configuration: A subset of the server-side configuration of MediaWiki is made available to client-side scripts.

After that it makes a loader request for jQuery + the main ResourceLoader Client.

Client
The ResourceLoader Client is the loader software, written in JavaScript. It is instantiated with the module manifest (from the startup module) and takes over from there.

Loader
The loader can be given a list of module names to load. It automatically handles dependency resolution using the internal dependency map. Then it takes the list of modules that still need to be loaded and requests them in a batch from the server. The loading process is fully asynchronous.
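
The resolution step can be sketched as a depth-first walk over the dependency map. This is a minimal illustration, not the actual client code: the function name, the manifest shape and the `loaded` set are assumptions for the example.

```javascript
// Sketch only: expand a list of requested modules into everything that
// still has to be fetched, with dependencies listed before their dependents.
// `manifest` maps module name -> { dependencies: [...] } (hypothetical shape).
function resolveBatch(requested, manifest, loaded) {
  const ordered = [];
  const visited = new Set();

  function visit(name, trail) {
    if (visited.has(name)) return;
    if (trail.includes(name)) {
      throw new Error('Circular dependency involving ' + name);
    }
    if (!Object.prototype.hasOwnProperty.call(manifest, name)) {
      throw new Error('Unknown module ' + name);
    }
    for (const dep of manifest[name].dependencies) {
      visit(dep, trail.concat(name));
    }
    visited.add(name);
    ordered.push(name);
  }

  requested.forEach((name) => visit(name, []));
  // Modules that are already on the client do not need to be requested again.
  return ordered.filter((name) => !loaded.has(name));
}

const manifest = {
  foo: { dependencies: ['bar'] },
  bar: { dependencies: ['quux'] },
  quux: { dependencies: [] },
};

console.log(resolveBatch(['foo'], manifest, new Set(['quux'])));
// [ 'bar', 'foo' ]
```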

Execution

 * This section is incomplete


 * Execution separated from loading/parsing
 * Direct or delayed execution as appropriate based on module dependencies
 * Insert messages and styles into memory before script execution

Back-end

 * This section is incomplete

Environment
Modules are composed based on the following environmental variables:
 * mode (debug or production)
 * skin
 * user language

The result of the generated module package is cached (see also Caching).

The load responder supports various methods of encouraging client-side caching, such as responding with an empty  to   headers if possible.

Response
GET /load.php?modules=foo|bar|quux&lang=en&skin=vector
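
A request URL of this shape could be assembled as sketched below. The function name is hypothetical, and sorting the module names is an assumption made here so that the same set of modules always yields the same URL, which helps caching.

```javascript
// Sketch only: build a load.php request URL for a batch of modules.
function buildLoadUrl(moduleNames, env) {
  const modules = [...moduleNames]
    .sort()
    .map((name) => encodeURIComponent(name))
    .join('|');
  return (
    '/load.php?modules=' + modules +
    '&lang=' + encodeURIComponent(env.lang) +
    '&skin=' + encodeURIComponent(env.skin)
  );
}

console.log(buildLoadUrl(['quux', 'foo', 'bar'], { lang: 'en', skin: 'vector' }));
// /load.php?modules=bar|foo|quux&lang=en&skin=vector
```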

Debug mode

 * This section is incomplete

To make development easier, there is a debug mode in ResourceLoader.

Differences:
 * Script resources: No longer minified, concatenated and loaded from load.php. Instead load.php instructs the client loader to request the files directly. As a result they are also executed in the global scope (since browsers execute scripts like that by default when not otherwise modified).
 * Style resources: No longer minified, concatenated and loaded from load.php. Instead load.php instructs the client loader to request the files directly. As a result data URI embedding and RTL-flipping no longer apply, because the raw file is loaded as-is from disk.
 * Internal logging messages will be visible to all users using Internet Explorer (on Chrome or Firefox they go to the normally invisible JavaScript console instead).

Toggle mode
The mode can be toggled in several ways. In order of precedence:


 * Query parameter debug (string): Set to "true" to enable, to "false" to disable. When absent, falls back to the next step. Example: http://example.org/wiki/Main_Page?debug=true
 * Cookie  (string): Set to "true" or "false". When absent, falls back to the next step.
 * (boolean): The default mode is determined by this configuration setting. Unless overridden in LocalSettings, this will be set to . A production wiki should never set this to true, as debug mode would then be served to everybody, which is inefficient and likely to introduce bugs due to the nature of debug mode.
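
The order of precedence above can be sketched as follows. The function and parameter names are hypothetical, the cookie value and configuration default are passed in as plain arguments, and treating any value other than "true" or "false" as absent is an assumption of this example.

```javascript
// Sketch only: decide whether debug mode is on for one page view.
// Precedence: query parameter, then cookie, then the configured default.
function resolveDebugMode(pageUrl, cookieValue, configDefault) {
  const queryValue = new URL(pageUrl).searchParams.get('debug');
  for (const value of [queryValue, cookieValue]) {
    if (value === 'true') return true;
    if (value === 'false') return false;
    // Anything else is treated as absent and falls through to the next step.
  }
  return configDefault;
}

console.log(resolveDebugMode('http://example.org/wiki/Main_Page?debug=true', undefined, false)); // true
console.log(resolveDebugMode('http://example.org/wiki/Main_Page', 'true', false)); // true
console.log(resolveDebugMode('http://example.org/wiki/Main_Page?debug=false', 'true', true)); // false
```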

Balance

 * This section is incomplete


 * Batching
 * Groups
 * Alphabetical order
 * Module timestamps in url (to allow static reverse-proxy caching)

On-demand package generation
ResourceLoader features on-demand generation of the module packages. The on-demand generation is very important in MediaWiki because cache invalidation can come from many places. Here are a few examples:
 * Core and extensions: Core and extensions generally only change when a wiki is upgraded. But especially on large sites such as Wikipedia, deployments happen many times a day (even updates to core).
 * Users: Wiki users granted certain user rights (administrators by default) have the ability to modify the "site" module (which is empty by default and will be loaded for everybody when non-empty). This all happens without server-side access; these scripts and styles are stored as wiki pages in the database. On top of that, each user also has their own module space that is only loaded for that user.
 * Translators: The interface messages are shipped with MediaWiki core and are generally considered part of core (and naturally update when upgrading or deploying core). However, wikis can customize their interface by using the MediaWiki message namespace to modify interface messages (or create new ones to use in their own modules).

Cache invalidation
Every module has an automatically generated Last-Modified timestamp. This timestamp is based on a number of factors. Together, these ensure proper cache busting when needed while avoiding unnecessary re-generation of the package when it isn't needed.
 * JavaScript files and CSS files: Last modified timestamps from the files on disk.
 * Files referenced in CSS: Extracted from the stylesheets (see also Remapping). Last modified timestamps from the files on disk.
 * Localization: Last time any of the messages has been updated in the localization storage.
 * Dependencies: It works recursively for any module that this module depends on.
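
Combining these factors amounts to taking the most recent timestamp among them, as in the sketch below. The function name and the field names of the module records are assumptions made for this illustration, and the timestamps are plain numbers.

```javascript
// Sketch only: a module's timestamp is the newest of its own files,
// its messages, and (recursively) the timestamps of its dependencies.
// `modules` maps name -> { fileTimestamps, messagesTimestamp, dependencies }.
function moduleTimestamp(name, modules, seen = new Set()) {
  if (seen.has(name)) return 0; // already accounted for in this walk
  seen.add(name);
  const record = modules[name];
  const own = Math.max(0, record.messagesTimestamp, ...record.fileTimestamps);
  const inherited = record.dependencies.map((dep) => moduleTimestamp(dep, modules, seen));
  return Math.max(own, ...inherited);
}

const modules = {
  base: { fileTimestamps: [100], messagesTimestamp: 50, dependencies: [] },
  app: { fileTimestamps: [80, 90], messagesTimestamp: 120, dependencies: ['base'] },
};

console.log(moduleTimestamp('app', modules)); // 120

// Touching a file of a dependency also bumps the dependent module.
modules.base.fileTimestamps = [200];
console.log(moduleTimestamp('app', modules)); // 200
```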

Because of all these different origins and cache invalidation factors, it is not desirable to have to perform a "build" of some kind (manually or on a schedule) that would generate a huge package with everything in it.
 * Not only would it waste resources re-building many modules that haven't changed,
 * it would also be impractical to have one big build with everything in it because there are over 350 modules on an average Wikipedia site. Many of these come from extensions and wiki-users and these modules are likely only needed/loaded on certain pages or in certain states of the interface. For more on that see client side loader.
 * And it would not make efficient use of browser cache, since it would have to invalidate the entire "build" if even the slightest change is made (the solution that ResourceLoader has for this is explained under Balance).

Whenever a module is requested from the server, the above is evaluated. If needed, the cache will be re-generated. All phases of the packaging process have been optimized to be able to run on-demand on the web server. No build scripts, no periodic cron tasks.

Conclusion


In conclusion we'd like to think of ResourceLoader as creating a development environment that is optimized for:


 * Happy developers: Easy to work with modules without worrying about optimization, maintenance, building and whatnot.
 * Happy servers: The application itself is lightweight, scales well and is optimized to run on-demand.
 * Happy users: Faster pages!

JavaScriptMinifier
Although the re-generation of a module package should be relatively rare (since cache is very well controlled), when it does happen it has to perform well from a web server.

For that reason it doesn't use the famous JSMin.php library (based on Douglas Crockford's JSMin), because that library is too slow to run on-demand during a request response. Although JSMin.php only takes about 1 second (which is okay if you're on the command-line), waiting this long is unacceptable when working on-demand in a web server response, with potentially dozens of large files needing to be minified. This is especially true when potentially thousands of requests come in at the same time and all find that the cache isn't up to date (a cache stampede).

Instead ResourceLoader uses an implementation of Paul Copperman's JavaScriptMinifier, which is up to 4x faster than JSMin. In addition to the speed, experience has shown that JavaScriptMinifier interprets the JavaScript syntax more correctly and succeeds in situations where JSMin outputs invalid JavaScript. The output of JavaScriptMinifier is slightly larger than that of JSMin (about 0.5%, based on a comparison by minifying jquery.js, for which the difference was 0.8KB). This is not considered a loss when seen in the bigger picture. ResourceLoader doesn't aim to compress as small as can be no matter the cost. Instead it aims for balance: large gains in a wide range of areas, while also featuring instant cache invalidation, fast module generation, a transparent "build"-free environment for the developer, etc. The fact that the output could be a little bit smaller then becomes an acceptable trade-off.

CSSMin
Features:
 * Minification
 * Remapping
 * Data URI Embedding