Object cache

MediaWiki uses caching in many components and at multiple layers. This page documents the various caches we use inside the MediaWiki PHP application.

General
The class provides interfaces for two kinds of caches:


 * 1) A place to store the result of a computation, or data fetched from an external source (for higher access speeds). This is a "cache" in the computer science definition.
 * 2) A place to store lightweight data not stored anywhere else. Also known as a stash (or "hoard" of objects). These are values that are not possible (or not allowed) to be recomputed on-demand.

Terminology
A cache key is said to be "verifiable" if the program is able to verify that the value is not outdated.

This applies when a key can only ever have one possible value. For example, the 100th digit of Pi could be computed once and cached under a single fixed key. The result can be safely stored in high-speed access stores without coordination, because it will never need to be updated or purged. If it expires from the cache, it can be re-computed and will produce the same result. The same applies to storing the wikitext of a certain revision of a page. Revision 123 has happened and will always contain the same content. If the program knows the revision ID it is looking for, a cache key that includes that revision ID is also a verifiable cache key.
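As a rough illustration of a verifiable key, the following Python sketch caches revision text under a key that contains the revision ID. The key format, the function names and the in-memory dictionary are assumptions made for this example and are not part of MediaWiki.

```python
# Hypothetical sketch of a verifiable cache key; not MediaWiki's API.
stats = {"loads": 0}  # counts how often the expensive fetch actually runs
cache: dict[str, str] = {}  # stand-in for any fast key-value store


def load_revision_text(rev_id: int) -> str:
    """Stand-in for an expensive fetch; a revision's text never changes."""
    stats["loads"] += 1
    return f"wikitext of revision {rev_id}"


def get_revision_text(rev_id: int) -> str:
    key = f"revision-text:{rev_id}"  # hypothetical key format
    if key not in cache:
        cache[key] = load_revision_text(rev_id)
    return cache[key]


first = get_revision_text(123)
cache.clear()  # simulate expiry or eviction
second = get_revision_text(123)  # recomputed, and identical to the first value
```

Because the key fully determines the value, losing the entry costs only a recomputation, and no purge or update logic is required.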

Payload robustness
Cache interfaces in MediaWiki support not only scalar values, but also (potentially nested) structures of arrays and plain (stdClass) objects. Many also support instances of arbitrary classes by relying on PHP's built-in serialization mechanism, but relying on this mechanism is deprecated (see T161647). The reason for this is security and resilience: PHP serialization is extremely brittle against changes to class definitions, which can lead to incompatibility between data and code.

Any code writing to or reading from a cache has to consider forward- and backward-compatibility, since the code that wrote the data may not be the same version as the code reading the data. Typically, the code reading the data will have the same or a newer version (requiring backwards compatible code, or forwards compatible data), but this is not necessarily true: if an update of a production system is rolled back due to errors, we may end up with older code reading data that was stored by newer code (requiring backwards compatible data, or forwards compatible code).

In the future, caches may start supporting transparent JSON serialization of objects that implement the JsonUnserializable interface (introduced in 1.36). Until then, code that uses caches has to take care to represent the data in a robust way.
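One way to represent data robustly is to store plain structures and to make the reader tolerant of both older and newer payloads. The Python sketch below illustrates that idea only; the field names, the `version` field and the default language are assumptions for the example, not a MediaWiki convention.

```python
import json


def encode_v1(title: str) -> str:
    """Older writer: stores only the title."""
    return json.dumps({"version": 1, "title": title})


def encode_v2(title: str, lang: str) -> str:
    """Newer writer: adds a 'lang' field and bumps the version."""
    return json.dumps({"version": 2, "title": title, "lang": lang})


def decode(blob: str):
    """Reader that accepts payloads from older and newer writers."""
    try:
        data = json.loads(blob)
    except ValueError:
        return None  # unreadable data is treated as a cache miss
    if not isinstance(data, dict) or "title" not in data:
        return None
    # Older data lacks 'lang', so fall back to a default (assumed "en").
    # Fields added by newer writers are simply ignored.
    return {"title": data["title"], "lang": data.get("lang", "en")}
```

With this shape, a rollback that leaves newer data in the cache does not break older readers, and a deployment that reads older data fills in defaults instead of failing.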

Interfaces
These are the generic stores used by the various logical purposes described in the Uses section. Most of them can be obtained from the  class. Backing storage is configured via the configuration settings.

Local server

 * Accessed through.
 * Configurable: No (automatically detected).

Values in this store are only kept in the local RAM of any given web server (typically using php-apcu). These are not replicated to the other servers or clusters, and have no update or purge coordination options.

If the web server does not have php-apcu (or equivalent) installed, this interface falls back to an empty placeholder where no keys are stored. It is also set to an empty interface for maintenance scripts and other command-line modes. MediaWiki supports APCu and WinCache.
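The fallback behaviour can be pictured as a placeholder object with the same interface as a real store. The following Python sketch is a simplified model of that idea; the class and function names are hypothetical and do not correspond to MediaWiki classes.

```python
class DictCache:
    """In-memory store, standing in for a per-server RAM cache."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value
        return True


class EmptyCache:
    """Placeholder that stores nothing; every read is a miss."""

    def get(self, key):
        return None

    def set(self, key, value):
        return False


def make_local_cache(backend_available: bool, cli_mode: bool):
    """Pick the real store only for web requests with a backend installed."""
    if backend_available and not cli_mode:
        return DictCache()
    return EmptyCache()


def cached_square(cache, n):
    """Caller code is identical for both stores."""
    key = f"square:{n}"
    value = cache.get(key)
    if value is None:
        value = n * n
        cache.set(key, value)
    return value
```

Callers never need to check which store they received: with the placeholder, every call recomputes the value, which is slower but still correct.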

WAN cache

 * Accessed through.
 * Configurable: Yes, via $wgMainWANCache, which defaults to $wgMainCacheType.

Values in this store are stored centrally in the current data centre (typically using Memcached as backend). While values are not replicated to other clusters, "delete" and "purge" events for keys are broadcast to other data centres for cache invalidation. See WANObjectCache class reference for how to use this.

In short: Compute and store values via the  method. To invalidate caches, use key purging (not by setting a key directly).
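The compute-on-miss and purge-to-invalidate pattern can be sketched as follows. This is a single-process Python model with hypothetical names (`TinyCache`, `get_or_compute`); it does not reproduce the WANObjectCache interface and omits the cross-data-centre broadcast entirely.

```python
import time


class TinyCache:
    """Toy cache: values are computed on a miss and removed by purging."""

    def __init__(self):
        self._store = {}  # key -> (value, expiry timestamp)

    def get_or_compute(self, key, ttl, callback):
        entry = self._store.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]
        value = callback()  # regenerate from the source of truth
        self._store[key] = (value, time.monotonic() + ttl)
        return value

    def delete(self, key):
        """Purge the key; the next read recomputes the value."""
        self._store.pop(key, None)


source = {"greeting": "hello"}  # stand-in for the database
cache = TinyCache()


def read_greeting():
    return cache.get_or_compute("greeting", 60, lambda: source["greeting"])


before = read_greeting()
source["greeting"] = "hi"  # the underlying data changes
stale = read_greeting()    # still the cached value
cache.delete("greeting")   # invalidate by purging, not by setting
after = read_greeting()    # recomputed from the source
```

Purging rather than writing the new value means only the callback ever populates the key, so every stored value comes from the same code path.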

Local cluster

 * Accessed through.
 * Configurable: Yes, via $wgMainCacheType.

Mostly for internal use only, to offer limited coordination of actions within a given data centre. This uses the same storage backend as WAN cache, but under a different key namespace, and without any ability to broadcast purges to other data centres.

Main stash

 * Accessed through.
 * Configurable: Yes, via $wgMainStash.

Values in this store are stored centrally in the primary data centre, and later replicated to other data centres (typically using MySQL or Redis as backend). By default, keys will be read from a local replica and may be lagged. Master reads can be done using, but should not happen during GET requests.

This store is expected to have strong persistence and is often used for data that cannot be regenerated and is not stored elsewhere. However, the data stored here must be non-critical, so that losing it has minimal user impact. This allows the backend to be partially unavailable at times, or wiped under operational pressure, without causing incidents.

Interwiki cache
See Interwiki cache for details, and also.

Parser cache
See Manual:ParserCache for details. See also purgeParserCache.php.
 * Accessed via the  class.
 * Backend configured by (typically MySQL).
 * Keys are canonical by page ID and populated when a page is parsed.
 * Revision ID is verified on retrieval.
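The combination of a key based on page ID and a revision check on retrieval can be modelled as below. This Python sketch uses a plain dictionary and hypothetical function names; it is not the ParserCache interface.

```python
store = {}  # page_id -> {"rev_id": int, "html": str}


def save_output(page_id: int, rev_id: int, html: str) -> None:
    """Store rendered output under the page ID, remembering its revision."""
    store[page_id] = {"rev_id": rev_id, "html": html}


def get_output(page_id: int, current_rev_id: int):
    """Return cached output only if it was rendered from the current revision."""
    entry = store.get(page_id)
    if entry is None or entry["rev_id"] != current_rev_id:
        return None  # missing or outdated: the caller must re-parse
    return entry["html"]
```

Since there is one key per page, a new edit overwrites the old entry on the next parse, and the revision check guards against serving output from an earlier revision in the meantime.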

Message cache

 * Access via.
 * Backend configurable by $wgMessageCacheType (defaults to $wgMainCacheType, with fallback to MySQL).

Revision text

 * Accessed via.
 * Stored in the WAN cache, using key class.
 * Keys are verifiable and values immutable. Cache is populated on demand.

Background
The main use case for caching revision text (as opposed to fetching directly from the text table or External Storage) is for handling cases where the text of many different pages is needed by a single web request.
 * Originally implemented in 2006 (, commit 376014e).
 * Process cache added in 2016.
 * Adopted by MessageCache in 2017.

This is primarily used by:


 * Parsing wikitext. When parsing a given wiki page, the Parser needs the source of the current page, but also recursively needs the source of all transcluded template pages (and Lua module pages). It is not unusual for a popular article to indirectly transclude over 300 such pages. The use of Memcached saves time when saving edits and rendering page views.
 * MessageCache. This is a wiki-specific layer on top of LocalisationCache, which consists primarily of message overrides from "MediaWiki:"-namespace pages on the given wiki. When building this blob, the source text of many different pages needs to be fetched. This is cached per-cluster in Memcached, and locally per-server (to reduce Memcached bandwidth ;, commit 6d82fa2).

Example
Key.

"content address" refers to the  on the wiki's main database (e.g. "tt:1123"). This in turn refers to the text table or (External Storage).

To reverse engineer which page/revision this relates to, find  for the content address, then find the revision ID for that content slot.

The revision ID can then be used on-wiki in a URL like https://en.wikipedia.org/w/index.php?oldid=951705319, or you can look it up in the revision and page tables.

Revision metadata

 * Accessed via.
 * Stored in the WAN cache, using key class.
 * Keys are verifiable (by page and revision ID) and values immutable. Cache is populated on demand.

MessageBlobStore
Stores interface text used by ResourceLoader modules. It is similar to LocalisationCache, but includes the wiki-specific overrides. (LocalisationCache is wiki-agnostic). These overrides come from the database as wiki pages in the MediaWiki-namespace.


 * Accessed via.
 * Stored in the WAN cache, using key class.
 * Keys are verifiable (by ResourceLoader module name and hash of message keys). Values are mutable and expire after a week. Cache populated on demand.
 * All keys are purged when LocalisationCache is rebuilt. When a user saves a change to a MediaWiki-namespace page on the wiki, a subset of the keys is also purged.
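A key derived from a module name and a hash of its message keys might be built along these lines. The Python sketch below is illustrative only: the key prefix, the choice of SHA-256 and the truncation length are assumptions, not the format MediaWiki uses.

```python
import hashlib
import json


def blob_key(module_name: str, message_keys: list[str]) -> str:
    """Build a cache key from a module name and its list of message keys."""
    # Sort first so the same set of message keys always yields the same hash.
    encoded = json.dumps(sorted(message_keys)).encode("utf-8")
    digest = hashlib.sha256(encoded).hexdigest()[:16]
    return f"message-blob:{module_name}:{digest}"  # hypothetical key format
```

When a module starts using a different set of messages, the hash changes and the old entry is simply no longer read.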

Minification cache
ResourceLoader caches the minified versions of raw JavaScript and CSS input files.
 * Accessed via.
 * Stored locally on the server (APCu).
 * Keys are verifiable (deterministic value). No purge strategy needed. Cache populated on demand.
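A deterministic cache of this kind can be keyed on a hash of the input, as in the Python sketch below. The toy minifier, the key prefix and the use of SHA-256 are assumptions for illustration and do not describe ResourceLoader's implementation.

```python
import hashlib

stats = {"minify_calls": 0}  # counts real minification runs
_cache = {}  # stand-in for a per-server store such as APCu


def minify(source: str) -> str:
    """Toy minifier: drops blank lines and surrounding whitespace."""
    stats["minify_calls"] += 1
    lines = (line.strip() for line in source.splitlines())
    return "\n".join(line for line in lines if line)


def minify_cached(source: str) -> str:
    # The key depends only on the input, so a stored value can never be stale.
    key = "minify:" + hashlib.sha256(source.encode("utf-8")).hexdigest()
    if key not in _cache:
        _cache[key] = minify(source)
    return _cache[key]
```

If the input file changes, its hash changes too, so the new content is stored under a new key and the old entry is left to expire.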

LESS compilation cache
ResourceLoader caches the metadata and parser output of LESS files it has compiled.


 * Accessed via.
 * Stored locally on the server (APCu).

File content hasher
ResourceLoader caches the checksum of any file directly or indirectly used by a module. When serving the startup manifest to users, it needs the hashes of many thousands of files. To reduce I/O overhead, it caches this content hash locally, keyed by path and mtime.


 * Accessed via.
 * Stored locally on the server (APCu).
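Keying a content hash by path and mtime can be sketched as follows. This Python example uses a module-level dictionary and SHA-1 purely for illustration; the names and the hash function are assumptions, not ResourceLoader's actual code.

```python
import hashlib
import os
import tempfile

stats = {"reads": 0}  # counts how often file contents are actually read
_hash_cache = {}  # stand-in for a per-server store such as APCu


def file_hash(path: str) -> str:
    """Return the content hash of a file, reading it only when it changed."""
    mtime = os.stat(path).st_mtime_ns  # cheap metadata lookup
    key = (path, mtime)
    if key not in _hash_cache:
        with open(path, "rb") as f:
            stats["reads"] += 1
            _hash_cache[key] = hashlib.sha1(f.read()).hexdigest()
    return _hash_cache[key]


# Small demonstration with a temporary file.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.js")
    with open(demo_path, "w") as f:
        f.write("console.log(1);")
    demo_first = file_hash(demo_path)
    demo_second = file_hash(demo_path)  # served from the cache, no file read
```

A stat call is much cheaper than reading and hashing a file, which is what makes this worthwhile when thousands of files are involved. The approach assumes that any change to a file also changes its mtime.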