Manual:Object cache

From mediawiki.org

MediaWiki uses caching in many components and at multiple layers. This page documents the various caches we use inside the MediaWiki PHP application.

General

Two kinds of stores are described in the context of object caching in MediaWiki:

  1. Cache. A place to store the result of a computation, or data fetched from an external source (for higher access speeds). This is a "cache" in the computer science definition.
  2. Stash. A place to store lightweight data not stored anywhere else, also known as a "hoard" of objects. These are values that are not possible (or not allowed) to be recomputed on demand.

Terminology

A cache key is said to be "verifiable" if the program is able to verify that the value is not outdated.

This applies when a key can only have one possible value. For example, the 100th digit of Pi could be cached under the key math_pi_digit:100. The result can be safely stored in a fast-access store without coordination, because it will never need to be updated or purged. If it expires from the cache, it can be recomputed and will produce the same result. The same applies to storing the wikitext of a given page revision. Revision 123 has happened and will always have the same content. If the program knows the revision ID it is looking for, a cache key like revision_content:123 could also be a verifiable cache key.
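The property described above can be illustrated with a minimal Python sketch. It is not MediaWiki code: `TinyCache`, `REVISIONS`, and `revision_text` are hypothetical names invented for the illustration, and the dictionary stands in for an immutable source of truth.

```python
from typing import Callable, Dict


class TinyCache:
    """In-memory stand-in for a fast store; entries may vanish at any time."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self.misses = 0

    def get_or_compute(self, key: str, compute: Callable[[], str]) -> str:
        # On a miss, recompute the value and keep it for later reads.
        if key not in self._data:
            self.misses += 1
            self._data[key] = compute()
        return self._data[key]

    def evict(self, key: str) -> None:
        self._data.pop(key, None)


# Hypothetical immutable source of truth: revision ID -> wikitext.
REVISIONS = {123: "Hello, ''world''!"}


def revision_text(cache: TinyCache, rev_id: int) -> str:
    # The key names exactly one possible value, so it needs no purging.
    key = f"revision_content:{rev_id}"
    return cache.get_or_compute(key, lambda: REVISIONS[rev_id])


cache = TinyCache()
first = revision_text(cache, 123)
cache.evict("revision_content:123")  # simulate expiry
second = revision_text(cache, 123)   # recomputed, same result
```

Because the key can only ever map to one value, eviction costs a recomputation but can never produce a stale read.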

Storing structured data

MediaWiki supports storing both primitive values (booleans, integers, strings) and (possibly nested) array structures. It is also technically possible to store plain objects (stdClass) and instances of arbitrary classes, which relies on PHP serialization, but relying on this mechanism is deprecated for security reasons (T161647) and for stability reasons, as it is very hard to change a class in a way that doesn't break forward or backward compatibility with cached objects of that class (e.g. T264257).

Code that writes to or reads from a cache must be both forward and backward compatible. Typically, the code reading cached data is the same version as, or newer than, the code that stored it (implying backward compatibility of the read logic, or forward-looking design of the write logic), but there are two important scenarios where the reverse must also work:

  1. During a deployment, different servers and data centers briefly run old and new versions side by side with the same shared database and caching services. As such, a cache may very well be written to and read from by both old and new versions concurrently during this time.
  2. Site operators must be able to roll back the last deployment or upgrade of the software to the previous version.
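As a rough illustration of this requirement, the following Python sketch models an older and a newer version of some code sharing one cache. All function names and the `lang` field are hypothetical; the point is only that each reader tolerates values written by the other version.

```python
from typing import Any, Dict


def write_v1(title: str) -> Dict[str, Any]:
    """Older code: stores only the title."""
    return {"title": title}


def write_v2(title: str, lang: str) -> Dict[str, Any]:
    """Newer code: adds a 'lang' field."""
    return {"title": title, "lang": lang}


def read_v1(value: Dict[str, Any]) -> str:
    """Older reader: uses the field it knows and ignores any extra ones."""
    return value["title"]


def read_v2(value: Dict[str, Any]) -> str:
    """Newer reader: falls back to a default when 'lang' is absent."""
    return f"{value['title']} ({value.get('lang', 'en')})"


# During a deployment both versions write to the same shared cache.
shared_cache = {
    "page:1": write_v1("Main Page"),
    "page:2": write_v2("Accueil", "fr"),
}

old_reads_new = read_v1(shared_cache["page:2"])  # forward compatible
new_reads_old = read_v2(shared_cache["page:1"])  # backward compatible
```

Adding optional fields with defaults keeps both directions working; renaming or removing a field would break one of the two readers.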

Best practice:

  • Avoid placing version constants inside cache keys. Make use of the WANObjectCache::getWithSetCallback idiom and its "version" option, which automatically takes care of forward and backward compatibility, including invalidating cache keys across versions of the software.
  • Avoid storing class objects. Store primitives or (nested) arrays of primitives. Classes should be converted to and from simple arrays, and stored either as those simple arrays or as a string of JSON. The encoding and serialising for this must be done by the consumer and is not done by e.g., the BagOStuff or WANObjectCache interfaces. (In the future, MediaWiki may do this automatically for classes that implement JsonUnserializable, which was introduced in MediaWiki 1.36).
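The second recommendation can be sketched as follows. This is a language-neutral illustration in Python rather than MediaWiki code, and `LinkSummary` with its `to_array` and `from_array` methods is a hypothetical class; the consumer, not the cache interface, performs the conversion.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class LinkSummary:
    """Hypothetical value object that a feature wants to cache."""

    page_id: int
    targets: List[str]

    def to_array(self) -> Dict[str, Any]:
        # Only primitives and nested arrays of primitives leave the class.
        return {"page_id": self.page_id, "targets": list(self.targets)}

    @classmethod
    def from_array(cls, data: Dict[str, Any]) -> "LinkSummary":
        # Tolerate a missing optional field instead of failing.
        return cls(
            page_id=int(data["page_id"]),
            targets=list(data.get("targets", [])),
        )


original = LinkSummary(page_id=42, targets=["Foo", "Bar"])
encoded = json.dumps(original.to_array())  # the string that gets cached
restored = LinkSummary.from_array(json.loads(encoded))
```

The cached string carries no class name, so the class can later be renamed or refactored without invalidating existing entries.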

Services

These are the abstract stores available to MediaWiki features. See the Uses section for examples.

Local server

  • Accessed through MediaWikiServices->getLocalServerObjectCache().
  • Configurable: No (automatically detected).
  • Behaviour: very fast (<0.1ms, from local memory), low capacity, not shared between application servers.

Values in this store are only kept in the local RAM of any given web server (typically using php-apcu). These are not replicated to the other servers or clusters, and have no update or purge coordination options.

If the web server does not have php-apcu (or an equivalent) installed, this interface falls back to an empty placeholder where no keys are stored. It is also set to an empty interface for maintenance scripts and other command-line modes. MediaWiki supports APCu and WinCache.
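The fallback behaviour can be modelled with a short Python sketch. The class and function names are hypothetical and the selection logic is an assumption made for illustration, not MediaWiki's actual detection code.

```python
from typing import Any, Dict, Optional, Union


class EmptyCache:
    """Placeholder store: accepts writes but never keeps anything."""

    def get(self, key: str) -> Optional[Any]:
        return None

    def set(self, key: str, value: Any) -> None:
        # Silently discard the value.
        return None


class MemoryCache:
    """Stand-in for a per-server in-memory store such as APCu."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        return self._data.get(key)

    def set(self, key: str, value: Any) -> None:
        self._data[key] = value


def pick_local_cache(
    memory_ext_available: bool, cli_mode: bool
) -> Union[EmptyCache, MemoryCache]:
    # Command-line runs and servers without the extension get the placeholder.
    if cli_mode or not memory_ext_available:
        return EmptyCache()
    return MemoryCache()


web_cache = pick_local_cache(memory_ext_available=True, cli_mode=False)
web_cache.set("k", 1)

script_cache = pick_local_cache(memory_ext_available=True, cli_mode=True)
script_cache.set("k", 1)
```

Callers therefore have to treat every read from this store as a possible miss and be able to regenerate the value.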

Local cluster

  • Accessed through ObjectCache::getLocalClusterInstance().
  • Configurable: Yes, via $wgMainCacheType.
  • Behaviour: fast (~1ms, from service memory), medium capacity, shared between application servers but not replicated across data centers.

Mostly for internal use only, to offer limited coordination of actions within a given data centre. This uses the same storage backend as WAN cache, but under a different key namespace, and without any ability to broadcast purges to other data centres.

The local cluster cache is typically backed by Memcached, but may also use the database.

WAN cache

  • Accessed through MediaWikiServices->getMainWANObjectCache().
  • Configurable: Yes, via $wgMainWANCache, which defaults to $wgMainCacheType.
  • Behaviour: fast (~1ms, from service memory), medium capacity, shared between application servers, with invalidation events being replicated across data centers.

Values in this store are stored centrally in the current data centre (typically using Memcached as backend). While values are not replicated to other clusters, "delete" and "purge" events for keys are broadcast to other data centres for cache invalidation. See the WANObjectCache class reference for how to use this.

In short: compute and store values via the getWithSetCallback method. To invalidate caches, purge the key (do not set the key directly).
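The following Python sketch is a deliberately simplified model of that pattern, not the WANObjectCache implementation: it omits expiry, tombstones, and replication lag. `DataCentre`, `ToyWanCache`, and the key name are hypothetical.

```python
from typing import Any, Callable, Dict, List


class DataCentre:
    """One data centre with its own, non-replicated key-value store."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, Any] = {}


class ToyWanCache:
    """Values stay local; only delete events reach the other data centres."""

    def __init__(self, local: DataCentre, all_dcs: List[DataCentre]) -> None:
        self.local = local
        self.all_dcs = all_dcs

    def get_with_set(self, key: str, callback: Callable[[], Any]) -> Any:
        # Compute on a miss and store the result in the local data centre.
        if key not in self.local.store:
            self.local.store[key] = callback()
        return self.local.store[key]

    def delete(self, key: str) -> None:
        # Broadcast the purge so that every data centre recomputes.
        for dc in self.all_dcs:
            dc.store.pop(key, None)


source = {"edit_count": 10}  # stand-in for the database
dc_a, dc_b = DataCentre("dc-a"), DataCentre("dc-b")
cache_a = ToyWanCache(dc_a, [dc_a, dc_b])
cache_b = ToyWanCache(dc_b, [dc_a, dc_b])


def load() -> int:
    return source["edit_count"]


key = "user:1:edit_count"
before = (cache_a.get_with_set(key, load), cache_b.get_with_set(key, load))

source["edit_count"] = 11
cache_a.delete(key)  # purge instead of setting the new value
after = (cache_a.get_with_set(key, load), cache_b.get_with_set(key, load))
```

Setting the new value directly in one data centre would leave the other one serving the old value, which is why invalidation goes through a purge.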

See also WANObjectCache on wikitech.wikimedia.org.

Main stash

  • Accessed through MediaWikiServices->getMainObjectStash().
  • Configurable: Yes, via $wgMainStash.
  • Behaviour: may involve disk read (1-10ms), semi-persistent, shared between application servers and replicated across data centers.

Values in this store are stored centrally in the primary data centre, and later replicated to other data centres. It typically uses MySQL or Redis as backend. By default, the objectcache table is used. Keys are generally read from a local replica, which may lag behind the primary. Primary DB reads can be done using BagOStuff::READ_LATEST, but must not happen during GET requests.

This store is expected to have strong persistence and is often used for data that cannot be regenerated and is not stored elsewhere. However, the data stored here must be non-critical and result in minimal user impact if lost, so that the backend can occasionally be partially unavailable, or wiped under operational pressure, without causing incidents.

Uses

Session store

  • Accessed via Session objects, which are themselves accessed via SessionManager or RequestContext->getRequest()->getSession().
  • Configured via $wgSessionCacheType.

This is not really a cache, in the sense that the data is not stored elsewhere.

Interwiki cache

See Interwiki cache for details, and also ClearInterwikiCache.php.

Parser cache

  • Accessed via the ParserCache class.
  • Backend configured by $wgParserCacheType (typically MySQL).
  • Keys are canonical by page ID and populated when a page is parsed.
  • Revision ID is verified on retrieval.
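The combination of a canonical key per page and a revision check on retrieval can be sketched like this in Python. It is a toy model under stated assumptions: the `pcache:` key format and the function names are hypothetical and do not reflect the ParserCache class.

```python
from typing import Dict, Optional, Tuple

# One entry per page: the revision that was parsed and the resulting HTML.
parser_cache: Dict[str, Tuple[int, str]] = {}


def save(page_id: int, rev_id: int, html: str) -> None:
    # The key depends only on the page, so a new parse overwrites the old one.
    parser_cache[f"pcache:{page_id}"] = (rev_id, html)


def get(page_id: int, current_rev_id: int) -> Optional[str]:
    entry = parser_cache.get(f"pcache:{page_id}")
    if entry is None:
        return None
    cached_rev_id, html = entry
    # Verify on retrieval: an entry for an older revision counts as a miss.
    return html if cached_rev_id == current_rev_id else None


save(page_id=7, rev_id=100, html="<p>old</p>")
hit = get(page_id=7, current_rev_id=100)
stale = get(page_id=7, current_rev_id=101)  # the page was edited since
```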

See Manual:Parser cache for details. See also purgeParserCache.php.

Message cache

Revision text

  • Accessed via SqlBlobStore::getBlob.
  • Stored in the WAN cache, using key class SqlBlobStore-blob.
  • Keys are verifiable and values immutable. Cache is populated on demand.

Background

The main use case for caching revision text (as opposed to fetching directly from the text table or External Storage) is for handling cases where the text of many different pages is needed by a single web request.

Initially used by:

  • Parsing wikitext. When parsing a given wiki page, the Parser needs the source of the current page, but also recursively needs the source of all transcluded template pages (and Lua module pages). It is not unusual for a popular article to indirectly transclude over 300 such pages. The use of Memcached saves time when saving edits and rendering page views.
  • MessageCache. This is a wiki-specific layer on top of LocalisationCache, which consists primarily of message overrides from "MediaWiki:"-namespace pages on the given wiki. When building this blob, the source text of many different pages needs to be fetched. This is cached per-cluster in Memcached, and locally per-server (to reduce Memcached bandwidth; r11678, commit 6d82fa2).

Example

Key: WANCache:v:global:SqlBlobStore-blob:<wiki>:<content address>.

The "content address" refers to the content.content_address field in the wiki's main database (e.g. "tt:1123"). This in turn refers to the text table (or External Storage).

To reverse engineer which page/revision this relates to, find content.content_id for the content address (SELECT content_id FROM content WHERE content_address = "tt:963546992";), then find the revision ID for that content slot (SELECT slot_revision_id FROM slots WHERE slot_content_id = 943285896;).

The revision ID can then be used on-wiki in a URL like https://en.wikipedia.org/w/index.php?oldid=951705319, or you can look it up in the revision and page tables.
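The lookup steps above can be tried end to end with a small Python sketch against an in-memory SQLite database. The tables are toy versions with only the columns used here, the single row in each reuses the identifiers from the text, and the wiki ID `enwiki` in the key is an assumption made for the example.

```python
import sqlite3
from typing import Optional

KEY_PREFIX = "WANCache:v:global:SqlBlobStore-blob:"

# Toy schema: the real content and slots tables have more columns.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE content (content_id INTEGER PRIMARY KEY, content_address TEXT);
    CREATE TABLE slots (slot_revision_id INTEGER, slot_content_id INTEGER);
    INSERT INTO content VALUES (943285896, 'tt:963546992');
    INSERT INTO slots VALUES (951705319, 943285896);
    """
)


def content_address_from_key(key: str) -> str:
    """Strip the fixed prefix and the wiki ID, keeping the content address."""
    if not key.startswith(KEY_PREFIX):
        raise ValueError("not a SqlBlobStore-blob key")
    # The address itself contains a colon, so split only once.
    _wiki, address = key[len(KEY_PREFIX):].split(":", 1)
    return address


def revision_for_address(address: str) -> Optional[int]:
    row = conn.execute(
        "SELECT content_id FROM content WHERE content_address = ?", (address,)
    ).fetchone()
    if row is None:
        return None
    slot = conn.execute(
        "SELECT slot_revision_id FROM slots WHERE slot_content_id = ?", (row[0],)
    ).fetchone()
    return slot[0] if slot else None


key = KEY_PREFIX + "enwiki:tt:963546992"
address = content_address_from_key(key)
rev_id = revision_for_address(address)
url = f"https://en.wikipedia.org/w/index.php?oldid={rev_id}"
```

The sketch returns only the first matching slot row; on a real wiki the same content can be referenced by more than one revision.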

Revision metadata

  • Accessed via RevisionStore::getKnownCurrentRevision.
  • Stored in the WAN cache, using key class revision-row-1.29.
  • Keys are verifiable (by page and revision ID) and values immutable. Cache is populated on demand.

MessageBlobStore

Stores interface text used by ResourceLoader modules. It is similar to LocalisationCache, but includes the wiki-specific overrides. (LocalisationCache is wiki-agnostic). These overrides come from the database as wiki pages in the MediaWiki-namespace.

  • Accessed via MessageBlobStore.
  • Stored in the WAN cache, using key class MessageBlobStore.
  • Keys are verifiable (by ResourceLoader module name and hash of message keys). Values are mutable and expire after a week. Cache populated on demand.
  • All keys are purged when LocalisationCache is rebuilt. When a user saves a change to a MediaWiki-namespace page on the wiki, a subset of the keys is also purged.

Minification cache

ResourceLoader caches the minified versions of raw JavaScript and CSS input files.

  • Accessed via ResourceLoader::filter.
  • Stored locally on the server (APCu).
  • Keys are verifiable (deterministic value). No purge strategy needed. Cache populated on demand.
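Why no purge strategy is needed can be seen in a short Python sketch. It assumes, for illustration only, that the key is derived from a hash of the input; `toy_minify` is a trivial whitespace collapser standing in for a real minifier, and the `minify:` key format is hypothetical.

```python
import hashlib
from typing import Dict

_minify_cache: Dict[str, str] = {}
minify_calls = 0


def toy_minify(source: str) -> str:
    """Stand-in for a real minifier: collapse all runs of whitespace."""
    global minify_calls
    minify_calls += 1
    return " ".join(source.split())


def cached_minify(source: str) -> str:
    # The key is derived from the input itself, so one key has one value.
    key = "minify:" + hashlib.sha1(source.encode("utf-8")).hexdigest()
    if key not in _minify_cache:
        _minify_cache[key] = toy_minify(source)
    return _minify_cache[key]


out1 = cached_minify("a  =  1 ;\n")
out2 = cached_minify("a  =  1 ;\n")  # identical input: cache hit
out3 = cached_minify("a  =  2 ;\n")  # changed input: different key
```

When a file changes, its content maps to a new key and the old entry simply ages out.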

LESS compilation cache

ResourceLoader caches the metadata and the parser output of the LESS files it has compiled.

  • Accessed via ResourceLoaderFileModule::compileLessFile.
  • Stored locally on the server (APCu).

File content hasher

ResourceLoader caches the checksum of any file directly or indirectly used by a module. When serving the startup manifest to users, it needs the hashes of many thousands of files. To reduce I/O overhead, it caches this content hash locally, keyed by path and mtime.

  • Accessed via FileContentsHasher.
  • Stored locally on the server (APCu).
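Keying by path and mtime can be sketched as follows in Python. This is an illustration of the idea rather than the FileContentsHasher implementation; the function name is hypothetical and SHA-1 is an arbitrary choice of hash for the example.

```python
import hashlib
import os
import tempfile
from typing import Dict, Tuple

_hash_cache: Dict[Tuple[str, int], str] = {}
computations = 0


def file_content_hash(path: str) -> str:
    """Hash a file, reusing the cached result while its mtime is unchanged."""
    global computations
    # A stat call is much cheaper than reading and hashing the file.
    key = (path, os.stat(path).st_mtime_ns)
    if key not in _hash_cache:
        computations += 1
        with open(path, "rb") as fh:
            _hash_cache[key] = hashlib.sha1(fh.read()).hexdigest()
    return _hash_cache[key]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "module.js")
    with open(path, "w") as fh:
        fh.write("console.log(1);")
    h1 = file_content_hash(path)
    h2 = file_content_hash(path)  # same mtime: served from the cache

    with open(path, "w") as fh:
        fh.write("console.log(2);")
    # Force a visibly newer mtime regardless of filesystem timestamp precision.
    st = os.stat(path)
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))
    h3 = file_content_hash(path)  # new mtime: new key, hash recomputed
```

A modified file gets a new key, so stale entries are never read and need no explicit purge.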

See also