Object cache

MediaWiki uses caching in many components and at multiple layers. This page documents the various caches used within the MediaWiki PHP application.

Overview
Object caching in MediaWiki covers two kinds of store:


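 * 1) Cache. A place to store the result of a computation, or data fetched from an external source (for faster access).  This is a "cache" in the computer-science sense.
 * 2) Stash. A place to store lightweight data that is not stored anywhere else.  Also known as a stash (or "hoard" of objects).  These values cannot (or are not allowed to) be recomputed on demand.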

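Terminology
A cache key is said to be "verifiable" if the program is able to verify that the value is not outdated.

This applies when a key can only ever have one possible value, such as computing the 100th digit of π, which can be cached under the key . The result can safely be stored in high-speed access storage without coordination, because it will never need to be updated or purged. If it expires from the cache, it can be recomputed and will produce the same result. The same applies to storing the wikitext of a given revision of a page. Revision 123 has been created and will always contain the same content. If the program knows the revision ID it is looking for, a cache key like  can also be a verifiable cache key.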
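As a rough, language-neutral illustration of a verifiable key, the Python sketch below caches immutable revision text under a key derived from the revision ID. The dict-based cache, the make_key helper, the "example-revision-text" key prefix, and load_revision_text are all hypothetical stand-ins and are not MediaWiki APIs.

```python
# Illustrative only: a plain dict stands in for a fast cache service.
cache = {}
compute_calls = []

def load_revision_text(rev_id):
    """Hypothetical slow lookup; revision content never changes once created."""
    compute_calls.append(rev_id)
    return f"wikitext of revision {rev_id}"

def make_key(*parts):
    """Build a key from its components (hypothetical helper)."""
    return ":".join(str(p) for p in parts)

def get_revision_text(rev_id):
    key = make_key("example-revision-text", rev_id)
    if key not in cache:
        # Miss or expiry: recomputing yields the same value, so no purge
        # or cross-server coordination is needed for this key.
        cache[key] = load_revision_text(rev_id)
    return cache[key]

first = get_revision_text(123)
cache.clear()  # simulate expiry or eviction
second = get_revision_text(123)
```

Because the key fully identifies the value, losing the entry costs only a recomputation; it can never cause stale data to be served.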



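Storing structured data
MediaWiki supports storing primitive values (bool, int, string) and (possibly nested) structures of arrays. It is technically also possible to store plain objects (stdClass) and instances of arbitrary classes, which relies on PHP serialization, but this mechanism is discouraged for security reasons (T161647) and for stability reasons, because it is very hard to change a class in a way that does not break forward or backward compatibility with cached objects of that class (e.g. T264257).

Code that writes to or reads from a cache must be both forward- and backward-compatible. Typically, the code reading cached data is the same as, or newer than, the code that wrote it (which calls for backward-compatible read logic, or forward-compatible writes deployed ahead of time), but there are two important cases where the reverse must also work:
 * 1) During a deployment, different servers and data centers briefly run the old and new versions in parallel while using the same shared database and cache services. As such, a cache may very well be written to and read from both old and new versions concurrently during this time.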
 * 2) Site operators must be able to roll back the last deployment or upgrade of the software to the previous version.

Best practice:


 * Avoid placing version constants inside cache keys. Make use of the  idiom and its "version" option, which automatically takes care of forward- and backward compatibility, including invalidating cache keys across versions of the software.
 * Avoid storing class objects. Store primitives or (nested) arrays of primitives. Classes should be converted to and from simple arrays, and stored either as those simple arrays or as a string of JSON. The encoding and serialising for this must be done by the consumer and is not done by e.g., the BagOStuff or WANObjectCache interfaces. (In the future, MediaWiki may do this automatically for classes that implement JsonUnserializable, which was introduced in MediaWiki 1.36).
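The second recommendation can be sketched roughly as follows. This is a Python illustration of the general idea (convert to a simple array, store as JSON, read tolerantly), not MediaWiki code; the PageSummary class, its fields, and the encode/decode helpers are hypothetical.

```python
import json
from dataclasses import dataclass

@dataclass
class PageSummary:
    """Hypothetical value class; not part of MediaWiki."""
    title: str
    length: int
    language: str = "en"  # imagine this field was added in a later release

    def to_array(self):
        return {"title": self.title, "length": self.length,
                "language": self.language}

    @classmethod
    def from_array(cls, data):
        # Tolerant read: entries written by older code lack "language",
        # and entries written by newer code may carry unknown fields.
        return cls(
            title=data["title"],
            length=data["length"],
            language=data.get("language", "en"),
        )

def encode(summary):
    """Serialise to a JSON string; this is what would go into the cache."""
    return json.dumps(summary.to_array())

def decode(blob):
    """Rebuild the object from a cached JSON string."""
    return PageSummary.from_array(json.loads(blob))

old_blob = '{"title": "Main Page", "length": 42}'
new_blob = '{"title": "Main Page", "length": 42, "language": "de", "extra": true}'
restored_old = decode(old_blob)
restored_new = decode(new_blob)
round_trip = decode(encode(PageSummary("Sandbox", 7)))
```

Reading with defaults and ignoring unknown fields is one way to let old and new code share the same cache entries during a deployment or after a rollback.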

Services
These are the abstract stores available to MediaWiki features. See the Uses section for examples.

Local server

 * Accessed through.
 * Configurable: No (automatically detected).
 * Behaviour: very fast (<0.1ms, from local memory), low capacity, not shared between application servers.

Values in this store are only kept in the local RAM of any given web server (typically using php-apcu). These are not replicated to the other servers or clusters, and have no update or purge coordination options.

If the web server does not have php-apcu (or equivalent) installed, this interface falls back to an empty placeholder where no keys are stored. It is also set to an empty interface for maintenance scripts and other command-line modes. MediaWiki supports APCu and WinCache.

Local cluster

 * Accessed through.
 * Configurable: Yes, via $wgMainCacheType.
 * Behaviour: fast (~1ms, from service memory), medium capacity, shared between application servers but not replicated across data centers.

Mostly for internal use only, to offer limited coordination of actions within a given data centre. This uses the same storage backend as WAN cache, but under a different key namespace, and without any ability to broadcast purges to other data centres.

The local cluster cache is typically backed by Memcached, but may also use the database.

WAN cache

 * Accessed through.
 * Configurable: Yes, via $wgMainWANCache, which defaults to $wgMainCacheType.
 * Behaviour: fast (~1ms, from service memory), medium capacity, shared between application servers, with invalidation events being replicated across data centers.

Values in this store are stored centrally in the current data centre (typically using Memcached as backend). While values are not replicated to other clusters, "delete" and "purge" events for keys are broadcast to other data centres for cache invalidation. See WANObjectCache class reference for how to use this.

In short: Compute and store values via the  method. To invalidate caches, use key purging (not by setting a key directly).
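The compute-via-callback and purge-on-change pattern can be illustrated with a toy Python model. TinyCache, its method names, and the "example:editcount:1" key are hypothetical and greatly simplified; this is not the WANObjectCache API, and it does not model the broadcasting of purges to other data centres.

```python
import time

class TinyCache:
    """Toy in-memory stand-in for a shared cache."""

    def __init__(self):
        self._store = {}

    def get_with_set_callback(self, key, ttl, callback):
        entry = self._store.get(key)
        now = time.monotonic()
        if entry is not None and entry[1] > now:
            return entry[0]  # fresh hit
        value = callback()   # miss or expired: regenerate from the source
        self._store[key] = (value, now + ttl)
        return value

    def delete(self, key):
        # Purge the key; the next reader regenerates the value.
        self._store.pop(key, None)

database = {"user:1:editcount": 10}  # hypothetical source of truth
cache = TinyCache()

def get_edit_count():
    return cache.get_with_set_callback(
        "example:editcount:1", 60, lambda: database["user:1:editcount"]
    )

before = get_edit_count()
database["user:1:editcount"] = 11
stale = get_edit_count()             # still the cached value
cache.delete("example:editcount:1")  # purge rather than set a new value
after = get_edit_count()
```

Purging instead of writing the new value directly means every reader regenerates from the authoritative source, which avoids races between concurrent writers.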

See also WANObjectCache on wikitech.wikimedia.org.

Main stash

 * Accessed through.
 * Configurable: Yes, via $wgMainStash.
 * Behaviour: may involve disk read (1-10ms), semi-persistent, shared between application servers and replicated across data centers.

Values in this store are read and written in the same data centre, with writes expected to be replicated to and from other data centres. It typically uses MySQL or Redis as backend. By default, the table is used. Callers must tolerate reads that are potentially stale, for example due to brief unavailability of cache writes, race conditions where overlapping requests finish out of order, or writes from another data center taking a second to replicate.

This store is expected to have strong persistence and is often used for data that cannot be regenerated and is not stored elsewhere. However, data stored in the MainStash must be non-critical and result in minimal user impact if lost, thus allowing for the backend to sometimes be partially unavailable or wiped if under operational pressure without causing incidents.

Session store
This is not really a cache, in the sense that the data is not stored elsewhere.
 * Accessed via  objects, which itself is accessed via SessionManager, or
 * Configured via.

Interwiki cache
See Interwiki cache for details, and also.

Parser cache
See Manual:Parser cache for details. See also purgeParserCache.php.
 * Accessed via the  class.
 * Backend configured by (typically MySQL).
 * Keys are canonical by page ID and populated when a page is parsed.
 * Revision ID is verified on retrieval.

Message cache

 * Access via.
 * Backend configurable by $wgMessageCacheType (defaults to $wgMainCacheType, with fallback to MySQL).

Revision text

 * Accessed via.
 * Stored in the WAN cache, using key class.
 * Keys are verifiable and values immutable. Cache is populated on demand.

Background
The main use case for caching revision text (as opposed to fetching directly from the  table or External Storage) is for handling cases where the text of many different pages is needed by a single web request.
 * Originally implemented in 2006 (commit 376014e).
 * Process cache added in 2016.
 * Adopted by MessageCache in 2017.

This is primarily used by:


 * Parsing wikitext. When parsing a given wiki page, the Parser needs the source of the current page, but also recursively needs the source of all transcluded template pages (and Lua module pages). It is not unusual for a popular article to indirectly transclude over 300 such pages. The use of Memcached saves time when saving edits and rendering page views.
 * MessageCache. This is a wiki-specific layer on top of LocalisationCache, which consists primarily of message overrides from "MediaWiki:"-namespace pages on the given wiki. When building this blob, the source text of many different pages needs to be fetched. This is cached per-cluster in Memcached, and locally per-server (to reduce Memcached bandwidth; commit 6d82fa2).
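The idea of caching a blob both per-cluster and locally per-server can be sketched as a two-tier lookup. The Python below is a simplified illustration only; CountingStore, get_blob, and the key name are hypothetical and do not reflect how MessageCache is implemented.

```python
class CountingStore(dict):
    """Dict that counts reads, standing in for a networked cache service."""

    def __init__(self):
        super().__init__()
        self.reads = 0

    def fetch(self, key):
        self.reads += 1
        return self.get(key)

shared = CountingStore()  # per-cluster tier (e.g. a Memcached-like service)
local = {}                # per-server tier (e.g. local memory)
builds = []

def build_blob():
    """Hypothetical expensive build that reads many pages."""
    builds.append(1)
    return {"mainpage": "Main Page"}

def get_blob(key):
    if key in local:
        return local[key]      # cheapest path: no network traffic
    value = shared.fetch(key)
    if value is None:
        value = build_blob()   # miss in both tiers: rebuild and share
        shared[key] = value
    local[key] = value
    return value

results = [get_blob("example-messages:en") for _ in range(3)]
```

Only the first call touches the shared tier; later calls on the same server are answered from local memory, which is the bandwidth saving the text describes.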

Example
Key.

"content address" refers to the  on the wiki's main database (e.g. "tt:1123"). This in turn refers to the table or (External Storage).

To reverse engineer which page/revision this relates to, find  for the content address, then find the revision ID for that content slot.

The revision ID can then be used on-wiki in a url like https://en.wikipedia.org/w/index.php?oldid=951705319, or you can look it up in the revision and page tables.

Revision meta data

 * Accessed via.
 * Stored in the WAN cache, using key class.
 * Keys are verifiable (by page and revision ID) and values immutable. Cache is populated on demand.

MessageBlobStore
Stores interface text used by ResourceLoader modules. It is similar to LocalisationCache, but includes the wiki-specific overrides. (LocalisationCache is wiki-agnostic). These overrides come from the database as wiki pages in the MediaWiki-namespace.


 * Accessed via.
 * Stored in the WAN cache, using key class.
 * Keys are verifiable (by ResourceLoader module name and hash of message keys). Values are mutable and expire after a week. Cache populated on demand.
 * All keys are purged when LocalisationCache is rebuilt. When a user saves a change to a MediaWiki-namespace page on the wiki, a subset of the keys is also purged.

Minification cache
ResourceLoader caches the minified versions of raw JavaScript and CSS input files.
 * Accessed via.
 * Stored locally on the server (APCu).
 * Keys are verifiable (deterministic value). No purge strategy needed. Cache populated on demand.
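One way to obtain a deterministic, verifiable key is to derive it from a hash of the input, so that identical input always maps to the same cached output. The Python sketch below assumes that approach; the toy minify function, the SHA-1 choice, and the "example-minify:" prefix are illustrative assumptions, not ResourceLoader's actual implementation.

```python
import hashlib

minify_cache = {}  # stands in for a per-server cache such as APCu
minify_runs = []

def minify(source):
    """Toy 'minifier': drops blank lines and surrounding whitespace."""
    minify_runs.append(source)
    return "\n".join(
        line.strip() for line in source.splitlines() if line.strip()
    )

def cached_minify(source):
    digest = hashlib.sha1(source.encode("utf-8")).hexdigest()
    key = "example-minify:" + digest
    if key not in minify_cache:
        # The key depends only on the input, so the entry never goes stale
        # and no purge strategy is required.
        minify_cache[key] = minify(source)
    return minify_cache[key]

a = cached_minify("  var x = 1;\n\n  var y = 2;  ")
b = cached_minify("  var x = 1;\n\n  var y = 2;  ")
c = cached_minify("var z = 3;")
```

Changed input produces a different key, so old entries are simply never requested again and eventually fall out of the cache.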

LESS compilation cache
ResourceLoader caches the meta data and parser output of LESS files it has compiled.


 * Accessed via.
 * Stored locally on the server (APCu).

File content hasher
ResourceLoader caches the checksum of any file directly or indirectly used by a module. When serving the startup manifest to users, it needs the hashes of many thousands of files. To reduce I/O overhead, it caches this content hash locally, keyed by path and mtime.


 * Accessed via.
 * Stored locally on the server (APCu).
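Keying a content hash by path and mtime can be sketched as follows. This Python example is a minimal illustration under stated assumptions: the dict-based cache, the SHA-1 hash, and the function name file_content_hash are hypothetical and not taken from ResourceLoader.

```python
import hashlib
import os
import tempfile

hash_cache = {}  # stands in for a per-server cache such as APCu
file_reads = []

def file_content_hash(path):
    mtime = os.stat(path).st_mtime_ns  # cheap metadata lookup
    key = (path, mtime)
    if key not in hash_cache:
        # Only read and hash the file when its mtime has changed.
        file_reads.append(path)
        with open(path, "rb") as fh:
            hash_cache[key] = hashlib.sha1(fh.read()).hexdigest()
    return hash_cache[key]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.js")
    with open(path, "w") as fh:
        fh.write("var a = 1;")
    first = file_content_hash(path)
    again = file_content_hash(path)  # served from cache, no file read
    with open(path, "w") as fh:
        fh.write("var a = 2;")
    st = os.stat(path)
    # Force a clearly newer mtime so the demo does not depend on
    # filesystem timestamp resolution.
    os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))
    changed = file_content_hash(path)
```

A stat call is much cheaper than reading and hashing the file, which is where the I/O saving comes from when thousands of files are involved.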