Picking the right cache

From mediawiki.org

MediaWiki has a variety of caching and persistence layers. Each layer has its own advantages and disadvantages, uses and misuses. In general, when choosing where to store data in MediaWiki, you should take into consideration the following:

  • Is the data original, or is it generated from other data?
    • How long does it take to regenerate the data?
  • How big is the data, and what format is it in?
  • For how long is the data expected to be stored and retrievable?

The goal of this document is to force you to answer these questions, and then provide suggestions on where your data should be stored in MediaWiki.

Summary

| Layer name | Description | Structured | Persistent | Per-user |
|---|---|---|---|---|
| LocalStorage | A client-side storage layer in the user's browser. | No | Yes | Yes |
| Browser cache | An HTML-only cache implemented in the user's browser. It caches web pages delivered by MediaWiki and can be controlled via caching headers sent to the browser. | No | No | Yes |
| CDN cache | A server-side caching proxy implemented by software separate from MediaWiki (e.g. Varnish at WMF). It stores copies of web responses from MediaWiki, such as HTML, JS and CSS. | No | No | No |
| Object cache | Server-side key-value caches within MediaWiki, accessible via a service class such as LocalServerObjectCache, MainWANObjectCache, and MainObjectStash (see Manual:Object cache). Each of these is backed by a BagOStuff subclass, abstracting php-apcu, Memcached, Redis, or MySQL. See also Memcached at WMF. | No | No | Yes |

Interface v. implementation

There is a fundamental separation between the interface of a caching layer and its implementation. The interface is the contract between the programmer and MediaWiki: when a caching layer has a certain property in its interface, every implementation of that layer is guaranteed to follow that property.

For example, the object cache layer is listed in the above summary as being "not persistent". This is an interface detail. It means that when you use that caching layer, you should expect data to be erased at any moment. When and how data is erased, however, is implementation-defined. In other words, a given implementation of the object cache layer may in fact be persistent: all it has to do is never erase data. Even so, programmers should still treat the layer as non-persistent and use it as if data could disappear at any moment.

In MediaWiki, every caching layer has a number of implementations. Each implementation functions differently, but is guaranteed to follow the contract of the layer. Here are some example implementations for each layer:

  • LocalStorage and Browser cache
    • Google Chrome
    • Firefox
  • HTTP cache
    • Varnish
    • Squid
  • Object cache
    • Memcached
    • Redis
  • Session data
    • Built-in PHP file-based sessions
    • Redis
  • SQL store
    • MySQL
    • MariaDB
    • PostgreSQL

Notice that some implementations can be used for different layers. For example, Redis can serve as an object cache, a session data store, and a key-value store. This follows from the idea explained above: an object cache is only assumed to be non-persistent, so a backend that happens to retain data, such as Redis, can still implement it.
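The separation between contract and backend can be illustrated with a small sketch. It is written in Python purely for illustration and is not MediaWiki's API: the names `ObjectCache`, `NeverEvictingCache`, `SingleSlotCache`, and `render_title` are all hypothetical. Both backends satisfy the same "data may vanish at any time" contract, and the caller is written so that it works with either one.

```python
from abc import ABC, abstractmethod
from typing import Optional


class ObjectCache(ABC):
    """Hypothetical contract: get() may return None for any key at any time."""

    @abstractmethod
    def get(self, key: str) -> Optional[str]: ...

    @abstractmethod
    def set(self, key: str, value: str) -> None: ...


class NeverEvictingCache(ObjectCache):
    """Happens to keep everything, yet only promises the contract above."""

    def __init__(self) -> None:
        self._data = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


class SingleSlotCache(ObjectCache):
    """Keeps only the most recently written key; older entries vanish."""

    def __init__(self) -> None:
        self._key: Optional[str] = None
        self._value: Optional[str] = None

    def get(self, key: str) -> Optional[str]:
        return self._value if key == self._key else None

    def set(self, key: str, value: str) -> None:
        self._key, self._value = key, value


def render_title(cache: ObjectCache, page: str) -> str:
    """Caller written against the contract: a miss is handled, never assumed away."""
    key = f"title:{page}"
    cached = cache.get(key)
    if cached is None:
        cached = page.replace("_", " ")  # stand-in for an expensive computation
        cache.set(key, cached)
    return cached
```

Because `render_title` never assumes a previous `set` is still visible, swapping one backend for the other changes only how often the value is recomputed, not whether the result is correct.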

Browser-level storage

The first step in caching is the user's browser. The browser has LocalStorage (among others, such as WebSQL, which MediaWiki does not use at the moment) and an HTML page cache. Some properties of browser caches are:

  • They are not just per-user, but per-browser, meaning a user can change computers and the cache will be different. This cache should only be used for data that can be re-fetched from the server if necessary.
  • They are entirely client-side. Thus there is no download necessary, and accessing the cache is very fast for the user. Of course, this means that you do not have access to any server-side resources.
  • They vary greatly between implementations. Unfortunately, not every browser implements caching the same way. There are standard interfaces, as set forth by the W3C, and in many cases the browser will stick to the standard, but always expect the unexpected.

A notable user of LocalStorage in MediaWiki by default is ResourceLoader, which uses it to cache JavaScript bundles. Read more about this at ResourceLoader/Architecture#Store.

Object cache

The object cache is perhaps the most used caching layer in MediaWiki. It is usually implemented by either Memcached or Redis, and functions as a quick way to cache results of expensive operations. This layer is not persistent, and you should expect data stored in this layer to disappear at any time, without warning! In practice, since it is a cache, data usually does not disappear immediately, and the object cache can generally be expected to hold data for a fairly long period of time. Code must nevertheless be able to regenerate any value it stores there.
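A common way to use a non-persistent cache safely is a "get or compute" helper: look the value up, and on a miss regenerate it and store it with a time-to-live. The sketch below is a minimal Python illustration under stated assumptions, not MediaWiki code; `VolatileCache`, `purge`, and `get_or_compute` are hypothetical names, and the in-memory dictionary merely stands in for a backend such as Memcached or Redis.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class VolatileCache:
    """Hypothetical in-memory stand-in for an object cache with expiring entries."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, object]] = {}

    def get(self, key: str) -> Optional[object]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: behave as if it was never stored
            return None
        return value

    def set(self, key: str, value: object, ttl: float) -> None:
        self._entries[key] = (self._clock() + ttl, value)

    def purge(self) -> None:
        """Simulate the backend dropping everything (restart, eviction)."""
        self._entries.clear()


def get_or_compute(cache: VolatileCache, key: str, ttl: float,
                   compute: Callable[[], object]) -> object:
    """Return the cached value, regenerating and re-storing it on a miss.

    In this sketch None signals a miss, so a computed None is never reused.
    """
    value = cache.get(key)
    if value is None:
        value = compute()
        cache.set(key, value, ttl)
    return value
```

With this shape, an expired or purged entry costs only one extra call to `compute`; the caller never sees a failure. This is why the layer should hold only data that can be regenerated from another source.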