Multi-Content Revisions

Multi-Content Revisions (or MCR) refers to the ability of MediaWiki to store multiple content objects in a single revision of a page.

See Requests_for_comment/Multi-Content_Revisions for the original proposal and technical specification.

What does MCR do?
MCR provides a way to store content in multiple slots on a page. The content may all be of the same kind (use the same content model), or be of different kinds. This can be thought of like attachments on an email.

Slots may be changed separately or together (atomically). Every change to a slot is recorded as an edit to the page, and will show as such in the page’s history. If a slot is not touched by a given edit, it stays unchanged (that is, the slot’s content is inherited from the parent revision).
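
The inheritance rule described above can be illustrated with a small toy model. This is only a conceptual sketch, not MediaWiki's actual storage code or API: the `Revision` class, the `edit` helper, and the "metadata" slot role are hypothetical names chosen for illustration, and content is represented as plain strings.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Revision:
    """Toy model of a page revision holding one content object per slot."""
    rev_id: int
    slots: Dict[str, str]                 # slot role -> content (plain strings here)
    parent: Optional["Revision"] = None


def edit(parent: Optional[Revision], rev_id: int, changes: Dict[str, str]) -> Revision:
    """Create a new revision; slots not listed in `changes` are inherited."""
    slots = dict(parent.slots) if parent is not None else {}
    slots.update(changes)                 # several slots may change together in one edit
    return Revision(rev_id=rev_id, slots=slots, parent=parent)


# First edit creates both slots; the second edit touches only the main slot.
r1 = edit(None, 1, {"main": "Hello", "metadata": '{"license": "CC0"}'})
r2 = edit(r1, 2, {"main": "Hello, world"})

# The untouched slot carries over unchanged from the parent revision.
print(r2.slots["metadata"] == r1.slots["metadata"])
```

Each call to `edit` yields exactly one new revision, regardless of how many slots it changes, which mirrors the way every slot change shows up as a single entry in the page history.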

MCR allows additional data to be integrated with page content in a way that makes it just work with page moves, protection, watching, deletion, diffing, re-rendering, caching, etc.

Is MCR complete?
The storage mechanism for MCR is complete and has been in production since 2019. The migration of the database schema on Wikimedia systems was completed in 2020, and support for the old schema was removed in the 1.35 release.

The original vision for MCR included an easy way for extensions to define where the additional content would be shown on the page, and how it would be edited. As of 2020, this part of the vision has not been implemented, since it was not needed for the initial use case (Structured Data on Commons). A generalized editing mechanism also seemed conceptually questionable, especially for content models that are not text-based and require an interactive user interface for editing.

How does MCR scale?
Since MCR allows more kinds of content to be stored on a page, one might expect it to lead to a need to record additional edits in the database. Typically, however, the information that is recorded in the extra slots would otherwise have been either embedded in the primary content (the wikitext) or placed on an associated page (typically a subpage). Changing this information causes an edit to be recorded in the revision table in either case.

The only additional requirement is for tracking the association between content objects and revisions in the slots table: if a page has three slots, there will be three times as many rows in the slots table for that page as in the revision table. For this reason, the information recorded in the slots table is kept to a minimum (about 25 bytes per row).
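
As a rough back-of-the-envelope illustration of this overhead, the following sketch multiplies the number of revisions by the number of slots and by the approximate row size quoted above. The function name and the example figures (1,000 revisions, three slots) are made up for illustration; real row sizes depend on the database engine and indexes.

```python
from typing import Tuple

BYTES_PER_SLOT_ROW = 25  # approximate figure quoted in the text above


def slot_table_estimate(revisions: int, slots_per_revision: int) -> Tuple[int, int]:
    """Return (row count, approximate bytes) for one page in the slots table."""
    rows = revisions * slots_per_revision   # one row per slot per revision
    return rows, rows * BYTES_PER_SLOT_ROW


rows, size = slot_table_estimate(revisions=1000, slots_per_revision=3)
print(rows, size)  # 3000 75000
```

Under these assumptions, a page with 1,000 revisions and three slots needs about 3,000 slot rows, or roughly 75 KB of row data.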

It is worth noting that the initial use case for MCR (Structured Data on Commons) led to a significant increase in the number of edits. This is, however, due to the data model used for the additional content (Wikibase), which favors high granularity of edits. The effect would have been the same if the data had been stored on separate pages; it is unrelated to MCR.

Does MCR support structured data?
MCR just provides a way to manage multiple content objects per page; it does not know or care how this content is structured (what model it uses). MCR would manage binary audio data, for example, as happily as wikitext or JSON.

However, MCR enables data that was previously embedded in the primary content (typically wikitext) to be managed separately. This allows such data to be stored in a more suitable form, such as JSON. MCR provides a place to store this data and integrates it with the update and rendering mechanism, but it does not provide a user interface for interacting with the data.

What could MCR be used for in the future?
MCR is designed to remove the need to embed structured data in wikitext. One example of this is the way TemplateData places metadata about template parameters on the template page using a special syntax. Instead, this information could be stored in a separate slot, in a machine-readable form such as JSON. This would enable the creation of a specialized API and a dedicated user interface for displaying and manipulating this information. As of 2020, MCR doesn't directly help with creating that API or UI; it just removes the complexities of extracting and replacing structured data embedded in wikitext.
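
The difference between the two approaches can be sketched as follows. This is a hypothetical illustration, not how TemplateData or MediaWiki is implemented: the "templatedata" slot role, the helper names, and the sample data are assumptions, and the regular expression is deliberately naive (it ignores comments, nowiki sections, and other wikitext constructs that a real extractor would have to handle).

```python
import json
import re
from typing import Dict, Optional

# Today: the JSON is embedded in the page's wikitext inside a special tag.
wikitext = """Template documentation goes here.
<templatedata>
{"description": "Shows a greeting", "params": {"name": {"type": "string"}}}
</templatedata>
"""

# With MCR (hypothetically): the same JSON lives in its own slot.
slots: Dict[str, str] = {
    "main": "Template documentation goes here.",
    "templatedata": '{"description": "Shows a greeting", "params": {"name": {"type": "string"}}}',
}


def extract_from_wikitext(text: str) -> Optional[dict]:
    """Naively locate the embedded JSON block and parse it."""
    match = re.search(r"<templatedata>(.*?)</templatedata>", text, re.DOTALL)
    return json.loads(match.group(1)) if match else None


def read_from_slot(page_slots: Dict[str, str], role: str = "templatedata") -> Optional[dict]:
    """Parse the slot content directly; no wikitext scanning is needed."""
    raw = page_slots.get(role)
    return json.loads(raw) if raw is not None else None


print(extract_from_wikitext(wikitext) == read_from_slot(slots))
```

Both functions return the same data here, but only the first has to search through wikitext to find it, and writing a change back would require splicing text into the page rather than replacing one slot's content.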

Another example is categories: wikitext uses a special syntax to place pages in categories. The complex nature of the wikitext syntax makes it hard to reliably extract or change these categories. If the community decides that this should change, MCR could be used to store categories apart from the wikitext (but still as part of the same page), as a data structure that can easily be manipulated.

However, changing the way categories are managed faces some challenges in practice, due to the need to transition from the traditional system to the new system, and because of the way that templates can dynamically construct categories. So while MCR makes it simple to manage category data apart from wikitext, it doesn’t help with transitioning towards that new system, nor does it help with creating an editing interface for this new kind of data.