Manual:Modeling pages

From mediawiki.org
This page is about how pages, titles, and links are modeled in MediaWiki. For information on how page content is represented, see Manual:Page content models.

Historically, the concepts of pages, titles, and links have not been modeled clearly in MediaWiki. Several efforts have been made to improve and clarify the modeling, but these efforts are incomplete as of MW 1.41 (July 2023). This page provides an overview of the classes and interfaces that can be used to represent pages, titles, and links in MediaWiki.

Legacy Model

The legacy model consists of the Title and WikiPage classes. Both should be avoided in favor of narrower interfaces, especially in type hints of public methods.

The Title class has historically been used to represent both pages on the local wiki, and any kind of target a link may reference. For this reason, calling code cannot be sure what operations are well defined on a given Title object without performing additional checks or imposing additional assumptions and requirements. Titles may represent:

  • A regular editable wiki page on the local wiki, existing or non-existing.
  • A link to a section on an editable page. Methods intended for use on editable pages have undefined/misleading behavior.
  • A special page on the local wiki. Methods intended for use on editable pages have undefined/misleading behavior.
  • An interwiki (or inter-language) link. Methods intended for use on editable pages have undefined/misleading behavior.
  • A relative section jump on the current page. Methods intended for use on editable pages have undefined/misleading behavior.
  • An invalid link target. Most methods have undefined/misleading behavior.
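The ambiguity described in the list above can be made concrete with a small sketch. The Python below is a toy model, not MediaWiki code (MediaWiki itself is PHP): `LegacyTitle`, `TitleKind`, and `classify` are hypothetical names, and the ordering of the checks is an assumption made for illustration. It shows the kind of case analysis a caller has to perform before it can safely treat a Title as an editable page.

```python
from dataclasses import dataclass
from enum import Enum

NS_SPECIAL = -1  # namespace ID of special pages in MediaWiki


class TitleKind(Enum):
    EDITABLE_PAGE = "editable page"
    SECTION_LINK = "section on an editable page"
    SPECIAL_PAGE = "special page"
    INTERWIKI_LINK = "interwiki link"
    RELATIVE_SECTION = "section jump on the current page"
    INVALID = "invalid link target"


@dataclass(frozen=True)
class LegacyTitle:
    """Hypothetical stand-in for the fields a Title object carries."""
    namespace: int
    db_key: str
    fragment: str = ""
    interwiki: str = ""


def classify(title: LegacyTitle) -> TitleKind:
    """Return which of the cases listed above a title falls into."""
    if title.interwiki:
        return TitleKind.INTERWIKI_LINK
    if not title.db_key:
        # An empty title only makes sense as "#Section" on the current page.
        return TitleKind.RELATIVE_SECTION if title.fragment else TitleKind.INVALID
    if title.namespace == NS_SPECIAL:
        return TitleKind.SPECIAL_PAGE
    if title.fragment:
        return TitleKind.SECTION_LINK
    return TitleKind.EDITABLE_PAGE
```

Only the `EDITABLE_PAGE` case supports the full set of page operations; every other case is one where, per the list above, methods intended for editable pages are not well defined.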

The WikiPage class has historically been used for interacting with the content of editable wiki pages. It used to contain the logic for updating the page table, which has mostly been extracted into other classes like PageStore and PageUpdater.

Improved Model

Figure: UML diagram of classes that model pages and titles in MediaWiki.

Because of the ambiguities of the legacy model, the use of the Title and WikiPage classes has been discouraged since MW 1.36 (2021). Several narrow interfaces have been extracted for the use cases described above:

  • The LinkTarget interface (since MW 1.27) can represent anything a wiki-link can refer to. It is implemented by the TitleValue class. Since MW 1.42, the Parsoid wikitext parser supports LinkTarget directly. Examples: Special:ApiSandbox, meta:Main Page, Main Page, Non-existent page, #Service Objects.
  • The PageReference interface (since MW 1.37) represents a viewable page, such as a wiki page or a special page. It is a WikiAwareEntity, so it may belong to the local wiki or to another wiki that can be accessed directly on the database level. It is implemented by the PageReferenceValue class. Note that PageReference and LinkTarget are incompatible types; see the LinkTarget vs. PageReference section below. Examples: Special:ApiSandbox, Main Page, Non-existent page.
  • The PageIdentity interface (since MW 1.36) represents an editable wiki page which may or may not exist. PageIdentity extends the PageReference interface, and is thus also a WikiAwareEntity. It is implemented by the PageIdentityValue class, which extends PageReferenceValue. It knows the page ID if the page exists. Examples: Main Page, Non-existent page.
  • The PageRecord interface (since MW 1.36) represents an existing editable wiki page, and provides access to the page's metadata. It extends the PageIdentity interface, and is thus also a PageReference and a WikiAwareEntity. It is implemented by the PageStoreRecord class, which extends PageIdentityValue. It knows the page ID and the revision ID of the top revision. Example: Main Page.
Overview of the parts of MediaWiki links/titles/pages that can be represented by each class

Interface       Implementation      Interwiki  Wiki ID  Namespace ID  Title text  DB key    Fragment  Page ID  Revision ID
Example         -                   en:        enwiki   NS_TALK       New York    New_York  #History  6678     1164229740
LinkTarget      TitleValue          Yes        No       Yes           Yes         Yes       Yes       No       No
PageReference   PageReferenceValue  No         Yes      Yes           No          Yes       No        No       No
PageIdentity    PageIdentityValue   No         Yes      Yes           No          Yes       No        Yes      No
PageRecord      PageStoreRecord     No         Yes      Yes[1]        No          Yes       No        Yes      Yes
Title           (legacy class)      Yes        Yes[2]   Yes           Yes[3]      Yes[3]    Yes       Yes      Yes
WikiPage        (legacy class)      No         Yes[2]   Yes           No          Yes       No        Yes      Yes

  1. Only allows namespaces where users may create and edit pages, e.g. not NS_SPECIAL.
  2. Only allows a wiki ID referring to the local wiki.
  3. Allows empty titles to represent relative links to a section on the "current" page.
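The inheritance chain described above can be sketched with plain data classes. This is a toy Python model for illustration only: the class names mirror the MediaWiki value classes, but the field names, the field order, and the use of a string wiki ID are assumptions of the sketch and do not reproduce the PHP constructors.

```python
from dataclasses import dataclass

NS_TALK = 1  # namespace ID of talk pages in MediaWiki


@dataclass(frozen=True)
class TitleValue:
    """Link target: knows interwiki prefix and fragment, but no wiki ID."""
    namespace: int
    db_key: str
    fragment: str = ""
    interwiki: str = ""


@dataclass(frozen=True)
class PageReferenceValue:
    """Viewable page: knows which wiki it belongs to."""
    wiki_id: str
    namespace: int
    db_key: str


@dataclass(frozen=True)
class PageIdentityValue(PageReferenceValue):
    """Editable page that may or may not exist (page_id 0 = does not exist)."""
    page_id: int

    def exists(self) -> bool:
        return self.page_id > 0


@dataclass(frozen=True)
class PageStoreRecord(PageIdentityValue):
    """Existing page: additionally knows its top revision."""
    latest_rev_id: int


record = PageStoreRecord(
    wiki_id="enwiki", namespace=NS_TALK, db_key="New_York",
    page_id=6678, latest_rev_id=1164229740,
)
link = TitleValue(namespace=NS_TALK, db_key="New_York",
                  fragment="History", interwiki="en")
```

In this model `record` can be passed wherever a page identity or page reference is expected, while `link` belongs to a separate type family, which matches the "Yes/No" pattern in the table.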

Service Objects

Service objects can be used to obtain instances of the value objects that represent links and pages. The most important service object is PageStore (or better, the PageLookup interface). It can be used to obtain PageReference and PageRecord instances as follows:

Method                 Input                                Existing page       Non-existing page     Non-proper page
getPageById            int $pageId                          returns PageRecord  returns null          n/a
getPageByName          int $namespace, string $dbKey        returns PageRecord  returns null          throws InvalidArgumentException
getPageByReference     PageReference $page                  returns PageRecord  returns null          throws InvalidArgumentException
getPageByText          string $text, int $defaultNamespace  returns PageRecord  returns PageIdentity  returns null
getExistingPageByText  string $text, int $defaultNamespace  returns PageRecord  returns null          returns null
getPageForLink         LinkTarget $link                     returns PageRecord  returns PageIdentity  throws InvalidArgumentException
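The three outcome columns of the table can be illustrated with a toy in-memory lookup. The Python below is a sketch under stated assumptions, not the PageStore implementation: `ToyPageStore` and `parse_proper_page` are hypothetical, the parser only understands main-namespace titles and rejects anything starting with "Special:", and `ValueError` stands in for PHP's InvalidArgumentException.

```python
from typing import NamedTuple, Optional, Tuple, Union

NS_SPECIAL = -1
NS_MAIN = 0


class PageIdentity(NamedTuple):
    namespace: int
    db_key: str
    page_id: int  # 0 for a page that does not exist


class PageRecord(NamedTuple):
    namespace: int
    db_key: str
    page_id: int
    latest_rev_id: int


def parse_proper_page(text: str) -> Optional[Tuple[int, str]]:
    """Toy parser: returns None for input that is not a proper page."""
    text = text.strip()
    if not text or text.startswith("#") or text.lower().startswith("special:"):
        return None
    return NS_MAIN, text.replace(" ", "_")


class ToyPageStore:
    def __init__(self, records):
        self._rows = {(r.namespace, r.db_key): r for r in records}

    def get_page_by_name(self, namespace: int, db_key: str) -> Optional[PageRecord]:
        if namespace < 0 or not db_key:
            raise ValueError("not a proper page")  # non-proper page column
        return self._rows.get((namespace, db_key))  # None if non-existing

    def get_page_by_text(self, text: str) -> Union[PageRecord, PageIdentity, None]:
        parsed = parse_proper_page(text)
        if parsed is None:
            return None
        record = self._rows.get(parsed)
        # A non-existing page still yields an identity, just without a page ID.
        return record if record is not None else PageIdentity(*parsed, 0)

    def get_existing_page_by_text(self, text: str) -> Optional[PageRecord]:
        parsed = parse_proper_page(text)
        return None if parsed is None else self._rows.get(parsed)
```

The design point the table makes is visible here: the "by text" methods absorb bad input and report it as null, whereas the methods taking already-structured input treat a non-proper page as a programming error.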

For converting LinkTarget and PageReference objects to their string representation, use the TitleFormatter service object:

Method            Input                                               Form     Namespace  Interwiki  Fragment
formatTitle       $namespace, $text, $fragment = '', $interwiki = ''  display  yes        yes        yes
getText           LinkTarget|PageReference $title                     display  no         no         no
getPrefixedText   LinkTarget|PageReference $title                     display  yes        yes        no
getPrefixedDBkey  LinkTarget|PageReference $title                     key      yes        yes        no
getFullText       LinkTarget|PageReference $title                     display  yes        yes        yes
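To make the difference between the forms concrete, the following Python sketch reproduces the columns of the table for the New York example used earlier. It is a toy model, not the TitleFormatter implementation: `LinkTargetSketch`, the function names, and the two-entry `NAMESPACE_NAMES` map are assumptions of the sketch.

```python
from dataclasses import dataclass

# Assumed namespace names for the sketch; a real wiki defines many more.
NAMESPACE_NAMES = {0: "", 1: "Talk"}


@dataclass(frozen=True)
class LinkTargetSketch:
    namespace: int
    db_key: str
    fragment: str = ""
    interwiki: str = ""


def _with_prefixes(target: LinkTargetSketch, text: str) -> str:
    """Prepend the interwiki prefix and namespace name, skipping empty parts."""
    parts = [target.interwiki, NAMESPACE_NAMES[target.namespace], text]
    return ":".join(p for p in parts if p)


def get_text(target: LinkTargetSketch) -> str:
    # Display form: underscores in the DB key become spaces.
    return target.db_key.replace("_", " ")


def get_prefixed_text(target: LinkTargetSketch) -> str:
    return _with_prefixes(target, get_text(target))


def get_prefixed_db_key(target: LinkTargetSketch) -> str:
    # Key form: keeps underscores.
    return _with_prefixes(target, target.db_key)


def get_full_text(target: LinkTargetSketch) -> str:
    text = get_prefixed_text(target)
    return f"{text}#{target.fragment}" if target.fragment else text


target = LinkTargetSketch(namespace=1, db_key="New_York",
                          fragment="History", interwiki="en")
print(get_text(target))             # New York
print(get_prefixed_text(target))    # en:Talk:New York
print(get_prefixed_db_key(target))  # en:Talk:New_York
print(get_full_text(target))        # en:Talk:New York#History
```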

For constructing a LinkTarget from a string, use the TitleParser service object.
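Parsing is roughly the inverse of formatting. The Python sketch below shows the idea on the link examples given earlier on this page; it is a toy, not the TitleParser implementation. The `INTERWIKI_PREFIXES` and `NAMESPACES` tables are assumed sample configuration, `parse_title` is a hypothetical function, and `ValueError` stands in for whatever error the real parser reports for malformed input.

```python
from typing import NamedTuple

NS_SPECIAL = -1
NS_MAIN = 0
NS_TALK = 1

# Assumed sample configuration; real wikis load this from site settings.
INTERWIKI_PREFIXES = {"meta"}
NAMESPACES = {"special": NS_SPECIAL, "talk": NS_TALK}


class LinkTargetSketch(NamedTuple):
    namespace: int
    db_key: str
    fragment: str
    interwiki: str


def parse_title(text: str, default_namespace: int = NS_MAIN) -> LinkTargetSketch:
    # Everything after the first "#" is the fragment.
    text, _, fragment = text.partition("#")
    namespace, interwiki = default_namespace, ""

    prefix, sep, rest = text.partition(":")
    key = prefix.strip().lower()
    if sep and key in INTERWIKI_PREFIXES:
        # The remainder is interpreted by the target wiki, so keep it as-is.
        interwiki, text = key, rest
    elif sep and key in NAMESPACES:
        namespace, text = NAMESPACES[key], rest

    db_key = text.strip().replace(" ", "_")
    if not db_key and not fragment:
        raise ValueError("empty title")
    return LinkTargetSketch(namespace, db_key, fragment.strip(), interwiki)
```

For example, `parse_title("meta:Main Page")` yields a target with interwiki prefix "meta" and DB key "Main_Page", and `parse_title("#Service Objects")` yields an empty DB key with only a fragment, which is the relative section link case from the overview table.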

Backwards Compatibility

In order to retain backwards compatibility, the Title class implements the LinkTarget and PageIdentity interfaces. Similarly, WikiPage implements PageRecord. However, the intended semantics of these interfaces do not hold for all possible instances of Title and WikiPage:

  • Not all Title objects represent editable wiki pages, so not all PageIdentity objects are actually editable wiki pages. The ProperPageIdentity interface was introduced to allow code to require the guarantee that a PageIdentity is actually an editable wiki page. PageIdentityValue implements ProperPageIdentity, and instances can be obtained from Title::toPageIdentity. Once the Title class has been removed, ProperPageIdentity will become an alias for PageIdentity, which will then be guaranteed to represent an editable wiki page.
  • Not all WikiPage objects represent existing wiki pages, so not all PageRecord objects are actually existing wiki pages. The ExistingPageRecord interface was introduced to allow code to require the guarantee that a PageRecord is actually an existing wiki page. PageStoreRecord implements ExistingPageRecord, and instances can be obtained from WikiPage::toPageRecord. Once the WikiPage class has been removed, ExistingPageRecord will become an alias for PageRecord, which will then be guaranteed to represent an existing wiki page.
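The conversion from a permissive legacy object to a guaranteed "proper" value can be sketched as a checked narrowing step. The Python below is a toy model: `LegacyTitleSketch`, `ProperPageIdentitySketch`, and the exact conditions in `can_exist` are assumptions for illustration, and `ValueError` stands in for whatever exception the PHP conversion methods raise.

```python
from dataclasses import dataclass

NS_SPECIAL = -1
NS_MAIN = 0


@dataclass(frozen=True)
class ProperPageIdentitySketch:
    """Only ever constructed for titles that can be editable wiki pages."""
    namespace: int
    db_key: str


@dataclass(frozen=True)
class LegacyTitleSketch:
    """Permissive legacy object: may also be a special page or interwiki link."""
    namespace: int
    db_key: str
    interwiki: str = ""

    def can_exist(self) -> bool:
        return self.namespace >= 0 and self.db_key != "" and self.interwiki == ""

    def to_page_identity(self) -> ProperPageIdentitySketch:
        # Fail loudly at the boundary instead of letting an improper
        # title travel further into code that expects an editable page.
        if not self.can_exist():
            raise ValueError(f"not a proper page: {self!r}")
        return ProperPageIdentitySketch(self.namespace, self.db_key)
```

Code that type-hints against the narrowed value no longer needs its own checks, because the check has already happened once at the point of conversion.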

LinkTarget vs. PageReference

Note that PageReference and LinkTarget are incompatible interfaces, though one would expect that all PageReferences "are" link targets. However, as of MW 1.41, LinkTargets are not WikiAwareEntities: they can only represent links that originate in the local wiki. For this reason, many methods accept both types interchangeably (they accept the union type PageReference|LinkTarget).

Removing this incompatibility would require LinkTarget to become a WikiAwareEntity, so LinkTarget and PageReference could share a base class. However, this is not trivial: WikiAwareEntities know the ID of the wiki they belong to. LinkTargets on the other hand know an interwiki prefix, which represents the wiki they refer to. The relationship between the wiki ID and the interwiki prefix can easily lead to confusion, which (as of July 2023) has prevented this issue from being resolved properly. Here is an example illustrating the issue:

While processing a request for English Wikipedia (enwiki), we load a LinkTarget from the database of English Wiktionary (enwiktionary). This LinkTarget has the interwiki prefix "fr". On Wikipedia, the "fr" prefix refers to French Wikipedia, but on Wiktionary, it refers to French Wiktionary! So, in order to determine the URL of the page that the LinkTarget refers to, we have to interpret the prefix "fr" in the context of the wiki that the link originates from (enwiktionary). To do this, we have to load the interwiki configuration used by enwiktionary while handling a request for enwiki.

This illustrates that, if we make LinkTargets wiki-aware, we have to be very careful about interpreting them in the right context. We would be breaking the assumption that they can always be interpreted based on the configuration of the local wiki.
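The "fr" prefix example can be sketched in a few lines. The Python below is a toy: `INTERWIKI_MAPS` is an assumed, heavily reduced stand-in for each wiki's interwiki configuration, and `resolve` is a hypothetical helper. It shows that the same prefix and title produce different URLs depending on which wiki's configuration is used to interpret them.

```python
# Assumed sample of per-wiki interwiki configuration, keyed by wiki ID.
INTERWIKI_MAPS = {
    "enwiki": {"fr": "https://fr.wikipedia.org/wiki/$1"},
    "enwiktionary": {"fr": "https://fr.wiktionary.org/wiki/$1"},
}


def resolve(origin_wiki: str, interwiki: str, db_key: str) -> str:
    """Interpret an interwiki prefix in the context of the originating wiki."""
    url_pattern = INTERWIKI_MAPS[origin_wiki][interwiki]
    return url_pattern.replace("$1", db_key)


# The same (prefix, title) pair, read in two different contexts:
as_seen_from_enwiki = resolve("enwiki", "fr", "Chat")
as_seen_from_enwiktionary = resolve("enwiktionary", "fr", "Chat")
```

A LinkTarget carries only the prefix and title, not the `origin_wiki` argument, which is why it can currently only be interpreted safely against the local wiki's configuration.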

On the other hand, so far there has been no need to process interwiki links defined on one wiki while handling a request for another wiki. For this reason, LinkTarget is not yet a WikiAwareEntity and remains incompatible with PageReference.

The Parsoid wikitext parser is similarly not "wiki aware" as it always processes wikitext in the context of the local wiki. Parsoid uses the LinkTarget interface when communicating with core.