User:GWicke/PageProperties

The wikitext interface has traditionally mixed page content and metadata liberally. Usually these properties can be added at any point in the page and don't directly produce rendered output:


 * language links are used to render the 'languages' side bar


 * categories are rendered in a separate box below the article


 * Behavior switches [1] affect the way MediaWiki renders the entire page


 * Similarly, the category default sort key [2] changes the way this page sorts within a category member list

As access to these properties is common and needs to be efficient, the PHP parser extracts this information from the page and caches these values for the latest version of each page in the page_props table.

Similarly, the VisualEditor presents these page properties in a 'page property' dialog after extracting this information from the page content. New properties are usually appended at the end of the page content. Since diffing is still wikitext-based, this abstraction breaks down when inspecting changes.

Once bug 49143 is implemented, we'll have the capability to store page properties separately for each revision. This has a number of advantages over the status quo:


 * It provides efficient and convenient access to page properties on both current and old revisions.


 * It reduces the page size when Parsoid-generated HTML+RDFa is used for page views. Metadata can be retrieved separately for editing (see bug 52936).


 * It frees clients from the complexity of extracting this information from the page content.


 * A (wiki-)text based interface for page properties can be provided separately from the main content editing interface. Users don't have to navigate a mix of metadata and content. This is especially relevant for bulkier metadata like category- or page-specific language variant conversion rules.


 * A visual diffing interface can be developed along with the HTML diffing planned for the content, so that users without wikitext knowledge can understand property changes.

This is all fairly straightforward for 'static' properties: those that are added directly in the page content and will only change when the page is edited. Properties added from transclusions, however, need to be handled differently:

1) They cannot be edited directly

2) They can change whenever a transclusion is re-rendered

We can reflect this by adding a 'dynamic' marker on page properties that are only added from transclusions.
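As a rough illustration of such a marker, the sketch below models a page property as a small record with a boolean flag. The class name `PageProperty`, its fields, and the helper `editable_properties` are hypothetical names chosen for this example; they are not part of MediaWiki or Parsoid.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageProperty:
    """Hypothetical record for one page property of a revision."""
    name: str              # e.g. "category" or "defaultsort"
    value: str
    dynamic: bool = False  # True if the property is only added by a transclusion


def editable_properties(props):
    """Return the static properties, i.e. those a user can edit directly."""
    return [p for p in props if not p.dynamic]


# Example data (made up for illustration).
props = [
    PageProperty("category", "Physics"),
    PageProperty("category", "Articles needing cleanup", dynamic=True),
    PageProperty("defaultsort", "Einstein, Albert"),
]

static_props = editable_properties(props)
static_names = [p.name for p in static_props]
```

A property editor would only offer `static_props` for editing, while the dynamic entries could be shown read-only and refreshed whenever the transclusion that produced them is re-rendered.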

LinksUpdate jobs in response to template edits are very expensive as they affect a large number of pages. To avoid overloading the systems, we currently ignore very large LinksUpdate jobs completely. In Parsoid we would like to make this more efficient by only re-expanding transclusions that used the edited template. If we stored a compact reference count for properties and outgoing links, we could avoid the need to parse existing HTML to DOM. Only affected templates need to be re-expanded and spliced into the DOM tree. The updated reference counts in page properties directly provide the list of link table updates to perform.
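The reference-count idea can be sketched as follows, assuming the counts are kept per page as a mapping from a (kind, target) pair to the number of places on the page that produce it. The function `apply_reexpansion` and this key layout are assumptions made for the example, not an existing Parsoid or MediaWiki interface.

```python
from collections import Counter


def apply_reexpansion(refcounts, old_links, new_links):
    """Update per-page refcounts after one transclusion is re-expanded.

    refcounts: Counter mapping (kind, target) -> number of occurrences on the page
    old_links: links/properties the transclusion produced before the template edit
    new_links: links/properties it produces after re-expansion
    Returns (updated_counts, added, removed); added and removed are the
    link table rows to insert and delete.
    """
    updated = Counter(refcounts)
    updated.subtract(Counter(old_links))
    updated.update(Counter(new_links))
    # A row is new if its count rose from zero, and gone if it dropped to zero.
    added = sorted(k for k, n in updated.items()
                   if n > 0 and refcounts.get(k, 0) <= 0)
    removed = sorted(k for k, n in updated.items()
                     if n <= 0 and refcounts.get(k, 0) > 0)
    # Drop entries that are no longer referenced.
    updated = Counter({k: n for k, n in updated.items() if n > 0})
    return updated, added, removed


# Example data (made up): "Physics" is referenced twice, once by the template.
page_counts = Counter({
    ("category", "Physics"): 2,
    ("category", "Stubs"): 1,
    ("link", "Energy"): 1,
})
old = [("category", "Stubs"), ("category", "Physics")]
new = [("category", "Physics"), ("category", "Cleanup")]

updated, added, removed = apply_reexpansion(page_counts, old, new)
```

In this example only the "Stubs" row would be deleted and only the "Cleanup" row inserted; "Physics" stays untouched because another part of the page still references it, which is the case a plain set of links could not distinguish.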

[1]: https://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec#Behavior_switches
[2]: https://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec#Category_default_sort_key