Page metadata

As the MediaWiki software has grown, more and more meta data have been added to pages:


 * Categories
 * Interwiki links
 * Templates
 * Status notices: copyright violation, under construction, protection, and so on
 * External links

These data are not part of the page itself, but rather information about the page. Thus moving these data out of the page might be beneficial.

Schema to represent meta information
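For each type of meta information, a new pair of DB tables is required, mirroring the existing cur and old tables: one for the current meta data and one for earlier versions. The most recent meta data are combined with the most recent article text to generate the complete article version.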
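A minimal sketch of what such a schema could look like for one type of meta data (categories), using Python's built-in sqlite3 module. All table and column names here (cur_category, old_category, replaced_at, and so on) are hypothetical and chosen only for illustration; they are not an existing MediaWiki schema.

```python
import sqlite3

# Hypothetical schema: one cur/old table pair for the "category" meta data type.
SCHEMA = """
CREATE TABLE cur (page_id INTEGER PRIMARY KEY, title TEXT, text TEXT);
CREATE TABLE cur_category (page_id INTEGER, category TEXT);
CREATE TABLE old_category (page_id INTEGER, category TEXT, replaced_at TEXT);
"""

def replace_categories(db, page_id, categories, timestamp):
    """Archive the current categories of a page, then store the new set."""
    db.execute(
        "INSERT INTO old_category "
        "SELECT page_id, category, ? FROM cur_category WHERE page_id = ?",
        (timestamp, page_id),
    )
    db.execute("DELETE FROM cur_category WHERE page_id = ?", (page_id,))
    db.executemany(
        "INSERT INTO cur_category VALUES (?, ?)",
        [(page_id, category) for category in categories],
    )

def complete_article(db, page_id):
    """Combine the most recent text with the most recent category meta data."""
    title, text = db.execute(
        "SELECT title, text FROM cur WHERE page_id = ?", (page_id,)
    ).fetchone()
    categories = [
        row[0]
        for row in db.execute(
            "SELECT category FROM cur_category WHERE page_id = ? ORDER BY category",
            (page_id,),
        )
    ]
    return {"title": title, "text": text, "categories": categories}
```

Other meta data types would get their own table pair in the same pattern, so the article text table never has to change when a new type is introduced.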

Issues to be solved
Moving an article requires moving its meta data as well.

Conversion of existing meta data to the new DB table representation:
 * Conversion script; done while the site is off-line.
 * Robot driven with user control; done on a live site.
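Either conversion route needs a step that pulls the meta data out of the existing article text. The following is a rough sketch of that step for categories and language links, assuming the usual double-bracket link syntax. The function name split_metadata and the regular expressions are illustrative assumptions; a real script would check language codes against the list of known wikis rather than accept any two- or three-letter prefix.

```python
import re

# Assumed link syntax: [[Category:Name]] or [[Category:Name|sort key]], and [[xx:Title]].
CATEGORY_RE = re.compile(r"\[\[Category:([^\]|]+)(?:\|[^\]]*)?\]\]\n?")
LANGUAGE_RE = re.compile(r"\[\[([a-z]{2,3}):([^\]|]+)\]\]\n?")

def split_metadata(wikitext):
    """Return (text without meta data, list of categories, dict of language links)."""
    categories = [m.group(1).strip() for m in CATEGORY_RE.finditer(wikitext)]
    languages = {
        m.group(1): m.group(2).strip() for m in LANGUAGE_RE.finditer(wikitext)
    }
    # Remove both kinds of links from the text that stays in the article.
    text = LANGUAGE_RE.sub("", CATEGORY_RE.sub("", wikitext)).strip()
    return text, categories, languages
```

An off-line script could apply this to every page in one pass, while a robot on a live site could propose the split for each page and wait for a user to confirm it.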

Advantages
The wiki-language of the article remains clearer:
 * inexperienced users can edit more freely;
 * parsing the text is simpler.

New types of meta data require no new syntax; each type of meta data is recognized by its DB table and associated user interface. Thus it should become easier to add new types of meta data.

Recent changes can be identified by the type of change, and users not interested in certain types of meta data changes need not look at them. Also, certain types of changes, e.g. to categories, can be followed much more easily.
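As an illustration of filtering by change type, the sketch below tags each change with a kind and hides the kinds a user has opted out of. The Change record and its field names are assumptions made for this example only.

```python
from dataclasses import dataclass

@dataclass
class Change:
    page: str
    kind: str      # hypothetical type tag: "text", "category", "language", ...
    summary: str

def filter_changes(changes, hidden_kinds):
    """Return only the changes whose kind the user has not hidden."""
    return [change for change in changes if change.kind not in hidden_kinds]

def changes_of_kind(changes, kind):
    """Return only changes of one kind, e.g. to follow category edits."""
    return [change for change in changes if change.kind == kind]
```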

Changes to meta data can be checked consistently before committing (e.g. categories must exist, interwiki link targets must exist, ...).
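A check of this kind could look roughly like the following. The function validate_metadata and its parameters are hypothetical; the sets of existing categories and known language codes would come from the database in practice.

```python
def validate_metadata(categories, language_links, existing_categories, known_languages):
    """Return a list of problems; an empty list means the change may be committed."""
    errors = []
    for category in categories:
        if category not in existing_categories:
            errors.append(f"unknown category: {category}")
    for code in language_links:
        if code not in known_languages:
            errors.append(f"unknown language code: {code}")
    return errors
```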

Also, certain types of meta data may trigger additional events, like email notifications.
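One simple way to attach such events is a registry of handlers per meta data type, sketched below. The names on_change and record_change are invented for this example, and the handler stands in for something like an email notification.

```python
from collections import defaultdict

# Hypothetical registry: meta data type -> list of handler functions.
_handlers = defaultdict(list)

def on_change(kind, handler):
    """Register a handler to be called when meta data of this kind changes."""
    _handlers[kind].append(handler)

def record_change(kind, page, detail):
    """Notify every handler registered for this kind of meta data."""
    for handler in _handlers[kind]:
        handler(page, detail)
```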

Example: Interwiki in Wikipedia
Currently, the language links are part of the article text, in the form [[CODE:Destination]].

Under the new schema, there would be an interface 'Add/Modify/Remove Language Link', possibly next to the existing language links. To add a link, the user types in a language code or selects one from a drop-down list (the list shows the clear name, the code is inserted) and fills in the name of the destination article in that language. Language editing would thus be a separate interface altogether, with extremely simple parsing. Also, all metadata can be kept in a separate cache, again reducing overall load.
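The logic behind such an interface might be sketched as follows. The LanguageLinks class, its method names, and the small LANGUAGES table are assumptions for illustration; the table maps the clear names shown in the drop-down list to the codes that are actually stored.

```python
# Illustrative subset: clear name shown in the drop-down list -> stored language code.
LANGUAGES = {"Deutsch": "de", "Español": "es", "Français": "fr"}

class LanguageLinks:
    """Stores language links per page, separately from the article text."""

    def __init__(self):
        self._links = {}  # page title -> {language code: destination title}

    def _code(self, language):
        # Accept either a clear name from the list or a typed-in code.
        code = LANGUAGES.get(language, language)
        if code not in LANGUAGES.values():
            raise ValueError(f"unknown language: {language}")
        return code

    def set_link(self, page, language, destination):
        """Add a language link, or modify it if one already exists."""
        self._links.setdefault(page, {})[self._code(language)] = destination

    def remove_link(self, page, language):
        """Remove a language link; do nothing if it is not present."""
        self._links.get(page, {}).pop(self._code(language), None)

    def links_for(self, page):
        """Return the language links of a page, sorted by language code."""
        return dict(sorted(self._links.get(page, {}).items()))
```

Because no wiki text is involved, the only parsing left is the lookup of the language code.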

Users may choose in their preferences whether or not language links are shown.

Disadvantages and open questions

 * The separation of the current data into 'text data' and 'meta data' is somewhat arbitrary, thus potentially confusing.
 * Maintaining an overall revision history becomes more complicated, since the article history is spread over separate 'text data' and 'meta data' histories.
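To illustrate the revision history question, the sketch below merges separate histories into one timeline by timestamp. It assumes each history is already sorted and that each entry is a (timestamp, kind, summary) tuple with ISO-formatted timestamps; both are assumptions made for this example, not part of the proposal.

```python
import heapq

def merged_history(*histories):
    """Merge several sorted histories into one list ordered by timestamp.

    Each entry is assumed to be a (timestamp, kind, summary) tuple.
    """
    return list(heapq.merge(*histories, key=lambda entry: entry[0]))
```

This only restores a combined view. Reconstructing the complete article as it looked at a given time would still require looking up the matching version in every meta data table.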