Page metadata

As the MediaWiki software has grown, more and more metadata have been added to pages:


 * Categories
 * Interwiki links
 * Templates
 * Status notices (copyright violation, under construction, protection, and so on)
 * External links

These data are not part of the page content itself, but rather information about the page. Moving at least some of them out of the page text might therefore be beneficial.

Schema options for metadata storage
The most recent metadata are combined with the most recent article text to generate the complete article version. Options for storing metadata:

rv_metadata field option
The metadata can be stored in revision.rv_metadata.
 * Downside: this could require storing the same voluminous metadata with every revision, which is wasteful when the metadata have not changed.

metadata table option with page_metadata field
The metadata can be stored in a separate table, metadata, with metadata_id as its primary key. Each time the metadata are revised, a new metadata row is created and page.page_metadata is updated to point to the new metadata_id.
 * What about older versions of metadata and article text? How should those be combined? Probably, when a revision is saved, revision.rv_metadata should be populated with the current page_metadata (i.e. the most recent metadata_id for that page). Then, when an old revision is viewed, the software knows which metadata to combine with the archived text.
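
The bookkeeping described above can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module, not MediaWiki code. Only metadata_id, page.page_metadata and revision.rv_metadata come from the proposal; the columns metadata_blob, page_title, rv_id, rv_page and rv_text, and all function names, are assumptions made for the example.

```python
import sqlite3


def make_db():
    """Create an in-memory database with the three tables of the proposal."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE metadata (
            metadata_id   INTEGER PRIMARY KEY,
            metadata_blob TEXT NOT NULL            -- assumed column
        );
        CREATE TABLE page (
            page_id       INTEGER PRIMARY KEY,
            page_title    TEXT NOT NULL,           -- assumed column
            page_metadata INTEGER REFERENCES metadata(metadata_id)
        );
        CREATE TABLE revision (
            rv_id         INTEGER PRIMARY KEY,     -- assumed column
            rv_page       INTEGER NOT NULL REFERENCES page(page_id),
            rv_text       TEXT NOT NULL,           -- assumed column
            rv_metadata   INTEGER REFERENCES metadata(metadata_id)
        );
    """)
    return conn


def save_metadata(conn, page_id, blob):
    """Insert a new metadata row and point the page at it."""
    cur = conn.execute(
        "INSERT INTO metadata (metadata_blob) VALUES (?)", (blob,))
    conn.execute(
        "UPDATE page SET page_metadata = ? WHERE page_id = ?",
        (cur.lastrowid, page_id))
    return cur.lastrowid


def save_revision(conn, page_id, text):
    """Store a text revision stamped with the page's current metadata_id."""
    (current,) = conn.execute(
        "SELECT page_metadata FROM page WHERE page_id = ?",
        (page_id,)).fetchone()
    cur = conn.execute(
        "INSERT INTO revision (rv_page, rv_text, rv_metadata) VALUES (?, ?, ?)",
        (page_id, text, current))
    return cur.lastrowid


def view_revision(conn, rv_id):
    """Return (text, metadata) as they were when the revision was saved."""
    return conn.execute(
        "SELECT r.rv_text, m.metadata_blob FROM revision r "
        "LEFT JOIN metadata m ON m.metadata_id = r.rv_metadata "
        "WHERE r.rv_id = ?", (rv_id,)).fetchone()
```

Because each revision only stores a metadata_id, several text revisions that share unchanged metadata point at the same metadata row, which avoids the duplication noted as the downside of the rv_metadata field option.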

Wikidata option
Store metadata in Wikidata. Users will have to use Wikidata anyway for interlanguage links and the like, so this adds little extra burden.

Separate pages for metadata
Have a namespace or subpages with page metadata (Metadata:Foo or Foo/metadata for the Foo article). See Extension:ExplicitDescription.

Additional edit screen inputboxes
Have separate edit screen inputboxes for various types of metadata; see Extension:Advanced Meta for an example of this.

Metadata tags
Use tags in the page text to set metadata explicitly; the tagged values are stored as metadata when the page is saved (e.g. a description tag enclosing the text PageDescription). See Extension:MetaDescriptionTag for more examples.
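
A rough sketch of how tagged values could be separated from the page text at save time. The tag name pagemeta and its name attribute are invented for this example and are not the syntax of Extension:MetaDescriptionTag or any other extension; the function name is likewise an assumption.

```python
import re

# Hypothetical tag syntax: <pagemeta name="KEY">VALUE</pagemeta>
TAG_RE = re.compile(r'<pagemeta\s+name="([^"]+)">(.*?)</pagemeta>', re.DOTALL)


def split_metadata(wikitext):
    """Return (text without metadata tags, dict of metadata found in them)."""
    metadata = {}

    def collect(match):
        # Record the tagged value and remove the tag from the text.
        metadata[match.group(1)] = match.group(2).strip()
        return ""

    body = TAG_RE.sub(collect, wikitext)
    return body.strip(), metadata
```

For instance, split_metadata('Intro text.\n<pagemeta name="description">PageDescription</pagemeta>') yields the text "Intro text." and the dictionary {"description": "PageDescription"}.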

Automated description extraction
Similar to what Extension:Description2 does: strip out sitenotices and such and put the article lead in the page description database field, unless there is metadata overriding this description.
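
The idea can be illustrated with a deliberately simplified sketch. It is not how Extension:Description2 is implemented: the function name, the max_len parameter and the regular expressions are assumptions, and real wikitext (nested templates, tables, references) needs a proper parser.

```python
import re


def extract_description(wikitext, override=None, max_len=200):
    """Return the override if given, else the first prose paragraph."""
    if override:
        return override.strip()
    # Drop simple, non-nested templates such as notice banners.
    text = re.sub(r"\{\{[^{}]*\}\}", "", wikitext)
    # [[target|label]] becomes label; [[target]] becomes target.
    text = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]*)\]\]", r"\1", text)
    # Remove bold and italic quote markup.
    text = re.sub(r"'{2,}", "", text)
    for para in text.split("\n\n"):
        para = " ".join(para.split())
        # Skip empty chunks, headings and list items.
        if para and not para.startswith(("=", "*", "#")):
            return para[:max_len]
    return ""
```

Given "{{Under construction}}\n\n'''Foo''' is a [[bar|kind of bar]] in [[Baz]].\n\n== History ==\nMore.", the function returns "Foo is a kind of bar in Baz."; passing override="Custom." returns "Custom." instead.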

Issues to be solved
Moving an article requires moving its metadata as well. Existing metadata must also be converted to the new DB table representation, in one of two ways:
 * Conversion script; run while the site is off-line. (Why would this be worth the downtime?)
 * Bot-driven conversion under user control; run on a live site.

Advantages
The wikitext of the article stays cleaner, which brings several potential benefits:
 * Inexperienced users can edit more freely (really? Or will separate metadata add more complications to what they have to learn?)
 * Parsing the text is simpler (how? doesn't this add more complications, since the parser has to combine another set of data with the page text?)
 * Recent changes can be identified by type of change, so users not interested in certain types of metadata change need not look at them. Certain types of change, e.g. to categories, also become much easier to follow.
 * Changes to metadata can be checked consistently before committing (e.g. categories must exist). Of course, this can be done under the current system too.
 * Page description metadata can be used for a multitude of purposes, such as generating HTML heads, did you know or featured article summaries, etc. Special:AllPages could have an option to list not only page name but the brief description. Extension:CategoryGallery could display not only all the images in a category but the image descriptions.
 * All metadata can be kept in a separate cache, which might reduce overall load. (is this true?)
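
Two of the points above, classifying a change by metadata type and checking categories before committing, can be sketched as follows. The dictionary layout (a metadata type mapped to a list of values), the key name "categories" and both function names are assumptions made for illustration.

```python
def changed_types(old, new):
    """List the metadata types whose values differ between two versions."""
    keys = set(old) | set(new)
    return sorted(
        k for k in keys if set(old.get(k, ())) != set(new.get(k, ())))


def missing_categories(new, existing_categories):
    """List categories named in the new metadata that do not exist yet."""
    return sorted(set(new.get("categories", ())) - set(existing_categories))
```

With old = {"categories": ["A"], "interwiki": ["de:Foo"]} and new = {"categories": ["A", "B"], "interwiki": ["de:Foo"]}, changed_types(old, new) reports only "categories", so a recent-changes filter could hide or show the edit on that basis, and missing_categories(new, {"A"}) reports "B" as a category to reject or create before the save.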

Disadvantages and open questions

 * The separation of the current data into 'text data' and 'metadata' is somewhat arbitrary, and thus potentially confusing.
 * Maintaining an overall revision history becomes more complicated, since the article history is spread over separate 'text data' and 'metadata' histories. (See Page_metadata for an explanation of how this would be done.)