Extension:Collection/XML Bridge/MWXHTML

This page describes the structure of the XHTML export in regard to MediaWiki markup and its semantics.

= Naming Conventions =

Where semantic information is available, we use class attributes to annotate the XHTML with it.

To avoid name conflicts, all class names carry the prefix "mwx." (e.g. class="mwx.section").

= Page content and meta-data =


 * The &lt;title&gt; element reflects the prefixed page name (e.g., Extension:XML).
 * &lt;meta&gt; elements provide the language, namespace, version, and page name, plus the redirection target for redirect pages.

Open questions:

 * Example?
 * What about indicating the microformats used in the header?

= Sections =

The XML converter marks sections of pages using &lt;div class="mwx.section" title="Header of the section"&gt;. In addition, each section includes the usual XHTML header element (&lt;h1&gt; to &lt;h6&gt;). A section ends at the next header of the same or a higher level (that is, the same or a smaller level number), or at the end of the containing element. Sections can be nested.
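As a rough illustration of the nesting, the following Python sketch parses a hand-written fragment and lists its sections. The fragment is only an assumption about what the export might look like: the class name mwx.section and the title attribute come from the description above, while the choice of header elements, the surrounding body element, and the missing XHTML namespace declaration are simplifications made for this example.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, NOT actual converter output. Only class="mwx.section"
# and the title attribute follow the description; everything else is assumed.
SAMPLE = """
<body>
  <div class="mwx.section" title="History">
    <h2>History</h2>
    <p>Intro text.</p>
    <div class="mwx.section" title="Early years">
      <h3>Early years</h3>
      <p>More text.</p>
    </div>
  </div>
  <div class="mwx.section" title="References">
    <h2>References</h2>
  </div>
</body>
"""


def section_outline(element, depth=0):
    """Return (depth, title) pairs for every mwx.section div, in document order."""
    outline = []
    for child in element:
        if child.tag == "div" and child.get("class") == "mwx.section":
            outline.append((depth, child.get("title")))
            outline.extend(section_outline(child, depth + 1))
        else:
            # Sections may sit inside other containers, so keep descending.
            outline.extend(section_outline(child, depth))
    return outline


outline = section_outline(ET.fromstring(SAMPLE))
for depth, title in outline:
    print("  " * depth + title)
```

Because nested sections are nested div elements, the outline can be recovered from the element tree alone, without comparing header levels.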

= Links =

In general, we use XHTML &lt;a&gt; elements to mark up all links in pages and use classes (starting with "mwx.link.") to describe their type.

Category and language links are contained in a div element.

Other link types:


 * mwx.link.internal : Links to resources inside the Wiki (internal links)
 * mwx.link.external : Links to resources outside the Wiki (external links)
 * mwx.link.fragment : Links to sections of the same page (intra-page links)
 * mwx.link.interwiki : Links to pages in other wikis or other languages (interwiki links)
 * mwx.link.self : Links to the current page (self-links)
 * mwx.link.note : Links to footnotes (created through &lt;ref&gt; markup in wiki-text)
 * mwx.link.noteref : Backlinks from a footnote to the reference
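A consumer of the export could use these classes to sort links by type. The sketch below is a minimal Python example under stated assumptions: the sample fragment is hand-written, the href values are made up, and each &lt;a&gt; element is assumed to carry exactly one class. Only the class names themselves are taken from the list above.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, NOT actual converter output; href values are invented.
SAMPLE = """
<p>
  See <a class="mwx.link.internal" href="Help:Contents">the help pages</a>,
  the <a class="mwx.link.external" href="http://www.mediawiki.org/">MediaWiki site</a>,
  and <a class="mwx.link.fragment" href="#Links">the Links section</a>.
  <a class="mwx.link.note" href="#note-1">[1]</a>
</p>
"""

PREFIX = "mwx.link."


def link_types(root):
    """Yield (type, href) for every <a> whose class starts with the link prefix."""
    for a in root.iter("a"):
        cls = a.get("class", "")
        if cls.startswith(PREFIX):
            yield cls[len(PREFIX):], a.get("href")


links = list(link_types(ET.fromstring(SAMPLE)))
for link_type, href in links:
    print(link_type, href)
```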

= Templates =

Templates will probably be marked up using the OBJECT tag (not yet decided).

= Images =

Images included in pages are marked up with XHTML IMG- and A-elements:


 * The A-element gets the extra attribute class="mwx.link.image" and links to the page describing the image resource.
 * The src attribute of the IMG-element is set to the URL of the actual image.
 * If the image is framed, thumbnailed or floating, it is embedded in a DIV-element (with class="mwx.image.float|frame|thumb") together with the optional image caption (class="mwx.imagecaption").
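To make the structure concrete, here is a small Python sketch that reads one thumbnailed image. The sample markup is hand-written and partly guessed: the class names follow the list above, but the element used for the caption, the href format of the description page, and the image URL are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, NOT actual converter output. The caption element type,
# the href format and the image URL are assumptions.
SAMPLE = """
<div class="mwx.image.thumb">
  <a class="mwx.link.image" href="Image:Example.png">
    <img src="http://example.org/images/Example.png" alt="Example" />
  </a>
  <div class="mwx.imagecaption">An example image</div>
</div>
"""


def describe_image(container):
    """Collect layout, description page, image URL and caption of one image."""
    link = container.find(".//a[@class='mwx.link.image']")
    img = link.find("img")
    caption = container.find(".//*[@class='mwx.imagecaption']")
    return {
        # "mwx.image.thumb" -> "thumb"
        "layout": container.get("class").rsplit(".", 1)[-1],
        "description_page": link.get("href"),
        "src": img.get("src"),
        # The caption is optional, so it may be missing.
        "caption": caption.text if caption is not None else None,
    }


info = describe_image(ET.fromstring(SAMPLE))
print(info)
```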

= Math =

The content of &lt;math&gt; markup is converted to MathML.

In addition, the unmodified LaTeX source is kept. Open question: is it put in an OBJECT tag or in a data URI?

Details and an example are still missing.

= Images =

Idea: don't resolve image URLs at export time, but set each image URL to a service that redirects to the resolved resource. This will be significantly faster.
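A minimal Python sketch of this idea, assuming a hypothetical redirect service: the base URL and the rule for turning an image name into a path are invented for illustration and are not part of this proposal. The point is only that the exporter can build the URL from the image name alone, without looking up where the file is stored.

```python
from urllib.parse import quote


def redirect_url(service_base, image_name):
    """Build a URL on a (hypothetical) redirect service for an image name.

    No lookup of the real file location happens here; the service is
    expected to answer with an HTTP redirect to the resolved resource.
    """
    # Assumption: spaces become underscores, as in MediaWiki page names.
    path = quote(image_name.replace(" ", "_"))
    return service_base.rstrip("/") + "/" + path


url = redirect_url("http://example.org/imageredirect/", "Example image.png")
print(url)
```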

= Timeline, Hiero and other Extensions =

The content of these extensions is put unmodified into OBJECT elements where possible.

Details and examples are still missing.

Open questions: use data URIs? Implement a timeline server?


 * http://simile.mit.edu/timeline/ : timelines in JavaScript, used in Semantic MediaWiki

= Magic variables =

Magic variables are expanded as in MediaWiki's XHTML output.

Can we mark them?