Extension:Collection/XML Bridge/MWXHTML

From mediawiki.org

This page describes how the XHTML export represents MediaWiki markup and its semantics.

Naming Conventions

We use class attributes to annotate the XHTML with semantic information where available.

To avoid naming conflicts, all class names are prefixed with "mwx." (e.g., class="mwx.section").

Page content and meta-data

  • The <title> element reflects the prefixed page name (e.g., Extension:XML).
  • <meta> elements provide the language, namespace, version, and page name, as well as the redirection target for redirect pages.
  • Open: an example is still needed.
  • Open: how should the header indicate which microformats are used?

Sections

The XML converter marks sections of pages using <div class="mwx.section" title="Header of the section">. In addition, each section includes the usual XHTML header element <h[123456]>. A section ends at the next header of the same or a higher level (that is, with the same or a smaller header number) or at the end of the containing element. Sections can be nested.
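
As a rough illustration of how a consumer might read this structure, the following sketch parses a small fragment and rebuilds the section outline from the nested DIV-elements. The sample fragment, the omission of the XHTML namespace, and the helper names is_section and outline are assumptions made for this example, not part of the converter.

```python
import xml.etree.ElementTree as ET

# Hypothetical export fragment. The XHTML namespace declaration is left out
# to keep the example short; a real export would most likely declare it.
SAMPLE = """
<body>
  <div class="mwx.section" title="History">
    <h2>History</h2>
    <p>Intro text.</p>
    <div class="mwx.section" title="Early years">
      <h3>Early years</h3>
      <p>More text.</p>
    </div>
  </div>
  <div class="mwx.section" title="References">
    <h2>References</h2>
  </div>
</body>
"""


def is_section(elem):
    """Return True if elem is a div carrying the mwx.section class token."""
    return elem.tag == "div" and "mwx.section" in elem.get("class", "").split()


def outline(elem, depth=0):
    """Return (depth, title) pairs for all sections below elem, in document order."""
    result = []
    for child in elem:
        if is_section(child):
            result.append((depth, child.get("title")))
            result.extend(outline(child, depth + 1))
        else:
            # Sections may sit inside other containers, so keep descending.
            result.extend(outline(child, depth))
    return result


sections = outline(ET.fromstring(SAMPLE.strip()))
for depth, title in sections:
    print("  " * depth + title)
```

Because nesting is expressed by the DIV-elements themselves, the outline can be recovered without comparing header levels.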

Links

In general, we use XHTML <a...> elements to mark up all links in pages and use classes (starting with "mwx.link.") to describe their type.

Category and language links are each contained in a div element:


<div class="mwx.categorylinks"> 
 <a href="Category:People" class="mwx.link.category">People</a>
 ...
</div>

<div class="mwx.languagelinks"> 
 <a href="http://de.wikipedia.org/wiki/Mensch" class="mwx.link.interwiki">Mensch</a>
 ...
</div>



Other link types:

  • mwx.link.internal : Links to resources inside the wiki (internal links)
  • mwx.link.external : Links to resources outside the wiki (external links)
  • mwx.link.fragment : Links to sections of the same page (intra-page links)
  • mwx.link.interwiki : Links to pages in other wikis or other languages (interwiki links)
  • mwx.link.self : Links to the current page (self-links)
  • mwx.link.note : Links to footnotes (created through <ref> markup in wiki-text)
  • mwx.link.noteref : Backlinks from a footnote to the reference
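
A consumer could use these classes to sort links by type. The sketch below is only an illustration: the sample fragment is hypothetical (in particular the href values of the internal and fragment links are assumed), the XHTML namespace is omitted, and links_by_type is a helper name invented for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical export fragment; the href formats for internal and fragment
# links are assumptions, the category and language links follow the page.
SAMPLE = """
<body>
  <p>
    <a href="Human" class="mwx.link.internal">Human</a>
    <a href="http://example.org/" class="mwx.link.external">Example</a>
    <a href="#History" class="mwx.link.fragment">History</a>
  </p>
  <div class="mwx.categorylinks">
    <a href="Category:People" class="mwx.link.category">People</a>
  </div>
  <div class="mwx.languagelinks">
    <a href="http://de.wikipedia.org/wiki/Mensch" class="mwx.link.interwiki">Mensch</a>
  </div>
</body>
"""

LINK_PREFIX = "mwx.link."


def links_by_type(root):
    """Group href values by the part of the class name after 'mwx.link.'."""
    groups = defaultdict(list)
    for anchor in root.iter("a"):
        # The class attribute is a space-separated list of tokens.
        for token in anchor.get("class", "").split():
            if token.startswith(LINK_PREFIX):
                groups[token[len(LINK_PREFIX):]].append(anchor.get("href"))
    return dict(groups)


links = links_by_type(ET.fromstring(SAMPLE.strip()))
for link_type in sorted(links):
    print(link_type, links[link_type])
```

Matching on class tokens rather than on the whole attribute value keeps the lookup working if an element ever carries more than one class.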

Templates

Templates will probably be represented using the OBJECT-element.

Images

Images included in pages are marked up with XHTML IMG- and A-elements:

  • The A-element gets the extra attribute class="mwx.link.image" and links to the page describing the image resource.
  • The src attribute of the IMG-element is set to the URL of the actual image.
  • If the image is framed, thumbnailed, or floating, it is embedded in a DIV-element (with class set to one of mwx.image.float, mwx.image.frame, or mwx.image.thumb) together with the optional image caption (class="mwx.imagecaption").
<div class="mwx.image.float">
 <a href="Image:Logo.png" class="mwx.link.image">
  <img src="/resources/images/logo.png"/>
  <span class="mwx.imagecaption">descriptive image caption</span>
 </a>
</div>
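
To show how the pieces of this markup relate, the following sketch reads the example above and collects the description page, the image URL, the layout, and the caption. It is a sketch under assumptions: the XHTML namespace is omitted, and has_class and describe_images are helper names invented for this example.

```python
import xml.etree.ElementTree as ET

# The image example from this page, used unchanged as input.
SAMPLE = """
<div class="mwx.image.float">
 <a href="Image:Logo.png" class="mwx.link.image">
  <img src="/resources/images/logo.png"/>
  <span class="mwx.imagecaption">descriptive image caption</span>
 </a>
</div>
"""

LAYOUT_PREFIX = "mwx.image."


def has_class(elem, name):
    """Return True if name is one of the space-separated class tokens of elem."""
    return name in elem.get("class", "").split()


def describe_images(root):
    """Return one dict per image link found at or below root."""
    # ElementTree has no parent pointers, so build a child-to-parent map
    # to be able to inspect the optional wrapper DIV-element.
    parents = {child: parent for parent in root.iter() for child in parent}
    images = []
    for link in root.iter("a"):
        if not has_class(link, "mwx.link.image"):
            continue
        img = link.find("img")
        caption = next(
            (s for s in link.iter("span") if has_class(s, "mwx.imagecaption")),
            None,
        )
        layout = None
        wrapper = parents.get(link)
        if wrapper is not None and wrapper.tag == "div":
            for token in wrapper.get("class", "").split():
                if token.startswith(LAYOUT_PREFIX):
                    layout = token[len(LAYOUT_PREFIX):]
        images.append({
            "page": link.get("href"),
            "src": img.get("src") if img is not None else None,
            "layout": layout,
            "caption": caption.text if caption is not None else None,
        })
    return images


images = describe_images(ET.fromstring(SAMPLE.strip()))
print(images)
```

For an image that is not framed, thumbnailed, or floating, there is no wrapper DIV-element, so the layout entry stays None.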

Math

The content of <math> markup is converted to MathML.

In addition, the unmodified LaTeX source is kept, either in an OBJECT-element or in a data URI (not yet decided).

Open: details and an example are still needed.

Images

Idea: don't resolve image URLs; instead, set each URL to a service that redirects to the resolved resource. This would be significantly faster.

Timeline, Hiero and other Extensions

Content from these extensions is put unmodified into OBJECT-elements if possible.

Open: details and examples are still needed.

Open: should data URIs be used? Should a timeline server be implemented?

Magic variables

Magic variables are expanded as in MediaWiki's XHTML output (e.g., a page-name variable on this page expands to Extension:Collection/XML Bridge/MWXHTML).

Open: can we mark the expanded values?