Parsoid/MediaWiki DOM spec


This page defines a MediaWiki-specific DOM based on HTML5 and RDFa. The semantics of MediaWiki-specific functionality are encoded using RDFa.

See Parsoid/HTML5 DOM with microdata for the general idea and background. This is a work in progress; feel free to suggest improvements! See http://rdfa.info/ for RDFa documentation and a live parser.

RDFa structures

Global prefix mappings:

  • prefix="mw: http://mediawiki.org/rdf/"
  • Convention: Capital for types, lowercase for attributes.
  • Generally use the prefix instead of vocab definitions to avoid clashes (and allow mixing) with user-supplied RDFa. User-supplied RDFa with the mw prefix is moved to a non-clashing prefix in Parsoid.

mw:Placeholder and general client behavior

A typeof="http://mediawiki.org/rdf/Placeholder" protects DOM structures from any editing. Clients are expected to preserve / protect subtrees marked as such. Clients are also expected to preserve any DOM subtrees marked up with typeof, rel, property in the http://mediawiki.org/rdf/ namespace they don't understand. This decouples clients from Parsoid development, and lets them concentrate on editing constructs whose special semantics they understand without having to implement all possible content elements.
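
A client could decide which subtrees to treat as read-only along the following lines. This is a minimal Python sketch, not Parsoid code: the names `mw_terms`, `is_protected` and the `understood` set are hypothetical, and it assumes the client has already collected each element's attributes into a dict.

```python
MW_URI = "http://mediawiki.org/rdf/"
RDFA_ATTRS = ("typeof", "rel", "property")


def mw_terms(attrs):
    """Yield the mw: terms found in an element's RDFa attributes."""
    for name in RDFA_ATTRS:
        for token in (attrs.get(name) or "").split():
            if token.startswith(MW_URI):
                # Normalise the full URI form to the prefixed form.
                yield "mw:" + token[len(MW_URI):]
            elif token.startswith("mw:"):
                yield token


def is_protected(attrs, understood):
    """Return True if the client should leave this subtree untouched."""
    for term in mw_terms(attrs):
        # Placeholders are always protected; so is anything unknown.
        if term == "mw:Placeholder" or term not in understood:
            return True
    return False
```

For example, a client that only understands wiki links would call `is_protected(attrs, {"mw:WikiLink"})` on each element and skip editing wherever it returns True. User-supplied RDFa outside the mw namespace is ignored by this check.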

Images

Status: Being implemented 19:12, 27 March 2013 (UTC), see bug 46576.

Update: There are some tests being written, but...if we want to test this thoroughly (i.e. with properties and everything) we need to change how we normalize our output, or at least how we normalize for parsoid-only tests. --MarkTraceur (talk) 23:28, 5 April 2013 (UTC)

More update: Images are now almost identically represented as they're spec'd here. These patches are in review:

Thanks for your patience when it comes to this implementation, we'll get 'er merged soon. --MarkTraceur (talk) 21:16, 7 May 2013 (UTC)

In the examples below, the original size of the example image is 1941 × 220 pixels (these are the dimensions of the Foobar.jpg used in parserTests). The width and height in the DOM represent the actual scaled image size (not the bounding box dimensions specified in the wikitext). When image dimensions are modified or images with a non-default size are created, we will serialize to a square bounding box around the given width and/or height attributes.
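
The relationship between a bounding box in the wikitext and the scaled width and height in the DOM can be sketched as follows. This is an illustration only: the function names are hypothetical, and the rounding rule is an assumption chosen because it reproduces the dimensions used in the examples in this section, not a statement of what Parsoid or the PHP parser actually runs.

```python
import math


def round_half_up(value):
    """Round a non-negative number to the nearest integer, halves upwards."""
    return math.floor(value + 0.5)


def scale_height(src_w, src_h, dst_w):
    """Height that keeps the aspect ratio for a given target width."""
    return round_half_up(src_h * dst_w / src_w)


def fit_box_width(src_w, src_h, box_h):
    """Largest width whose scaled height still fits into box_h."""
    ideal = src_w * box_h / src_h
    rounded_up = math.ceil(ideal)
    if round_half_up(rounded_up * src_h / src_w) > box_h:
        return math.floor(ideal)
    return rounded_up


def scaled_size(src_w, src_h, box_w=None, box_h=None):
    """Return (width, height) of the image scaled into the bounding box."""
    if box_w is None and box_h is None:
        return src_w, src_h  # no size given: original dimensions
    height_limited = box_h is not None and (
        box_w is None or box_w * src_h > box_h * src_w
    )
    width = fit_box_width(src_w, src_h, box_h) if height_limited else box_w
    return width, scale_height(src_w, src_h, width)
```

With the 1941 × 220 example image, `scaled_size(1941, 220, box_w=50)` gives the 50 × 6 size of the `50px` example, and `scaled_size(1941, 220, box_w=500, box_h=50)` gives the 442 × 50 size of the `500x50px` examples.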

The basic tree structure of all images, regardless of formatting options, alignment, or thumbnails, is:

<figure or span typeof="mw:Image"> <!-- or mw:Image/Thumb, mw:Image/Frame etc -->
 <a or span><img src="..."></a or span>
 <figcaption (optional)>....</figcaption>
</figure or span>

The outer <figure> and inner <figcaption> elements need to become <span> elements when the figure is rendered inline, since otherwise the HTML5 parser will interrupt a surrounding block context. The inner <a> element needs to become a <span> if there is no link; see bug 44627. A "title" attribute on the <a> and an "alt" attribute on the <img> are present if (and only if) the "title=" and "alt=" options, respectively, are present in the wikitext markup.

Specific examples:

[[Image:Foobar.jpg]] (Note 1)

<span typeof="mw:Image" class="mw-default-size">
 <a href="./File:Foobar.jpg">
  <img resource="./File:Foobar.jpg" src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg"
      width="1941" height="220">
 </a>
</span>

Without a link, we use the same basic DOM structure, but use a span instead of an a wrapper (see bug 44627):
[[Image:Foobar.jpg|link=]] (Note 1)

<span typeof="mw:Image" class="mw-default-size">
 <span>
  <img resource="./File:Foobar.jpg" src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg"
      width="1941" height="220">
 </span>
</span>

Adding 'left' causes the image to be rendered in block context, so the outer <span> becomes a <figure>:
[[Image:Foobar.jpg|left|<p>caption</p>]] (Note 2, Note 5)

<figure typeof="mw:Image" class="mw-default-size">
 <a href="./File:Foobar.jpg">
  <img resource="./File:Foobar.jpg" src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg"
      width="1941" height="220">
 </a>
 <figcaption><p>caption</p></figcaption>
</figure>

Scaling, vertical alignment of an inline image:
[[Image:Foobar.jpg|50px|middle]] (Note 1)

<span typeof="mw:Image" class="mw-valign-middle">
 <a href="./File:Foobar.jpg">
  <img resource="./File:Foobar.jpg" src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg"
      width="50" height="6">
 </a>
</span>

Caption (containing disallowed markup) on an inline image:
[[Image:Foobar.jpg|500x10px|baseline|cap<div></div>tion]] (Note 2, Note 5)

<span typeof="mw:Image" class="mw-valign-baseline"
   data-mw='{"caption":"cap<div></div>tion"}'>
 <a href="./File:Foobar.jpg">
  <img resource="./File:Foobar.jpg" src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg"
      width="89" height="10">
 </a>
</span>

[[Image:Foobar.jpg|50px|border|caption]] (Note 2)

<span typeof="mw:Image" class="mw-image-border"
   data-mw='{"caption":"caption"}'>
 <a href="./File:Foobar.jpg">
  <img resource="./File:Foobar.jpg" src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg"
      width="50" height="6">
 </a>
</span>

[[Image:Foobar.jpg|thumb|left|baseline|caption content]] (Note 3, Note 4)

<figure typeof="mw:Image/Thumb"
 class="mw-halign-left mw-valign-baseline mw-default-size">
   <a href="./File:Foobar.jpg">
     <img src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg" width="180" height="20"
       resource="./Image:Foobar.jpg" />
   </a>
   <figcaption>caption content</figcaption>
</figure>

[[Image:Foobar.jpg|thumb|50x50px|right|middle|caption]] (Note 3)

<figure typeof="mw:Image/Thumb" class="mw-halign-right mw-valign-middle">
   <a href="./File:Foobar.jpg">
     <img src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg" width="50" height="6"
       resource="./Image:Foobar.jpg" />
   </a>
   <figcaption>caption</figcaption>
</figure>

[[Image:Foobar.jpg|frame|caption]]

<figure typeof="mw:Image/Frame" class="mw-default-size">
   <a href="./File:Foobar.jpg">
     <img src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg" width="1941" height="220"
       resource="./Image:Foobar.jpg" />
   </a>
   <figcaption>caption</figcaption>
</figure>

[[Image:Foobar.jpg|500x50px|frame|left|baseline|caption]]

<figure typeof="mw:Image/Frame" class="mw-halign-left mw-valign-baseline">
   <a href="./File:Foobar.jpg">
     <img src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg" width="442" height="50"
       resource="./Image:Foobar.jpg" />
   </a>
   <figcaption>caption</figcaption>
</figure>

[[Image:Foobar.jpg|frameless|500x50px|caption]] (Note 5)

<figure typeof="mw:Image/Frameless">
   <a href="./File:Foobar.jpg">
     <img src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg" width="442" height="50"
       resource="./Image:Foobar.jpg" />
   </a>
   <figcaption>caption</figcaption>
</figure>

Note that "border" can be combined with "frameless".
[[Image:Foobar.jpg|frameless|500x50px|border|caption]] (Note 5)

<figure typeof="mw:Image/Frameless" class="mw-image-border">
   <a href="./File:Foobar.jpg">
     <img src="http://upload.wikimedia.org/wikipedia/commons/3/3a/Foobar.jpg" width="442" height="50"
       resource="./Image:Foobar.jpg" />
   </a>
   <figcaption>caption</figcaption>
</figure>

See the enwiki help for all options, and mw for inline/float details.

Note 1: The PHP parser adds a default alt attribute to the <img> tag, with content "Foobar.jpg". Client-side post-processing will need to add this for compatibility. (Parsoid does not add this attribute because it does not correspond to anything in the wikitext.)

Note 2: In this case the PHP parser adds a title attribute to the <a> and an alt attribute to the <img>, both with the value "caption". Note that this is a markup-stripped version of the supplied caption in some cases. Client-side post-processing will need to add these.

Note 3: The PHP parser adds a <a href="./File:Foo.jpg" class="internal sprite details magnify" title="View photo details"></a> element inside the <figure>. Post-processing can add this if needed by a client.

Note 4: The default thumbnail width is a user-specified preference for the PHP parser. Parsoid uses a fixed 180x180px bounding box. The "mw-default-size" class indicates "no size given" and can be used to resize thumbs according to user preferences.

Note 5: In this example, the caption is not visible in PHP output, so there should be a rule in the default stylesheet like the following (IE7+ and other modern browsers):

figure[typeof~="mw:Image/Frameless"] > figcaption,
figure[typeof~="mw:Image"] > figcaption { display: none }

In the PHP parser output, the caption does appear as a title attribute on the <a> and an alt attribute on the <img>; client side post-processing should add these (unless there are existing title and alt attributes, resulting from "title=" and "alt=" properties in the wikitext).
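
The client-side post-processing described in the notes could look roughly like this for the <figure> case. It is a sketch under assumptions: the function name `add_compat_attributes` is hypothetical, the input is assumed to be well-formed XML so that Python's standard `xml.etree.ElementTree` can read it (a real client would work on its own HTML DOM), and the markup-stripped caption is approximated by the caption's text content.

```python
import xml.etree.ElementTree as ET


def add_compat_attributes(figure):
    """Copy the caption text to title/alt unless they are already set."""
    caption = figure.find("figcaption")
    if caption is None:
        return figure
    # itertext() skips nested tags, giving a markup-stripped caption.
    text = "".join(caption.itertext()).strip()
    link = figure.find("a")
    img = figure.find(".//img")
    if link is not None and "title" not in link.attrib:
        link.set("title", text)
    if img is not None and "alt" not in img.attrib:
        img.set("alt", text)
    return figure


sample = ET.fromstring(
    '<figure typeof="mw:Image/Frameless">'
    '<a href="./File:Foobar.jpg"><img src="Foobar.jpg"/></a>'
    "<figcaption>a <b>bold</b> caption</figcaption>"
    "</figure>"
)
add_compat_attributes(sample)
```

Existing title and alt attributes (from "title=" and "alt=" in the wikitext) are left untouched, matching the rule stated above.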

Semantic info in HTML/RDFa

  • figure classes: mw-valign-{baseline,middle,sub,super,text-top,text-bottom,top,bottom}, mw-halign-{left,right,center,none}, optionally mw-image-border, and mw-default-size (for full-size images and for thumbs scaled to the wiki's and user's default thumb size).
  • figcaption sub-element: the caption.
  • resource attribute on image: link to the image resource page. TODO: what to use for images from commons?
  • width and / or height on image: the scaled image size. Giving only one of width or height is fine and makes client-side scaling easier, since it avoids aspect ratio issues.
  • alt attribute on image: the alt property.
  • src attribute on image: the thumb, governed by an explicit thumb option or implicit from the image.
  • href attribute on the a around the image: the link target, normally just the image page. The a element is absent (replaced by a span) if the link is explicitly empty.

Wiki links

  • The href attribute is UTF-8 (like everything else), with a relative link prefix that always navigates up to the top of the wiki namespace; this matters in subpages and in pages containing slashes in the title. Example: './Foo', or (in a subpage) './../Foo'. We percent-encode percent signs and question marks in hrefs so that links to wiki pages with question marks in their names can be followed. On the way in (when HTML is posted to Parsoid), we assume href values to be URL-encoded and decode them during serialization. For now, modified link hrefs without a ./ or ../ prefix are assumed to be absolute within the wiki namespace; soon they will instead be interpreted as relative to the page, to support relative links in other HTML content. After that change, the equivalent of an absolute wikilink [[Foo]] would need to return an href="/Foo" instead.

[[Main Page|alternate linked content]]

<a rel="mw:WikiLink" href="./Main_Page">alternate linked content</a>


[[Main Page]]

<a rel="mw:WikiLink" href="./Main_Page">Main Page</a>

Link with tail: [[Potato]]es

<a rel="mw:WikiLink" href="./Potato">Potatoes</a>
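
The href conventions for wiki links can be sketched in Python as below. The function names `title_to_href` and `href_to_title` are hypothetical, and two details are assumptions rather than part of this spec: spaces are written as underscores (as in the examples), and underscores are mapped back to spaces when decoding.

```python
from urllib.parse import unquote


def title_to_href(target, current_page="Main_Page"):
    """Build a relative href for a link to `target` from `current_page`."""
    # One "../" per slash in the current title climbs back to the top.
    prefix = "./" + "../" * current_page.count("/")
    encoded = (
        target.replace(" ", "_")
        .replace("%", "%25")  # percents first, so "%3F" is not re-encoded
        .replace("?", "%3F")
    )
    return prefix + encoded


def href_to_title(href):
    """Recover the page title from a relative href posted by a client."""
    path = href
    if path.startswith("./"):
        path = path[2:]
    while path.startswith("../"):
        path = path[3:]
    # Assumption: underscores and spaces are equivalent in titles.
    return unquote(path).replace("_", " ")
```

For instance, `title_to_href("Foo", current_page="Bar/Baz")` yields `./../Foo`, and `href_to_title("./100%25_sure%3F")` yields `100% sure?`.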

Category links

[[Category:Foo]]

<link rel="mw:WikiLink/Category" href="./Category:Foo">

[[Category:Foo|Bar baz#quux]]

<link rel="mw:WikiLink/Category" href="./Category:Foo#Bar baz%23quux">
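
In the example above the sort key travels in the fragment of the href, with a literal "#" written as %23. A small Python sketch of that encoding and its inverse follows; the function names are hypothetical, and also encoding "%" as %25 is an assumption made here so that the round trip stays unambiguous.

```python
from urllib.parse import unquote

CATEGORY_PREFIX = "./Category:"


def category_href(category, sort_key=None):
    """Build the href of a category link, sort key in the fragment."""
    href = CATEGORY_PREFIX + category.replace(" ", "_")
    if sort_key is not None:
        # Assumption: "%" is escaped too, so decoding is reversible.
        href += "#" + sort_key.replace("%", "%25").replace("#", "%23")
    return href


def split_category_href(href):
    """Return (category, sort_key); sort_key is None if absent."""
    path, has_fragment, fragment = href.partition("#")
    if not path.startswith(CATEGORY_PREFIX):
        raise ValueError("not a category href: " + href)
    category = path[len(CATEGORY_PREFIX):]
    return category, (unquote(fragment) if has_fragment else None)
```

`category_href("Foo", "Bar baz#quux")` reproduces the href of the second example, and `split_category_href` recovers `("Foo", "Bar baz#quux")` from it.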

Language links

[[en:Foo]]

<link rel="mw:WikiLink/Language" href="http://en.wikipedia.org/wiki/Foo">

Interwiki non-language links

Status: In development / not yet implemented! See bug 42160.

[[:en:Foo]]

<a rel="mw:WikiLink/Interwiki" href="http://en.wikipedia.org/wiki/Foo">en:Foo</a>

External links

Autolinked URLs

http://example.com

<a rel="mw:ExtLink/URL" href="http://example.com">http://example.com</a>

Numbered external link

[http://example.com]

<a rel="mw:ExtLink/Numbered" href="http://example.com">[1]</a>

Named external link

[http://example.com Link content]

<a rel="mw:ExtLink" href="http://example.com">Link content</a>

Magic links

ISBN link

ISBN 978-1413304541

<a rel="mw:ExtLink/ISBN"
  resource="urn:ISBN:978-1413304541"
   href="./Special:BookSources/9781413304541"
   about="#_ISBN-978-1413304541-1234"
   class="internal mw-magiclink-isbn">
  ISBN 978-1413304541
</a>

RFC link

RFC 1945

<a rel="mw:ExtLink/RFC" 
   href="http://tools.ietf.org/html/rfc1945"
   resource="urn:ietf:rfc:1945"
   about="#_RFC-1945-1234"
   class="external mw-magiclink-rfc">
  RFC 1945
</a>

PMID link

PMID 20610307

<a rel="mw:ExtLink/PMID" 
   resource="http://purl.org/commons/html/pmid/20610307"
   href="//www.ncbi.nlm.nih.gov/pubmed/20610307?dopt=Abstract"
   about="#_:PMID-20610307-1234"
   class="external mw-magiclink-pmid">
  PMID 20610307
</a>
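
The three magic link examples follow a common pattern, which the following Python sketch derives from the link text. The function name `magic_link_attrs` is hypothetical, the accepted input syntax is a simplification, and the `about` attribute is left out because its numeric suffix is generated elsewhere.

```python
import re

MAGIC_LINK = re.compile(r"(ISBN|RFC|PMID) ([0-9][0-9Xx-]*)")


def magic_link_attrs(text):
    """Return the <a> attributes for a magic link, or None if not one."""
    match = MAGIC_LINK.fullmatch(text)
    if match is None:
        return None
    kind, ident = match.groups()
    if kind == "ISBN":
        return {
            "rel": "mw:ExtLink/ISBN",
            "resource": "urn:ISBN:" + ident,
            # The book sources page takes the ISBN without hyphens.
            "href": "./Special:BookSources/" + ident.replace("-", ""),
            "class": "internal mw-magiclink-isbn",
        }
    if not ident.isdigit():
        return None  # RFC and PMID numbers are plain digits
    if kind == "RFC":
        return {
            "rel": "mw:ExtLink/RFC",
            "resource": "urn:ietf:rfc:" + ident,
            "href": "http://tools.ietf.org/html/rfc" + ident,
            "class": "external mw-magiclink-rfc",
        }
    return {
        "rel": "mw:ExtLink/PMID",
        "resource": "http://purl.org/commons/html/pmid/" + ident,
        "href": "//www.ncbi.nlm.nih.gov/pubmed/" + ident + "?dopt=Abstract",
        "class": "external mw-magiclink-pmid",
    }
```

Calling `magic_link_attrs("RFC 1945")` returns the rel, resource, href and class values shown in the RFC example.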


Nowiki blocks

There are two options to handle nowiki editing:

  1. Strip the tags from the DOM and let the serializer add those that are needed after each edit
  2. Keep them in the DOM for more accurate round-tripping of manually created nowiki blocks, and prevent non-text content from being entered into these blocks in the editor (TODO)

We picked option 2 for now. The nowiki content remains editable. If the content is modified in a way that makes nowiki unnecessary, Parsoid can remove the wrapper in the serializer.

<nowiki>[[foo]]</nowiki>

<span typeof="mw:Nowiki">[[foo]]</span>

HTML entities

œ

<span typeof="mw:Entity">œ</span>

Behavior switches

See Help:Magic_words#Behavior_switches. Not yet implemented, tracked in bugzilla:37909.

__NOTOC__

<meta property="mw:PageProp/notoc">

__FORCETOC__

<meta property="mw:PageProp/forcetoc">

__NEWSECTIONLINK__

<meta property="mw:PageProp/newsectionlink">

__NONEWSECTIONLINK__

<meta property="mw:PageProp/nonewsectionlink">

__NOGALLERY__

<meta property="mw:PageProp/nogallery">

__HIDDENCAT__

<meta property="mw:PageProp/hiddencat">

__NOCONTENTCONVERT__

<meta property="mw:PageProp/nocontentconvert">

__NOCC__

<meta property="mw:PageProp/nocc">

__NOTITLECONVERT__

<meta property="mw:PageProp/notitleconvert">

__NOTC__

<meta property="mw:PageProp/notc">

__NOEDITSECTION__

<meta property="mw:PageProp/noeditsection">

__NOINDEX__

<meta property="mw:PageProp/noindex">

__INDEX__

<meta property="mw:PageProp/index">

__STATICREDIRECT__

<meta property="mw:PageProp/staticredirect">
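
Since this mapping is not yet implemented, the following Python sketch only illustrates the intended shape of the output. The function name and the `KNOWN_SWITCHES` set are hypothetical; the set covers just a subset of the switches listed above, namely ones whose property name is the lowercased switch name.

```python
import re

# Illustrative subset of the switches listed in this section.
KNOWN_SWITCHES = {
    "NOTOC", "FORCETOC", "NEWSECTIONLINK", "NONEWSECTIONLINK",
    "NOGALLERY", "HIDDENCAT", "NOEDITSECTION", "NOINDEX", "INDEX",
    "STATICREDIRECT",
}

SWITCH = re.compile(r"__([A-Z]+)__")


def behavior_switch_metas(wikitext):
    """Return one <meta> tag per known behavior switch, in source order."""
    return [
        '<meta property="mw:PageProp/%s">' % name.lower()
        for name in SWITCH.findall(wikitext)
        if name in KNOWN_SWITCHES
    ]
```

Unknown double-underscore words are ignored rather than turned into meta tags.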

Category default sort key

See bug 46470. Status: ready for implementation.

{{DEFAULTSORT:foo}}

<meta property="mw:PageProp/categorydefaultsort" content="foo">

Redirects

Status: In discussion, see bug 45808.

#REDIRECT [[foo]]

<link rel="mw:PageProp/redirect" href="./Foo">

#REDIRECT [[:Category:Foo]]

<link rel="mw:PageProp/redirect" href="./Category:Foo">

#REDIRECT [[Category:Foo]]

<link rel="mw:PageProp/redirect" href="./Category:Foo">
<link rel="mw:WikiLink/Category" href="./Category:Foo">

Template content

Status: Ready for implementation, see bug 44555.

Many parameters contain arbitrary wikitext, styles, template names and other non-semantic / DOM strings. We also have very little information about which attributes are semantic and which are presentational. For now, we will thus expose all attributes in a simple JSON attribute:

{{foo|unused value|key=used value}}

<body prefix="mw: http://mediawiki.org/rdf/
      mwns10: http://en.wikipedia.org/wiki/Template%3A">
 
<span typeof="mw:Object/Template" about="#mw-t1" id="mw-t1"
  data-mw='{"target":{"wt":"foo"},"params":{"1":{"wt":"unused value"},"key":{"wt":"used value"}}}'>
  Some text content
</span>
<table about="#mw-t1">
  <tr>
    <td>used value</td>
  </tr>
</table>
</body>

The (optional) id property will let us associate inline parameters with the JSON data later. This lets us support inline editing of things like infobox parameters in the future without changes to the JSON data structure.
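
A client that edits template parameters has to turn the data-mw JSON back into a transclusion at some point. The Python sketch below shows one way to do that for the simple structure above; the function name `template_wikitext` is hypothetical, and it assumes positional parameters are numbered consecutively from 1 and that their values contain no "=".

```python
import json


def template_wikitext(data_mw):
    """Serialize a template's data-mw JSON back to {{...}} wikitext."""
    data = json.loads(data_mw)
    pieces = [data["target"]["wt"]]
    for name, value in data.get("params", {}).items():
        if name.isdigit():
            # Positional parameter: emitted without its number.
            pieces.append(value["wt"])
        else:
            pieces.append(name + "=" + value["wt"])
    return "{{" + "|".join(pieces) + "}}"
```

Applied to the data-mw value of the example, this returns `{{foo|unused value|key=used value}}`.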

Editing compound content blocks that include output from several templates like this football table would benefit from access to interspersed content from the surrounding page. We will implement this by interspersing wikitext strings with template information in the data-mw.parts array:

{{table-start}}
{{cell|unused value|key=used value}}
|-
{{cell|unused value|key=used value}}
|-
|<math>1+1</math>
|}

<span typeof="mw:Object/Extension/Sanitize" about="#mw-t1" id="mw-t1"
  data-mw='{"parts":
[
  {"template":{"target":{"wt":"Template:Table-start"}}},
  "\n",
  {"template":{"target":{"wt":"Template:Cell"},"params":{"1":{"wt":"unused value"},"key":{"wt":"used value"}}}},
  "\n|-\n",
  {"template":{"target":{"wt":"Template:Cell"},"params":{"1":{"wt":"unused value"},"key":{"wt":"used value"}}}},
  "\n|-\n|",
  {"extension":{"name":"math","body":{"extsrc":"1+1","mathml":"<maybelater/>"}}},
  "|}"
]}'>
  Some text content
</span>
<table about="#mw-t1">
  <tr>
    <td>used value</td>
  </tr>
</table>
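
Reading the data-mw.parts array back into wikitext amounts to concatenating the string parts with the serialized template and extension parts. The Python sketch below illustrates this; all function names are hypothetical, only the `extsrc` body format is handled, and the template target is emitted exactly as stored.

```python
import json


def _template(tpl):
    """Serialize one template part to {{...}} wikitext."""
    pieces = [tpl["target"]["wt"]]
    for name, value in tpl.get("params", {}).items():
        pieces.append(value["wt"] if name.isdigit() else name + "=" + value["wt"])
    return "{{" + "|".join(pieces) + "}}"


def _extension(ext):
    """Serialize one extension part using its raw extsrc body."""
    return "<%s>%s</%s>" % (ext["name"], ext["body"]["extsrc"], ext["name"])


def parts_wikitext(data_mw):
    """Concatenate the parts of a compound block back into wikitext."""
    out = []
    for part in json.loads(data_mw)["parts"]:
        if isinstance(part, str):
            out.append(part)  # interspersed wikitext, kept verbatim
        elif "template" in part:
            out.append(_template(part["template"]))
        elif "extension" in part:
            out.append(_extension(part["extension"]))
        else:
            raise ValueError("unknown part: %r" % (part,))
    return "".join(out)
```

Because the interspersed strings are copied through unchanged, this matches the read-only treatment of interspersed wikitext: a client edits only the template and extension parts.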


Editing support for the interspersed wikitext is difficult to implement on the server side, as those wikitext edits need to be restricted in their effect to the original DOM range. A potential solution to this could be to wrap the multi-template compound block into a template hook that expands its content to a well-balanced DOM structure. Arbitrary wikitext edits within this tag would still only affect the original DOM range, both in Parsoid and the PHP parser. This is lower priority though, so for now the interspersed wikitext will be read-only.

Parameter Substitution at the top-level

This section specifies wrapping for parameter uses in the top-level namespace where all parameter substitutions evaluate to a null value.

{{{foo|''some italic'' plain text '''some bold'''}}}

<body prefix="mw: http://mediawiki.org/rdf/ mwns10: http://en.wikipedia.org/wiki/">
 
<p typeof="mw:Object/Param" property="mwns10:#foo" about="#mwt0">
  <i>some italic</i> plain text <b>some bold</b>
</p>

Templates in attributes

[[Fo{{Echo|o}}|Some text content]]

<body prefix="mw: http://mediawiki.org/rdf/
      mwt0: http://en.wikipedia.org/wiki/Template%3A">
 
<a href="./Foo" typeof="mw:ExpandedAttrs/Template" 
  about="#mw-t1" id="mw-t1">
  Some text content
</a>
<meta about="#mw-t1" property="mw:objectAttrKey#href" content="Fo&lt;span typeof=&quot;mw:Object/Template&quot; about=..&gt;&lt;meta...">
</body>

<div style="{{echo|color:red;}}">...</div>

<body prefix="mw: http://mediawiki.org/rdf/
      mwt0: http://en.wikipedia.org/wiki/Template%3A">
 
<div style="color:red;" typeof="mw:ExpandedAttrs/Template" about="#mw-t1" id="mw-t1">
...
</div>
<meta about="#mw-t1" property="mw:objectAttrVal#style" content="...">
</body>


The exact content of the attribute content for editing purposes could be serialized HTML DOM. Alternatively we could include that directly as a sub-dom in a div-wrapped section at the start or end of the document.

Extension content

Status: In discussion.

<ref>{{Cite|foo|bar=baz}}</ref>

<span id="cite_ref-0-0" class="reference" about="#mwt1" typeof="mw:Object/Extension/Ref"
  data-mw='{"name":"ref",
            "body":{
               "html":"&lt;span typeof=\"mw:Object/Template\" about=\"#mw-t2\" id=\"mw-t2\"
                           data-mw=&apos;{
                               \"target\": {\"wt\":\"Cite\"},
                               \"params\": {\"1\":{\"wt\":\"foo\"},\"bar\":{\"wt\":\"baz\"}},
                               \"id\": \"t1234\"
                           }&apos; &gt;
                       The citation content
                       &lt;/span&gt;"
                }
           }'>
  <a data-type="hashlink" href="#cite_note-0">[1]</a>
</span>

<math>1+1</math>

<span about="#mwt1" typeof="mw:Object/Extension/Math"
  data-mw='{"extsrc":"1+1"}'>
  1 + 1
</span>

The data-mw attribute is a JSON object. It is meant as an extensible public interface, so more top-level members can be added. The top-level structure depends on the content type, with the main types being templates and extensions. See also the template content section.

The following formats are valid:

  • wt: raw wikitext, currently provided practically everywhere.
  • extsrc: raw extension body text, used as a fallback when no more specialized parser is available/known.
  • html: Parsoid-format HTML (editable using a sub-instance of VE).

In the future, more than one format might be present to provide alternate representations of the content. For example, if there is an experimental editor for MathML, the <math> extension might have both mathml and extsrc formats in the data-mw attribute. Brave users can use the new editor on the mathml content; other users will continue to use the raw-text editor on the extsrc content.
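
A client faced with several formats could pick the first one it supports, in its own order of preference. The Python sketch below shows the idea; the function name `pick_format` and the shape of the `body` dict (format name mapped to content) are assumptions for illustration.

```python
def pick_format(body, supported):
    """Return (format, content) for the first supported format present.

    `supported` lists the formats the client can edit, most preferred
    first. Returns None when the body offers none of them.
    """
    for fmt in supported:
        if body.get(fmt) is not None:
            return fmt, body[fmt]
    return None
```

A client with an experimental MathML editor would call `pick_format(body, ["mathml", "extsrc"])`, while a plain-text client would pass `["extsrc"]` and keep using the raw source.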

Ref and References

Status: In discussion.

First one <ref>One</ref>
Second one <ref>Two <p>p1</p> <p>p2</p> </ref>
Named one <ref name='three'>Three</ref>
Reused <ref name='three' />
Reused again <ref name='three' />

<references />

Here is one possible way to mark up refs and references (sample wikitext above, sample HTML below). I have left the id and href naming conventions identical to how we currently generate them (which in turn derives from the PHP implementation). Reference content containing block content is wrapped in a div, and inline-only content is wrapped in a span.

<p>
First one 
<span id="cite_ref-1-0" about="#mwt2"
    typeof="mw:Object/Ext/Ref" 
    data-mw='{"body":{"html":"One"}}'><a
        href="#cite_note-1" rel="dc:references">[1]</a>
</span>
Second one 
<span id="cite_ref-2-0" about="#mwt4" typeof="mw:Object/Ext/Ref"
    data-mw='{"body":{"html":"Two <p>p1</p> <p>p2</p> "}}'><a
        href="#cite_note-2" rel="dc:references">[2]</a>
</span>
Named one 
<span id="cite_ref-three-3-0" about="#mwt6"
    typeof="mw:Object/Ext/Ref"
    data-mw='{"body":{"html":"Three"},"name":"three"}'><a 
        href="#cite_note-three-3" rel="dc:references">[3]</a>
</span>
Reused 
<span id="cite_ref-three-3-1" about="#mwt8"
    typeof="mw:Object/Ext/Ref"
    data-mw='{"body":null,"name":"three"}'><a 
        href="#cite_note-three-3" rel="dc:references">[3]</a>
</span>
Reused again 
<span id="cite_ref-three-3-2" about="#mwt10"
    typeof="mw:Object/Ext/Ref" 
    data-mw='{"body":null,"name":"three"}'><a 
        href="#cite_note-three-3" rel="dc:references">[3]</a>
</span>
</p>
 
<ol about="#mwt11" typeof="mw:Object/Ext/References">
    <li about="#cite_note-1" id="cite_note-1">
        <span rel="mw:referencedBy">
            <a href="#cite_ref-1-0"></a>
        </span>
        <span>One</span>
    </li>
    <li about="#cite_note-2" id="cite_note-2">
        <span rel="mw:referencedBy">
            <a href="#cite_ref-2-0"></a>
        </span>
        <div>Two <p>p1</p> <p>p2</p> </div>
    </li>
    <li about="#cite_note-three-3" id="cite_note-three-3">
        <span rel="mw:referencedBy"><a href="#cite_ref-three-3-0">3.0</a>
            <a href="#cite_ref-three-3-1">3.1</a>
            <a href="#cite_ref-three-3-2">3.2</a>
        </span>
        <span>Three</span>
    </li>
</ol>
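
The id and href naming used in this example can be reproduced with a small counter, sketched here in Python. The class name `RefIndex` is hypothetical, and the scheme is inferred from the sample HTML above rather than taken from the Cite implementation.

```python
class RefIndex:
    """Hand out cite_ref / cite_note ids in document order."""

    def __init__(self):
        self._count = 0
        self._named = {}  # name -> (note number, uses so far)

    def add(self, name=None):
        """Return (ref_id, note_id) for the next <ref> encountered."""
        if name is not None and name in self._named:
            # Reuse of a named ref: same note, next use index.
            number, uses = self._named[name]
        else:
            self._count += 1
            number, uses = self._count, 0
        if name is not None:
            self._named[name] = (number, uses + 1)
        key = "%s-%d" % (name, number) if name is not None else str(number)
        return "cite_ref-%s-%d" % (key, uses), "cite_note-" + key
```

Feeding it the five refs of the sample wikitext in order yields `cite_ref-1-0`, `cite_ref-2-0`, `cite_ref-three-3-0`, `cite_ref-three-3-1` and `cite_ref-three-3-2`, with the three named uses all pointing at `cite_note-three-3`.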

This results in an RDF graph like this (courtesy of http://rdfa.info/play/): [Figure: Citation rdfa.png]

noinclude / includeonly / onlyinclude

Not yet implemented, tracked in bugzilla:40305. We only care about these in the actual page context, not in transcluded pages / templates.

foo<noinclude>bar</noinclude>baz

<body prefix="mw: http://mediawiki.org/rdf/
      mwt0: http://en.wikipedia.org/wiki/Template%3A"> 
<p>foo<meta typeof="mw:NoInclude">bar<meta typeof="mw:NoInclude/End">baz</p>
</body>

foo<onlyinclude>bar</onlyinclude>baz

<body prefix="mw: http://mediawiki.org/rdf/
      mwt0: http://en.wikipedia.org/wiki/Template%3A"> 
<p>foo<meta typeof="mw:OnlyInclude">bar<meta typeof="mw:OnlyInclude/End">baz</p>
</body>


foo<includeonly>bar</includeonly>baz

<body prefix="mw: http://mediawiki.org/rdf/
      mwt0: http://en.wikipedia.org/wiki/Template%3A"> 
<p>foo<meta typeof="mw:IncludeOnly">baz</p>
</body>
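
The three mappings above can be summarized in a few lines of Python. This is only an illustration of the intended marker placement: the function name is hypothetical, and plain string substitution stands in for the real tokenizer, so nesting and malformed tags are not handled.

```python
import re

INCLUDEONLY = re.compile(r"<includeonly>.*?</includeonly>", re.DOTALL)
PAIRED_TAGS = (("noinclude", "mw:NoInclude"), ("onlyinclude", "mw:OnlyInclude"))


def mark_include_sections(wikitext):
    """Replace include-control tags with <meta> markers (page context)."""
    # <includeonly> content is dropped; a single marker remains.
    text = INCLUDEONLY.sub('<meta typeof="mw:IncludeOnly">', wikitext)
    for tag, typeof in PAIRED_TAGS:
        text = text.replace("<%s>" % tag, '<meta typeof="%s">' % typeof)
        text = text.replace("</%s>" % tag, '<meta typeof="%s/End">' % typeof)
    return text
```

For `foo<includeonly>bar</includeonly>baz` this produces `foo<meta typeof="mw:IncludeOnly">baz`, as in the last example.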

Language conversion blocks

See bug 41716. Status: provisional / strawman.

foo-{bar baz}- quux

<p>
  foo<meta typeof="mw:LanguageConvert">bar baz<meta typeof="mw:LanguageConvert/End"> quux
</p>

Meta tags can handle unbalanced conversion blocks, which are supported in the PHP parser. The downside is that moving content around won't preserve the language conversion block in the visual editor. A more robust alternative would be to use an attribute-based mechanism somewhat similar to template encapsulation. This would preserve the conversion property when part of the content is moved around the page. The general problem is very similar to noinclude and onlyinclude sections, so we should probably find a shared solution.

TODO

The following constructs still need an RDFa markup definition. They will initially only be marked with typeof="mw:Placeholder" for simple read-only round-tripping.

  • template parameter references (implemented, but not tested much)
  • Spec versioning: Add an attribute on the body element or in the head that spells out the DOM spec revision. This allows us to evolve the DOM spec while still correctly reading older HTML revisions. We can also convert them to the latest version on the fly.
  • __TOC__
  • <section> extension as HTML5 sections (see bug 47936).