Requests for comment/Update our code to use RDFa 1.1 instead of RDFa 1.0

From mediawiki.org
Request for comment (RFC)
Update our code to use RDFa 1.1 instead of RDFa 1.0
Component General
Creation date
Author(s) Daniel Friesen
Document status stalled
See Phabricator.

Currently, MediaWiki's RDFa support appears to be based on RDFa 1.0, more specifically on documents written back in 2008. RDFa 1.1 has been a W3C Recommendation for almost a year now (the Recommendation was published in June 2012) and existed long before then (the earliest working drafts date back to 2010).

Wikimedia wikis don't even have RDFa enabled. VisualEditor uses the RDFa output by Parsoid. The full source is not visible in the editing surface, but the Parsoid syntax is based on RDFa 1.1, not RDFa 1.0 (it uses prefix="" internally, which is part of RDFa 1.1). Some of the fundamental features of RDFa are not even supported by our current code (see the extra notes).

Unless someone steps forward to say that they have already enabled MediaWiki's RDFa support on their wiki and depend heavily on RDFa 1.0 features that a switch to RDFa 1.1 would break, or someone presents evidence that the major RDFa consumers we are trying to cater to support only RDFa 1.0 and not RDFa 1.1, it would probably be best to update our code to be based on RDFa 1.1 instead of RDFa 1.0.

  • Defining prefixes using xmlns:foo="http://example.org/" is deprecated. It is recommended to use prefix="foo: http://example.org/" instead.
  • Attributes like datatype="" that only accepted CURIEs in RDFa 1.0 now also accept an IRI.
  • datatype, property, typeof, rel, and rev now support the notion of a term. You define a vocabulary with vocab="http://example.org/", and then terms, which are plain strings such as "foo" that are neither CURIEs nor IRIs and contain no ":", are expanded to "http://example.org/foo".
  • inlist="" can be used to define rdf:first/rdf:rest style ordered lists in RDFa markup.
  • Definition of version="" has been removed from RDFa Core 1.1.
    • XHTML+RDFa 1.1 defines RDFa 1.1 inside XHTML 1.1. It states 'There may be a @version attribute on the html element with the value "XHTML+RDFa 1.1".'
    • HTML+RDFa 1.1 defines RDFa 1.1 inside HTML 4, HTML 5, and XHTML5. It states:
      • "XML mode XHTML5+RDFa 1.1 documents [...] must not use a DOCTYPE declaration for XHTML+RDFa 1.0 or XHTML+RDFa 1.1, and should not use the @version attribute." meaning:
        • A special DOCTYPE like the ones defined in RDFa 1.0 must not be used.
        • XHTML5 documents aren't supposed to use a version="" attribute unless there is a valid reason to do so under special circumstances, with a full understanding of the implications.
      • "The @version attribute is not supported in HTML5 and is non-conforming."
    • Overall the idea seems to be that at least for HTML5 you shouldn't be defining an RDFa version="" anymore. And even in XHTML 1.1 it's not something you need anymore.
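
The term, CURIE, and IRI handling described in the list above can be sketched roughly as follows. This is a simplified illustration in Python, not the full RDFa 1.1 Core algorithm and not MediaWiki code: the function name expand_value is hypothetical, and the sketch ignores default terms, safe CURIEs, and blank nodes.

```python
from typing import Dict, Optional


def expand_value(value: str, vocab: Optional[str], prefixes: Dict[str, str]) -> Optional[str]:
    """Expand one value from property/typeof/rel/rev/datatype to a full IRI.

    Hypothetical helper: a simplified reading of the RDFa 1.1 rules.
    """
    if ":" not in value:
        # A plain term: only meaningful when a vocab="" is in scope.
        return vocab + value if vocab else None
    prefix, _, reference = value.partition(":")
    if prefix in prefixes:
        # A CURIE whose prefix was declared with prefix="" (or xmlns:).
        return prefixes[prefix] + reference
    # Otherwise assume the value is already an absolute IRI.
    return value


prefixes = {"dc": "http://purl.org/dc/terms/"}
term = expand_value("foo", "http://example.org/", prefixes)
curie = expand_value("dc:title", None, prefixes)
iri = expand_value("http://example.org/bar", None, prefixes)
```

Here "foo" resolves against the vocabulary, "dc:title" resolves through the declared prefix, and the last value passes through unchanged because "http" is not a declared prefix.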

Changes we would need to make

  • Our code still uses xmlns for RDFa prefixes. We would first need to whitelist the prefix attribute. After that we should consider deprecating xmlns, or removing it entirely if no one is seriously using RDFa with it.
    • If someone in the future comes forward with a desire to output RDFa 1.0 and a strong reason to do so, we should probably just add a switch that outputs xmlns based on prefix="" instead of adding xmlns back to WikiText.
  • The vocab and inlist attributes would need to be whitelisted to support terms and ordered lists.
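
As a rough illustration of the fallback switch mentioned above, the following Python sketch derives RDFa 1.0 style xmlns: attributes from an RDFa 1.1 prefix="" value. The function names parse_prefix_attr and xmlns_attributes are hypothetical and do not exist in MediaWiki; the parsing assumes well-formed "name: IRI" pairs separated by whitespace.

```python
from typing import Dict


def parse_prefix_attr(value: str) -> Dict[str, str]:
    """Parse a prefix="" value such as 'foo: http://example.org/' into a mapping."""
    tokens = value.split()
    mappings: Dict[str, str] = {}
    # Tokens alternate between 'name:' and the IRI it maps to.
    for name, iri in zip(tokens[0::2], tokens[1::2]):
        if name.endswith(":") and len(name) > 1:
            mappings[name[:-1]] = iri
    return mappings


def xmlns_attributes(prefix_value: str) -> Dict[str, str]:
    """Build the equivalent xmlns:* attributes for RDFa 1.0 output."""
    return {"xmlns:" + name: iri for name, iri in parse_prefix_attr(prefix_value).items()}


attrs = xmlns_attributes("foo: http://example.org/ dc: http://purl.org/dc/terms/")
```

With this approach WikiText would only ever need to accept prefix="", and any RDFa 1.0 output would be generated from it.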

Extra notes

Most of these RDFa support issues apply to support of both RDFa 1.0 and RDFa 1.1.

  • We do not currently support the content="" attribute which RDFa permits on any element. This is an important part of RDFa we should try to support by whitelisting the attribute. Without content="" you are forced to use the human readable textual contents of the element as the value.
  • We do not currently support the RDFa rel and rev attributes.
    • rel="" is a fundamental feature that introduces chaining. While property can be used on individual elements instead of rel, rel="" can be used on a container so that its child nodes can define multiple relations under a single about="". This is much less verbose than using property="".
    • rev="" does the inverse of rel="" and without it some relations can be really hard to express non-verbosely.
    • There are special cases where rev="" (and rel="") has to be used on the same element on which the relation is defined (where you'd usually be able to use property=""), but these usually involve markup like <img> that we don't support anyway, so we don't have to worry about supporting them in WikiText for now.
    • The primary use case for rel="" and rev="" is chaining child elements. Given that, we can probably avoid security considerations by only allowing rel and rev on plain elements that wouldn't usually have them, e.g. <div>, <span>, <blockquote>, <strong>, etc.
    • On second thought, the only thing we really need to do for security is to reject simple terms in rel="" (and perhaps rev="") so that values such as rel="stylesheet" and the other rels defined in HTML can't be used.
  • Like Microdata, RDFa seems to support <meta> and <link> but we only support that for Microdata. Our RDFa support should be expanded to support these elements.
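
The rule suggested above for rel="" (reject simple terms, keep CURIEs and IRIs) could look something like the following Python sketch. It is only an illustration of the idea, not MediaWiki's sanitizer: the function name filter_rel_value is hypothetical, and a value is treated as a simple term whenever it contains no ":".

```python
def filter_rel_value(value: str) -> str:
    """Drop plain terms such as 'stylesheet' from a rel="" or rev="" value.

    Hypothetical helper: only tokens containing ':' (CURIEs or IRIs) are kept,
    so link types defined by HTML itself can never pass through.
    """
    kept = [token for token in value.split() if ":" in token]
    return " ".join(kept)


mixed = filter_rel_value("stylesheet dc:creator http://example.org/knows")
plain = filter_rel_value("nofollow")
```

A real implementation would likely also want to validate that the part before the ":" is a declared prefix or an allowed IRI scheme, which this sketch does not attempt.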

Links