Talk:HTML5

Text Editor
I always write all my articles in WordPad, then convert the syntax to MediaWiki markup and copy and paste it in. I know that with HTML5, standard word-processor functionality could be implemented in MediaWiki. It would also be great to have an easier way of adding pictures. I use a lot of pictures, but I still have to keep looking up the syntax. The editor is one of the most important features of any wiki. To be frank with you, with all due respect, the MediaWiki editor sucks.

Keeping this up to date
Just a little thing which I think would help with this page. We should create a table of all the goals we have relating to HTML5, filling it in as we go along. This would help people know which goals have yet to be completed in making MediaWiki HTML5 compliant. Would this be a welcome addition?

Puremation 13:29, 24 March 2011 (UTC)
 * It's a wiki. Be bold.  If someone dislikes your change, they'll fix it or revert it. —Simetrical (talk • contribs) 01:13, 25 March 2011 (UTC)

A configuration setting?
It would be nice if we had something like a $wgUseHTML5, so other wiki operators can also make the decision easily.--Jasper Deng (talk) 01:23, 8 April 2012 (UTC)
 * We have $wgHtml5, is that what you're after? Cheers, Grunny  ( talk ) 05:10, 8 April 2012 (UTC)
 * Yeah, that was what I was after. I'm going to add mention of this to the page.--Jasper Deng (talk) 05:10, 8 April 2012 (UTC)

Data-* in MediaWiki 1.16
Is it possible to enable data-* attributes in 1.16?

I tried adding this to validateAttributes in sanitizer.php

I also tested adding 'data-anything' to the common whitelist but no attributes with hyphens work. Is there something else I need to do to keep attributes with hyphens from getting stripped out?

—The preceding unsigned comment was added by JuLara (talk • contribs) 18:17, 31 August 2012 (UTC)

Named entities
I'd suggest continuing to allow the use of the legacy HTML4 named entities in wikitext, as things like &amp;nbsp; are just too convenient to lose -- but of course they should be converted to the appropriate numeric entities on conversion to HTML, in order to make the page itself valid HTML5. -- The Anome (talk) 15:48, 7 September 2012 (UTC)


 * MediaWiki currently parses all the 253 named character entities in HTML4/XHTML1 (including the non-HTML4 &amp;apos;), outputting them as named entities (for < and > output as &amp;lt; and &amp;gt;), decimal entities (for &amp;nbsp;, output as &amp;#160;) or UTF-8 literals – see Sanitizer.php for the named entity list.
 * Surely this will continue unaffected by activating HTML5 mode? (Otherwise, the disruption would be immense.)
 * However, the parser does not recognise the 1,978 entities first added with HTML5 (including 106 legacy-compatibility entities without a trailing semicolon or with variant capitalization), so the ampersand gets escaped and the named entities appear as in the wikitext.
 * For example: the HTML4 entity for U+00C0 (#192, À, ) is rendered &Agrave; (source:  ) but the new HTML5-legacy-compat entities for U+003E (  without a semicolon, and uppercase   and  ) are escaped to render literally (as &gt, &GT;, &GT) and the new HTML5 entity for U+0102 (#258, Ă)   is rendered &Abreve;.
 * {| class="wikitable"
! Codepoint !! UTF-8 !! Decimal !! Hex !! Output !! Named
|-
| U+0022 || " || &amp;#34; &#34; || &amp;#x22; &#x22; || " || &amp;quot; &quot;
|-
| U+0027 || ' || &amp;#39; &#39; || &amp;#x27; &#x27; || ' || &amp;apos; &apos;
|-
| U+003C || < || &amp;#60; &#60; || &amp;#x3c; &#x3c; || &amp;lt; || &amp;lt; &lt;
|-
| U+003E || > || &amp;#62; &#62; || &amp;#x3e; &#x3e; || &amp;gt; || &amp;gt; &gt;
|-
| U+00A0 ||   || &amp;#160; &#160; || &amp;#xa0; &#xa0; || &amp;#160; || &amp;nbsp;
|-
| U+00C0 || À || &amp;#192; &#192; || &amp;#xc0; &#xc0; || À || &amp;Agrave; &Agrave;
|-
| U+0102 || Ă || &amp;#258; &#258; || &amp;#x102; &#x102; || Ă || &amp;Abreve; &Abreve;
|}
 * I suspect that the omission is an intentional encouragement to use UTF-8 literals in wikitext instead of more opaque encodings.
 * — Richardguk (talk) 22:03, 7 September 2012 (UTC)


 * I think the HTML4 entities are plenty: it's really only very common non-ASCII things like nbsp, &mdash;, &trade;, &copy;, &bull;, &dagger; that people really care about. Particularly nbsp. -- The Anome (talk) 22:06, 10 September 2012 (UTC)
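
The suggestion in this thread (accept the HTML4 named entities in wikitext, emit numeric entities in the HTML, and escape anything unrecognised) can be sketched roughly as follows. This is only an illustration in Python, not MediaWiki's actual Sanitizer code: the names `HTML4_ENTITIES`, `KEEP_NAMED` and `normalize_entities` are made up for the example, and keeping `lt`, `gt`, `amp` and `quot` as named entities is an assumption about what a sanitizer would want.

```python
import re
from html.entities import name2codepoint

# HTML4 named entities from the standard library, plus XHTML's &apos;,
# which HTML4 itself does not define (hypothetical table for this sketch).
HTML4_ENTITIES = dict(name2codepoint, apos=0x27)

# Assumption: markup-significant characters stay as named entities.
KEEP_NAMED = {"lt", "gt", "amp", "quot"}

# A named entity reference: ampersand, a name, and a trailing semicolon.
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def normalize_entities(text):
    """Rewrite known HTML4 named entities as decimal references.

    Names that are not in the HTML4 table (for example HTML5-only ones)
    have their ampersand escaped, so they show up literally on the page.
    """
    def replace(match):
        name = match.group(1)
        if name in KEEP_NAMED:
            return match.group(0)
        codepoint = HTML4_ENTITIES.get(name)
        if codepoint is None:
            # Unknown name: escape the ampersand, keep the rest as typed.
            return "&amp;" + match.group(0)[1:]
        return "&#%d;" % codepoint

    return ENTITY_RE.sub(replace, text)


if __name__ == "__main__":
    print(normalize_entities("10&nbsp;km, &Agrave; la carte, &Abreve;"))
```

With this sketch, `&nbsp;` and `&Agrave;` become `&#160;` and `&#192;`, while the HTML5-only `&Abreve;` is left visible as typed. References without a trailing semicolon (such as `&gt`) are not matched by the pattern at all and pass through unchanged.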