Manual:$wgUseTidy

Details
Use "HTML tidy" to make sure HTML output is sane.

HTML tidy is a free tool that fixes broken HTML. See HTML tidy and http://www.w3.org/People/Raggett/tidy/

You may wish to set up this tool and set $wgUseTidy = true to ensure that the wiki outputs reasonably clean and compliant HTML, even when malicious or foolish users add corrupt or badly formatted HTML to wiki pages.
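As a minimal sketch, enabling Tidy might look like the following fragment appended to LocalSettings.php. The binary path is an assumption for illustration only; use whatever location tidy is installed at on your server.

```php
// LocalSettings.php (fragment)

// Pass parser output through HTML Tidy before it is served.
$wgUseTidy = true;

// Assumption: the external tidy binary is installed at this path.
// Adjust it to match your system.
$wgTidyBin = '/usr/bin/tidy';
```

If the binary is missing or not executable by the web server user, the cleanup step cannot run, so it is worth checking the path before enabling the setting.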

Note that MediaWiki already performs some built-in checks and corrections on users' HTML, and limits the range of HTML tags and attributes that can be used (unless you set $wgRawHtml = true, which is dangerous). The limitations are described at meta:Help:HTML in wikitext, and the logic for this is found in includes/Sanitizer.php. As such, you may decide that running HTML tidy over the output is not necessary.

Configuration
The location of the tidy configuration file can be set using Manual:$wgTidyConf. Before MediaWiki 1.10 this was required; later versions provide a working default.
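For illustration, pointing MediaWiki at a custom Tidy configuration file could look like the sketch below. The file path is hypothetical; on MediaWiki 1.10 and later the line can be left out entirely so that the bundled default is used.

```php
// LocalSettings.php (fragment)

// Assumption: a site-specific Tidy configuration file kept at this
// hypothetical path. Omit this line to use MediaWiki's default.
$wgTidyConf = '/etc/mediawiki/tidy.conf';
```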

1.11 and Parser_OldPP
With the MediaWiki preprocessor used in versions 1.11 and earlier (and optionally available via $wgParserConf in 1.12), Tidy is required for complex HTML tag hierarchies in wikicode. HTML tags that are distributed (via transclusion), conditional (via parser functions), or mixed (wikicode/HTML) become escaped without Tidy. The new preprocessor handles these correctly without Tidy.

All versions
Tidy is still required to mix wiki table and HTML table syntax, as well as simple wikicode and HTML-style markup.


 * Mixed open/close tags.
 * '''foo
 * foo&lt;/b&gt;
 * foo
 * Definition list nesting
 * ; hi
 * one

Tidy can correct most bad HTML, whether it comes from bad user input, from conflicting or badly written extensions, or even from some bugs in the core software. However, it does not resolve all strict XHTML validation issues, such as duplicate XML id attribute values or ids starting with numbers.