Help:Extension:Linter

The Linter extension identifies wikitext patterns that must or can be fixed in pages, and provides guidance about what the issues with those patterns are and how to fix them.

The Special:LintErrors page groups the errors by type. Some of these issues may be easier to find with Special:ExpandTemplates. On this page, we classify lint issues by severity, according to the goals that those issues block. More information and discussion about this is provided further below.

There is still a lot of work to do to eliminate noise, fix bugs, and make the linter output more actionable, so this page should be considered a work in progress.

deletable-table-tag
Help:Extension:Linter/deletable-table-tag

pwrap-bug-workaround (yet to be deployed)
Help:Extension:Linter/pwrap-bug-workaround

misnested-tag
Help:Extension:Linter/misnested-tag

Why and what to fix
Going forward, the parsing team plans to leverage the Linter extension to identify wikitext patterns:
 * that are erroneous (e.g. bogus image options, usually caused by typos or because media option parsing in MediaWiki is fragile)
 * that are deprecated (e.g. self-closing tags)
 * that can break because of changes to the parsing pipeline (e.g. replacing Tidy with RemexHTML)
 * that are no longer valid in HTML5 (e.g. obsolete tags like center, font)
 * that are potentially broken and can be misinterpreted by the parser compared to what the editor intended (e.g. unclosed HTML tags, misnested HTML tags)
Not all of them need to be fixed promptly, or even ever (depending on your tolerance for lint). Different goals are advanced by fixing different subsets of the above lint issues. We (the parsing team) will try to be transparent about these goals and about which issues advance which goals.
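
To make the kinds of patterns described above more concrete, here is a minimal Python sketch that flags two of them in a wikitext string: obsolete tags (center, font) and self-closed non-void tags. This is only an illustration and not how the Linter extension detects issues. The function name `find_lint_candidates`, the regular expressions, the list of void tags, and the `self-closed-tag` label are assumptions made for this example; only the `obsolete-tag` name is taken from this page.

```python
import re

# Obsolete tags named on this page; a real check would cover more (assumption).
OBSOLETE_TAGS = ("center", "font")
# Void elements that may legitimately be written self-closed (assumption).
VOID_TAGS = {"br", "hr", "img", "wbr"}

# Matches an opening obsolete tag such as <center> or <font color="red">.
OBSOLETE_RE = re.compile(
    r"<\s*(%s)\b[^>]*>" % "|".join(OBSOLETE_TAGS), re.IGNORECASE
)
# Matches any tag written in self-closed form, such as <span/> or <br />.
SELF_CLOSED_RE = re.compile(r"<\s*([a-zA-Z][a-zA-Z0-9]*)\b[^<>]*/\s*>")


def find_lint_candidates(wikitext):
    """Return a sorted list of (label, tag) pairs found in the wikitext."""
    found = set()
    for match in OBSOLETE_RE.finditer(wikitext):
        found.add(("obsolete-tag", match.group(1).lower()))
    for match in SELF_CLOSED_RE.finditer(wikitext):
        tag = match.group(1).lower()
        if tag not in VOID_TAGS:
            # Hypothetical label for deprecated self-closing tags.
            found.add(("self-closed-tag", tag))
    return sorted(found)


if __name__ == "__main__":
    sample = "<center>Hi</center> <span/> <br />"
    for label, tag in find_lint_candidates(sample):
        print(label, tag)
```

A regular expression cannot see the effects of template expansion or nesting, which is why patterns such as misnested or unclosed tags need an actual parser to detect reliably.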

Goal: Replacing Tidy
As part of addressing technical debt in the parsing pipeline of MediaWiki, we have been working to replace Tidy with an HTML5-based tool. However, doing so will break the rendering of a small subset of pages unless certain wikitext patterns are fixed. These patterns are the ones reported in the lint categories that affect Tidy replacement. To ensure that we don't stretch out this Tidy replacement for too long, we have marked all of these issues as high priority.

Goal: Improving compliance between rendering of the PHP parser and Parsoid
Right now, the HTML generated by the PHP parser is used for read views, and the HTML generated by Parsoid is used by editing tools and the Android app, among others. One of the parsing team's long-term objectives is to use Parsoid's output for both read views and editing. Since Parsoid and RemexHTML are both HTML5-based tools, the lint categories that affect RemexHTML's rendering also affect Parsoid's rendering. We haven't yet identified any additional lint issues that affect only Parsoid's rendering, but we will update this list as we identify them.

Goal: HTML5 output compliance
This is a somewhat complex goal, and we haven't yet arrived at an understanding of how important it is to pursue or how far we should go with it. Additionally, it is not yet clear what mechanisms we wish to leverage towards this goal. For example, based on discussion in different venues, User:Legoktm/HTML+MediaWiki outlines a proposal for handling the HTML5-deprecated big tag. In any case, fixing issues in categories such as obsolete-tag advances this goal. Given the lack of clarity around this goal, we have marked the obsolete-tag category as low priority.

Goal: Clarifying editor intent
Getting markup right is hard, and errors inadvertently creep in. While the parser does its best to recover from these errors, in many cases what the parser does might not truly reflect the editor's original intent. Given that, we recommend fixing the issues identified here to clarify the editor's intention. Issues in several lint categories, including missing-end-tag, seem to affect this goal. Since this is a fairly important goal, we have marked most of them with medium priority. However, we have marked the missing-end-tag category with low priority since, in the vast majority of cases, the parser does seem to recover fairly accurately. Nevertheless, we recommend fixing whatever can be fixed without too much effort, if only to assist comprehension by other human editors and tools.

Goal: Clean markup
Getting markup right is hard. Even in the presence of errors, the parser does a fairly decent job in most cases of figuring out how a piece of markup is supposed to render. But, in much the same way that typos, punctuation mistakes, and minor grammatical errors can feel unsettling, some editors or those with a developer mindset might find lint issues in these categories unsettling. We don't recommend spending an inordinate amount of time fixing these issues and, in many scenarios, bots might be able to fix them as well. Several lint categories affect this goal.