Help:Extension:Linter

From mediawiki.org
Revision as of 20:40, 24 April 2017 by SSastry (WMF) (talk | contribs) (First pass providing improved guidance around fixing lint issues.)

The Linter extension identifies wikitext patterns in pages that must or can be fixed, and provides some guidance about what the issues with those patterns are and how to fix them.

The Special:LintErrors page groups the errors by type. Some of these issues may be easier to find with Special:Expandtemplates. On this page, we will classify lint issues according to the severity of the issue vis-a-vis goals that are blocked by those issues. More information and discussion about this is provided further below.
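For tools and bots, the same per-category grouping can be approached through the MediaWiki action API. The following is a minimal sketch that only builds a query URL and makes no network request. It assumes that the Linter extension's `list=linterrors` query module and its `lntcategories` and `lntlimit` parameters are available on the target wiki; please check the wiki's own API help before relying on them. The endpoint path and the function name `build_lint_query_url` are illustrative assumptions.

```python
from urllib.parse import urlencode

# Assumption: standard action API endpoint path on mediawiki.org.
API_ENDPOINT = "https://www.mediawiki.org/w/api.php"


def build_lint_query_url(categories, limit=10):
    """Build a URL that asks for lint errors in the given categories.

    Assumes the `linterrors` list module with `lnt`-prefixed parameters;
    verify against the target wiki's API help.
    """
    params = {
        "action": "query",
        "list": "linterrors",
        # Multiple values are separated with "|" in the action API.
        "lntcategories": "|".join(categories),
        "lntlimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_lint_query_url(["self-closed-tag", "obsolete-tag"]))
```

The URL can then be fetched with any HTTP client. Restricting the query to one category at a time mirrors how Special:LintErrors presents the results.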

There is still a fair amount of work to do to eliminate noise and make the linter output more actionable. More guidance will be provided about things that should definitely be fixed. This page should therefore be considered a work in progress at this time.

High priority lint issues

deletable-table-tag

Help:Extension:Linter/deletable-table-tag

pwrap-bug-workaround (yet to be deployed)

Help:Extension:Linter/pwrap-bug-workaround

self-closed-tag

Help:Extension:Linter/self-closed-tag

Medium priority lint issues

bogus-image-options

Help:Extension:Linter/bogus-image-options

fostered

Help:Extension:Linter/fostered

misnested-tag

Help:Extension:Linter/misnested-tag

Low priority lint issues

missing-end-tag

Help:Extension:Linter/missing-end-tag

stripped-tag

Help:Extension:Linter/stripped-tag

obsolete-tag

Help:Extension:Linter/obsolete-tag

Why and what to fix

Going forward, the parsing team plans to leverage the Linter extension to identify wikitext patterns:

  • that are erroneous (ex: bogus image options -- usually caused by typos or because media option parsing in MediaWiki is fragile).
  • that are deprecated (ex: self-closing tags)
  • that can break because of changes to the parsing pipeline (ex: replacing Tidy with RemexHTML)
  • that are no longer valid in HTML5 (ex: obsolete tags like center, font)
  • that are potentially broken and can be misinterpreted by the parser compared to what the editor intended them to be (ex: unclosed HTML tags, misnested HTML tags)
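To make two of the patterns above concrete, here is a rough Python sketch that scans wikitext for self-closed non-void HTML tags and for obsolete tags. This is not how the Linter extension detects issues, and a regular expression cannot stand in for a real parser. The tag sets and the function name `rough_lint` are illustrative assumptions; only `center` and `font` are named on this page.

```python
import re

# Illustrative subset only: this page names center and font as obsolete tags.
OBSOLETE_TAGS = {"center", "font"}

# Assumption: a small sample of non-void HTML elements for which the
# self-closed form (e.g. <div/>) is the problematic pattern. Void elements
# such as <br /> and extension tags such as <ref name="x"/> are left alone.
NON_VOID_TAGS = {"div", "span", "p", "b", "i", "small"}

# Matches an opening tag, capturing its name and an optional trailing "/".
TAG_RE = re.compile(r"<\s*([a-zA-Z][a-zA-Z0-9]*)\b[^<>]*?(/?)\s*>")


def rough_lint(wikitext):
    """Return (category, tag_name) pairs in order of appearance."""
    findings = []
    for match in TAG_RE.finditer(wikitext):
        name = match.group(1).lower()
        self_closed = match.group(2) == "/"
        if self_closed and name in NON_VOID_TAGS:
            findings.append(("self-closed-tag", name))
        if name in OBSOLETE_TAGS:
            findings.append(("obsolete-tag", name))
    return findings


if __name__ == "__main__":
    sample = 'Intro <div/> text <center>old</center> <br /> <font color="red">x</font>'
    for category, tag in rough_lint(sample):
        print(category, tag)
```

Categories such as misnested-tag or fostered depend on how the whole page is parsed, so a line-level scan like this cannot find them.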

Not all of them need to be fixed promptly, or even ever (depending on your tolerance for lint). Different goals are advanced by fixing different subsets of the above lint issues. We (the parsing team) will try to be transparent about these goals and about which goals are advanced by fixing which issues.

Goal: Replacing Tidy

As part of addressing technical debt in the parsing pipeline of MediaWiki, we have been working to replace Tidy with an HTML5-based tool. However, doing so will break the rendering of a small subset of pages unless certain wikitext patterns are fixed. Specifically, these are the issues found in the deletable-table-tag, pwrap-bug-workaround, and self-closed-tag categories. To ensure that this Tidy replacement is not stretched out too long, we have marked all of these issues as high priority.

Goal: Improving compliance between rendering of the PHP parser and Parsoid

Right now, the HTML generated by the PHP parser is used for read views, and the HTML generated by Parsoid is used by editing tools and the Android app, among others. One of the parsing team's long-term objectives is to use Parsoid's output for read views as well as for editing. Since Parsoid and RemexHTML are both HTML5-based tools, the lint categories that affect RemexHTML's rendering also affect Parsoid's rendering. We have not yet identified any further lint issues that affect Parsoid's rendering, but we will update this list as we identify them.

Goal: HTML5 output compliance

This is a somewhat complex goal, and we have not yet arrived at an understanding of how important it is to pursue it or how far we should go with it. Additionally, it is not yet clear what mechanisms we wish to leverage towards this goal. For example, based on discussion in several venues, User:Legoktm/HTML+MediaWiki outlines a proposal for handling the big tag, which is deprecated in HTML5. In any case, fixing issues in the obsolete-tag and self-closed-tag categories advances this goal. Given the lack of clarity around this goal, we have marked the obsolete-tag category as low priority.

Goal: Clarifying editor intent

Getting markup right is hard. Errors inadvertently creep in. While the parser does its best to recover from these errors, in many cases what the parser does might not truly reflect the editor's original intent. Given that, we recommend fixing the issues identified here to clarify the editor's intention. Issues in the bogus-image-options, fostered, misnested-tag, and missing-end-tag categories seem to affect this goal. Since this is a fairly important goal, we have marked most of them as medium priority. However, we have marked the missing-end-tag category as low priority since, in the vast majority of cases, the parser does seem to recover fairly accurately. Nevertheless, we recommend fixing whatever can be fixed without too much effort, if only to assist comprehension by other human editors and tools.

Goal: Clean markup

Getting markup right is hard. Even in the presence of errors, the parser does a fairly decent job in most cases of figuring out how a piece of markup is supposed to render. But, in much the same way that typos, punctuation errors, and minor grammatical errors can feel unsettling, some editors, or those with a developer mindset, might find lint issues in these categories unsettling. We don't recommend spending an inordinate amount of time fixing these issues, and in many scenarios bots might be able to fix them as well. The misnested-tag, missing-end-tag, and stripped-tag lint categories affect this goal.
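The priorities and goals described on this page can be collected into a small lookup table, which may be handy for a bot or script that needs to decide what to fix first. The following Python sketch encodes only what this page states. The names `PRIORITY`, `GOALS`, `goals_for`, and `by_priority` are illustrative and not part of the Linter extension.

```python
# Priority of each lint category, as listed on this page.
PRIORITY = {
    "deletable-table-tag": "high",
    "pwrap-bug-workaround": "high",
    "self-closed-tag": "high",
    "bogus-image-options": "medium",
    "fostered": "medium",
    "misnested-tag": "medium",
    "missing-end-tag": "low",
    "stripped-tag": "low",
    "obsolete-tag": "low",
}

# Categories that each goal depends on. The Parsoid goal reuses the Tidy
# replacement categories, since both tools are HTML5-based.
TIDY_CATEGORIES = ["deletable-table-tag", "pwrap-bug-workaround", "self-closed-tag"]
GOALS = {
    "Replacing Tidy": TIDY_CATEGORIES,
    "PHP parser and Parsoid rendering compliance": TIDY_CATEGORIES,
    "HTML5 output compliance": ["obsolete-tag", "self-closed-tag"],
    "Clarifying editor intent": [
        "bogus-image-options", "fostered", "misnested-tag", "missing-end-tag",
    ],
    "Clean markup": ["misnested-tag", "missing-end-tag", "stripped-tag"],
}

RANK = {"high": 0, "medium": 1, "low": 2}


def goals_for(category):
    """Return the goals advanced by fixing issues in the given category."""
    return [goal for goal, cats in GOALS.items() if category in cats]


def by_priority(categories):
    """Sort categories from high to low priority, then alphabetically."""
    return sorted(categories, key=lambda c: (RANK[PRIORITY[c]], c))


if __name__ == "__main__":
    for category in by_priority(PRIORITY):
        print(PRIORITY[category], category, goals_for(category))
```

A category can advance more than one goal. For example, self-closed-tag appears under both the Tidy replacement and HTML5 output compliance goals.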