Topic on Extension talk:Linter

Suggestion: API for fetching lint errors for a specific revision

Summary by Elitre (WMF)
197.218.80.192 (talk | contribs)

Use cases:

User - interested in finding out how many lint errors were added, on average, per revision of a specific article (perhaps because there is complicated markup there).

Researcher - interested in looking at the burden (e.g. cleanup effort) that new or older editors cause other editors.

Tool / script developer - interested in finding out in which revision an error was introduced, in order to revert it or to identify the culprit.

Use in extensions - for example, recentchanges could theoretically flag every revision that contains a lint error.
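To illustrate the tool / script developer case, here is a minimal sketch in Python. It assumes per-revision lint error counts have already been obtained somehow (the `history` list below is made-up illustrative data, and `revisions_adding_errors` is a hypothetical helper, not part of the Linter extension):

```python
def revisions_adding_errors(history):
    """Return (revid, added) pairs for revisions whose lint error count
    rose compared with the previous revision.

    history: list of (revid, lint_error_count) tuples, oldest first.
    The first revision is compared against a count of zero.
    """
    flagged = []
    previous = 0
    for revid, count in history:
        if count > previous:
            flagged.append((revid, count - previous))
        previous = count
    return flagged


# Made-up example data: revision IDs with their total lint error counts.
history = [(101, 0), (102, 2), (103, 2), (104, 5), (105, 1)]
print(revisions_adding_errors(history))  # [(102, 2), (104, 3)]
```

Comparing only totals is a simplification: an edit that fixes one error and adds another would go unnoticed, so a real tool would probably compare error types or locations as well.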

Background

Even in its current state, the extension makes it possible to do a lot of analysis on existing data. In addition to the use cases presented above, one could for example look into historical data, e.g. run extra analysis on the Research:VisualEditor%27s_effect_on_newly_registered_editors/June_2013_study dataset to compare the number of errors introduced on page creation by VisualEditor vs. wikitext editors, or to look at wikitext editor errors alone.

This may also be used in the ORES tool (which works on revisions) by giving extra information that can be used to help identify possible revisions containing vandalism (vandals might generally not know wikitext markup).

One possibility would be for an individual to look at their own contributions, and evaluate whether there are patterns of incorrect markup they leave behind that they can improve on. This could also be used by editors to either see if a newbie needs help, or to identify a possible vandal.

Proposed solution

A new API endpoint, e.g.:

api.php?action=query&revids=478198|54872|54894545&prop=linterrors&leprop=count|type|...

Unlike fetching lint errors for arbitrary text (which is useful by itself), this allows for much more flexibility and analysis, without using database dumps or complex scripts.
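As a rough sketch of how a script might build such a request, assuming the proposal were implemented as written: `prop=linterrors` and `leprop` are the proposed names from the example above and do not exist in the API today, and the base URL and the `build_lint_query` helper are placeholders made up for illustration.

```python
from urllib.parse import urlencode


def build_lint_query(revids, leprop=("count", "type"),
                     base="https://example.org/w/api.php"):
    """Build a query URL for the *proposed* prop=linterrors module.

    Assumption: the module and its leprop parameter are only a proposal;
    no such API module is known to exist. The base URL is a placeholder.
    """
    params = {
        "action": "query",
        "format": "json",
        # The MediaWiki API separates multiple values with "|".
        "revids": "|".join(str(r) for r in revids),
        "prop": "linterrors",
        "leprop": "|".join(leprop),
    }
    # urlencode percent-encodes "|" as "%7C".
    return base + "?" + urlencode(params)


print(build_lint_query([478198, 54872, 54894545]))
```

The function only builds the URL; sending the request and parsing the response are left out, since the response format has not been defined anywhere in this proposal.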

Elitre (WMF) (talk | contribs)
SSastry (WMF) (talk | contribs)

The specific form of this is a bit harder to support, since Linter is backed by Parsoid right now. So maybe a separate endpoint similar to T163091 ... but this will be a lower-priority one. But sure, file the suggestion in Phab. Good to have it there rather than have it get lost on wiki. Thanks!