User:Brooke Vibber (WMF)/Linter alt text check 2024

From mediawiki.org

Dev diary

2024-02-09

Been working on a "low-priority" lint check for missing alt text. Its results can appear in Extension:Linter's charts, and can be found through specialized querying to surface available microcontributions in formatting cleanup for mobile apps.

This requires touching several distinct systems:

  • Parsoid needs to be extended to either include this check in its lints, or be extensible so a MediaWiki extension can reasonably add it
    • core needs to load a suitably extended Parsoid, which is pulled in via Composer, making it harder to work with
  • Extension:Linter needs to be extended to store data for that lint, including assigning an enumeration id and adding localized labels for the user interface
  • Potentially a second extension with a custom API, or a modification to Linter's API, could better serve requests like "give me a nicely randomized work queue item without unsightly duplicates"
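
As a rough illustration of that last bullet, here's a minimal sketch of a randomized, de-duplicated work queue. This is not Linter's API: the function name `pick_work_items` and the row shape (a dict with a `page` key) are made up for the example, and a real implementation would presumably do this in the database query rather than in memory.

```python
import random


def pick_work_items(lint_rows, count, rng=None):
    """Return up to `count` lint rows in random order, at most one per page.

    `lint_rows` is a hypothetical list of dicts with at least a "page" key.
    """
    if rng is None:
        rng = random.Random()
    by_page = {}
    for row in lint_rows:
        # Keep only the first row seen per page so no page shows up twice.
        by_page.setdefault(row["page"], row)
    candidates = list(by_page.values())
    rng.shuffle(candidates)
    return candidates[:count]


rows = [
    {"page": "Cat", "lint": "missing alt"},
    {"page": "Cat", "lint": "missing alt"},
    {"page": "Dog", "lint": "missing alt"},
]
queue = pick_work_items(rows, 2)  # two rows, one for "Cat" and one for "Dog"
```

De-duplicating before shuffling means a page with many lints is no more likely to be picked than a page with one, which may or may not be what the apps want.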

Currently I've gotten bogged down with actually testing any of my code because Extension:Linter records no linting data; as far as I can tell the linter bits in Parsoid are never even called.

Even with a stock install of Extension:Linter with stock MediaWiki core and Parsoid via composer, nothing gets recorded when adding unbalanced HTML to pages. There are notes on the talk page about the recommended configuration not working as of a year ago on various third-party sites, and some references to alternate configurations that refer to a Parsoid rest service, but it's kind of unclear what's going on.

Next steps: clean up dev tree and ask for help from content-transform team. ;)

2024-02-13

Got pointed in the right direction for running the lints: the default local configuration doesn't force a Parsoid parse after an edit, so even a VE edit wouldn't automatically fill the linter! Manually reloading the page into the editor without saving, or forcing parses with ParserMigration, helps with this.

Patches for Parsoid and Extension:Linter are in Gerrit!

Cleaned up and added test cases in response to initial feedback.

Note there's a potential decision point: we could decide to also emit a lint warning on explicitly set empty alt text. Currently the check passes through an empty but present alt attribute, as set via alt= on the wikitext file invocation, without flagging it.
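
To make the decision point concrete, here's a small sketch of the two behaviors. It is not the Parsoid patch: it scans plain HTML with Python's standard `html.parser` instead of working on Parsoid's DOM, and the `flag_empty` switch and the function names are invented for the example.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of <img> tags that lack alt text."""

    def __init__(self, flag_empty=False):
        super().__init__()
        self.flag_empty = flag_empty  # hypothetical switch for the stricter rule
        self.flagged = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        src = attr_map.get("src", "")
        if "alt" not in attr_map:
            # No alt attribute at all: always flagged.
            self.flagged.append(src)
        elif self.flag_empty and not attr_map["alt"]:
            # alt is present but empty: flagged only under the stricter rule.
            self.flagged.append(src)


def find_images_needing_alt(html, flag_empty=False):
    finder = MissingAltFinder(flag_empty=flag_empty)
    finder.feed(html)
    finder.close()
    return finder.flagged


html = '<img src="a.png"><img src="b.png" alt=""><img src="c.png" alt="A cat">'
print(find_images_needing_alt(html))                   # ['a.png']
print(find_images_needing_alt(html, flag_empty=True))  # ['a.png', 'b.png']
```

With the default, only the image with no alt attribute is reported, which matches the current behavior described above; the stricter mode is the option still under discussion.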

Folks in content-transform team don't envision a major performance problem from it matching on many pages, so it's mainly just process/workflow to double-check w/ users.

Content-transform are also interested in generalizing the linter API to make it pluggable, so we don't have to update Parsoid to introduce new lints. If we add future checks like this we should invest the time to do that; it will simplify maintenance and deployment greatly.
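
For a sense of what "pluggable" could look like, here's a toy registry sketch. Everything in it is hypothetical (the `register_lint` decorator, the `missing-alt` category name, the dict-shaped nodes); it doesn't reflect any actual or planned Parsoid interface, just the general shape of letting outside code add a check without touching the core loop.

```python
# Registry mapping a lint category name to its check function.
LINT_CHECKS = {}


def register_lint(category):
    """Decorator that registers a check under a (hypothetical) category name."""
    def decorator(check):
        LINT_CHECKS[category] = check
        return check
    return decorator


@register_lint("missing-alt")
def missing_alt(node):
    # Flag images with no alt attribute; an empty alt still counts as present.
    return node.get("tag") == "img" and "alt" not in node.get("attrs", {})


def run_lints(nodes):
    """Run every registered check over every node and collect the hits."""
    results = []
    for node in nodes:
        for category, check in LINT_CHECKS.items():
            if check(node):
                results.append((category, node))
    return results


nodes = [
    {"tag": "img", "attrs": {"src": "a.png"}},
    {"tag": "img", "attrs": {"src": "b.png", "alt": ""}},
]
hits = run_lints(nodes)  # only the first image is reported
```

The point of the shape is that `run_lints` never changes when a new check is registered, which is the maintenance win being described.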