Moderator Tools/Automoderator/Testing

The Moderator Tools team is building Automoderator - a tool which can automatically revert bad edits based on a machine learning model, performing a similar function to community anti-vandalism bots such as ClueBot NG, SeroBOT, and Dexbot. To help communities test and evaluate Automoderator's accuracy, we are making a test spreadsheet available with data on past edits and whether Automoderator would have reverted them or not.

Automoderator’s decisions result from a combination of a machine learning model score and internal settings. While the model will improve over time through re-training, we are also looking to improve its accuracy by defining additional internal rules. For instance, we have observed Automoderator occasionally misidentifying users reverting their own edits as vandalism. We are looking for similar cases and would appreciate your help in identifying them.

How to test Automoderator

 * If you have a Google account:
   * Make a copy of this spreadsheet by clicking File > Make a Copy ...
   * After your copy has loaded, click Share in the top corner, then give any level of access to avardhana@wikimedia.org, so that we can aggregate your responses to collect data on Automoderator's accuracy.
   * Alternatively, you can change 'General access' to 'Anyone with the link' and share a link with us directly on-wiki.
 * Alternatively, use this link to download the file.
   * After adding your decisions, please send the sheet back to us at avardhana@wikimedia.org, so that we can aggregate your responses to collect data on Automoderator's accuracy.

After accessing the spreadsheet...

TODO - Note about other languages can be added
 * 1) Follow the instructions in the sheet to select a random dataset, review 30 edits, and then uncover what decisions Automoderator would make for each edit.
 * 2) Feel free to explore the full data in the 'Edit data & scores' tab.
 * 3) If you want to review another dataset, please make a new copy of the sheet to avoid conflicting data.
 * 4) Join the discussion on the talk page.

About Automoderator
Automoderator’s model is trained exclusively on edits to Wikipedia’s main namespace, so its training data is limited to edits made to Wikipedia articles. Further details can be found below:

Internal configuration
In the current version of the spreadsheet, in addition to considering the model score, Automoderator does not take action on:


 * Edits made by administrators
 * Edits made by bots
 * Edits which are self-reverts
 * New page creations

The datasets and list above will be updated as testing progresses if we add new exclusions or configurations.
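
The exclusions listed above can be pictured as a simple check that runs before the model score is considered. The following is a minimal sketch only, not Automoderator's actual code; the `Edit` class and all of its field names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Edit:
    # Hypothetical field names; the spreadsheet's columns may be named differently.
    by_admin: bool = False
    by_bot: bool = False
    is_self_revert: bool = False
    is_page_creation: bool = False


def is_excluded(edit: Edit) -> bool:
    """Return True if the edit falls under one of the listed exclusions,
    meaning no action would be taken regardless of the model score."""
    return (
        edit.by_admin
        or edit.by_bot
        or edit.is_self_revert
        or edit.is_page_creation
    )
```

For example, `is_excluded(Edit(by_bot=True))` is true, while an edit with none of these flags set would go on to be judged by its score.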

Caution levels
In this test Automoderator has five 'caution' levels, defining the revert likelihood threshold above which Automoderator will revert an edit.


 * At high caution, Automoderator will need to be very confident to revert an edit. This means it will revert fewer edits overall, but do so with higher accuracy.


 * At low caution, Automoderator will be less strict about its confidence level. It will revert more edits, but be less accurate.
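
The relationship between caution level and threshold can be sketched as follows. This is an illustration under stated assumptions: the level numbering and every threshold value below are placeholders, since the real values are chosen by the Moderator Tools team and are not listed on this page.

```python
# Placeholder thresholds for illustration only; not the values used in the test.
CAUTION_THRESHOLDS = {
    5: 0.99,   # highest caution: fewest reverts, highest accuracy
    4: 0.985,
    3: 0.98,
    2: 0.975,
    1: 0.97,   # lowest caution: most reverts, lowest accuracy
}


def would_revert(score: float, caution_level: int) -> bool:
    """Revert only when the score is above the threshold for this level."""
    return score > CAUTION_THRESHOLDS[caution_level]


def count_reverts(scores: list[float], caution_level: int) -> int:
    """Count how many of the given scores would lead to a revert."""
    return sum(would_revert(s, caution_level) for s in scores)
```

With these placeholder values, an edit scored 0.98 would be reverted at level 1 but not at level 5, which is the trade-off described above: lowering caution increases the number of reverts at the cost of accuracy.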

The caution levels in this test have been set by the Moderator Tools team based on our observations of the model's accuracy and coverage. To illustrate the number of reverts expected at different caution levels, see below:

TODO - https://phabricator.wikimedia.org/T348869

Score an individual edit
If you want to get a Revert Risk score for an individual edit, you can do so with the LiftWing API ... TODO

https://en.wikipedia.org/wiki/User:Samwalton9_(WMF)/revertrisk.js

Note that this is just the model score, and does not take into account Automoderator's internal configurations as detailed above.
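
As a rough sketch of what such a request might look like, the Python below posts a revision ID to LiftWing and reads back the probability that the edit should be reverted. The endpoint URL, the model name (`revertrisk-language-agnostic`), the request fields, and the response shape are all assumptions not confirmed by this page, and the function names are hypothetical; please check the LiftWing documentation before relying on any of them.

```python
import json
import urllib.request

# Assumed endpoint and model name; verify against the LiftWing documentation.
LIFTWING_URL = (
    "https://api.wikimedia.org/service/lw/inference/v1/"
    "models/revertrisk-language-agnostic:predict"
)


def build_payload(rev_id: int, lang: str) -> bytes:
    """Encode the assumed request body: a revision ID and a wiki language code."""
    return json.dumps({"rev_id": rev_id, "lang": lang}).encode("utf-8")


def extract_score(response: dict) -> float:
    """Pull the revert probability out of the assumed response shape:
    {"output": {"probabilities": {"true": ..., "false": ...}}}"""
    return float(response["output"]["probabilities"]["true"])


def fetch_revert_risk(rev_id: int, lang: str = "en") -> float:
    """Send one request and return the model score for the given revision."""
    request = urllib.request.Request(
        LIFTWING_URL,
        data=build_payload(rev_id, lang),
        headers={
            "Content-Type": "application/json",
            "User-Agent": "automoderator-testing-example",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as reply:
        return extract_score(json.load(reply))
```

Calling `fetch_revert_risk` with a revision ID and a language code would return a value between 0 and 1. As with the note above, this is the raw model score only and does not reflect Automoderator's internal configuration.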

Further details