Moderator Tools/Automoderator/Measurement plan
This is a summary of the current draft of the Automoderator measurement plan, outlining how we will evaluate whether the project is meeting its goals and what impact it is having on Wikimedia projects.
The page is divided into three hypotheses we have about Automoderator. Each hypothesis has one or two top-level data points (the most important numbers we're interested in) followed by a table detailing our current research questions and the evaluation methods or metrics we'll use to test them. The research questions are informed both by our internal discussions on the project and by conversations we have had with editors (e.g. here on MediaWiki).
This document is not fixed or final and will change as we learn more. Unfortunately we can't guarantee that this page will stay up to date following the initial community discussions we have about it. We may find that some questions are not feasible to answer with the available data, or we might identify new questions further down the line. We aim to share any major changes in project updates.
We really want to know what you think about this plan on the project talk page - does this capture the main data points you think we should track? Is anything missing or do you have ideas we could incorporate? What data would help you decide whether this project was successful?
QN = Quantitative measure (data)
QL = Qualitative measure (e.g. surveys, unstructured feedback)
Hypothesis #1
Hypothesis: Automoderator will extend the reach of patrollers by reducing their overall workload in reviewing and reverting recent changes, and effectively enabling them to spend more time on other activities.
Top level data:
- Automoderator has a baseline accuracy of 90%.
- Moderator editing activity increases by 10% in non-patrolling workflows (e.g. content contributions or other moderation processes).
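As a rough illustration of how an accuracy figure and a coverage figure trade off against each other, the sketch below computes both for a given score threshold. It assumes that "accuracy" means the share of Automoderator's reverts that were correct and that "coverage" means the share of all vandalism that Automoderator catches; the plan itself does not fix these definitions, and the `ScoredEdit` type, its fields, and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredEdit:
    score: float        # hypothetical revert-risk score in [0, 1]
    is_vandalism: bool  # hypothetical human judgement of the edit


def accuracy_and_coverage(edits, threshold):
    """Return (accuracy, coverage) if every edit scoring >= threshold were reverted.

    Assumed definitions (not taken from the plan):
      accuracy = correct reverts / all reverts made
      coverage = correct reverts / all vandalism edits
    Either value is None when its denominator is zero.
    """
    flagged = [e for e in edits if e.score >= threshold]
    vandalism = [e for e in edits if e.is_vandalism]
    correct = sum(1 for e in flagged if e.is_vandalism)
    accuracy = correct / len(flagged) if flagged else None
    coverage = correct / len(vandalism) if vandalism else None
    return accuracy, coverage


# Made-up sample data, for illustration only.
SAMPLE = [
    ScoredEdit(0.99, True),
    ScoredEdit(0.97, True),
    ScoredEdit(0.96, False),
    ScoredEdit(0.90, True),
    ScoredEdit(0.40, False),
]

if __name__ == "__main__":
    for threshold in (0.98, 0.95, 0.85):
        print(threshold, accuracy_and_coverage(SAMPLE, threshold))
```

In this toy data, raising the threshold increases accuracy but lowers coverage, which is the trade-off a community would be choosing between when configuring the tool.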
Research questions | Evaluation method/metric(s) | Notes |
---|---|---|
Is Automoderator effective in countering vandalism on wikis? | [QN] While the thresholds for success can vary based on the community, the team would consider the following as successes: | We don't yet know what a reasonable level of coverage is for Automoderator, so we will define X as we progress with the project. Each community will be able to customise the accuracy and coverage level for their community, so 90% would be a baseline figure applying to the most permissive option available. |
- | [QN] How long does vandalism stay in articles before being reverted, and how many readers see that vandalism? | Pageview data is not currently available on a per-revision basis, but this is something we can start collecting (T346350). |
Does Automoderator reduce the workload of human patrollers in countering vandalism? | [QN] Proportion of edits reverted by Automoderator, human patrollers, and tool-assisted human patrollers across the time periods of 1 hr, 8 hrs, 24 hrs, and 48 hrs after an edit takes place. | 'Tool-assisted human patrollers' means patrollers using tools like Huggle and SWViewer. |
- | [QN/QL] Does the volume of various content moderation backlogs reduce? | Here we are hypothesising that patrollers might spend their additional time in other venues. We may need to start with some qualitative research here to understand which backlogs we can/should monitor. |
Does Automoderator help patrollers spend their time on other activities of their interest? | [QN] Distribution of contributions/actions (pre and post deployment) by patrollers across: Tentative list of contributions. The patrollers of the pilot wikis will be surveyed to | There is a wide range of possible ways to look at this, so we may need to speak to patrollers to understand which activities to consider. |
- | [QL] Perception of patrollers in how they are contributing to the wiki post-deployment. Qualitative changes in workflows compared to pre-Automoderator deployment. That is, are they actually doing non-patroller work, or simply more specialized patroller work that Automoderator can’t handle? | |
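The revert-share metric in the table above could be computed along the following lines. This is a minimal sketch, not the team's actual query: it assumes each revert is available as an (edit time, revert time, performer) record, that the proportion is taken over all reverts that happened within the window, and that the performer labels shown are the ones in use. All names and sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta

# Time windows named in the plan, in hours.
WINDOWS_HOURS = (1, 8, 24, 48)


def revert_share_by_performer(reverts, window_hours):
    """Share of reverts made by each performer type within a time window.

    `reverts` is an iterable of (edit_time, revert_time, performer) tuples.
    Assumption: the denominator is the number of reverts that happened
    within `window_hours` of the original edit.
    """
    limit = timedelta(hours=window_hours)
    counts = Counter(
        performer
        for edit_time, revert_time, performer in reverts
        if revert_time - edit_time <= limit
    )
    total = sum(counts.values())
    return {p: n / total for p, n in counts.items()} if total else {}


# Made-up sample data, for illustration only.
T0 = datetime(2024, 1, 1, 12, 0)
SAMPLE_REVERTS = [
    (T0, T0 + timedelta(minutes=2), "automoderator"),
    (T0, T0 + timedelta(minutes=30), "tool_assisted_human"),
    (T0, T0 + timedelta(hours=5), "human"),
    (T0, T0 + timedelta(hours=30), "human"),
]

if __name__ == "__main__":
    for hours in WINDOWS_HOURS:
        print(hours, revert_share_by_performer(SAMPLE_REVERTS, hours))
```

Comparing these shares before and after deployment would indicate whether reverts are shifting from human patrollers to Automoderator.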
Hypothesis #2
Hypothesis: Communities are enthusiastic to use and engage with Automoderator because they trust that it is effective in countering vandalism.
Top level data:
- Automoderator is enabled on two Wikimedia projects by the end of FY23/24 (June 2024).
- 5% of patrollers engage with Automoderator tools and processes on projects where it is enabled.
Research questions | Evaluation method/metric(s) | Notes |
---|---|---|
Are communities enthusiastic to use Automoderator? | [QL] Sentiment towards Automoderator specifically and/or automated moderation tools broadly, both among administrators and non-administrator editors. [QL] Presence of custom documentation for Automoderator (e.g. guidance or guidelines on use). [QL] Uptake of Automoderator by specialized counter-vandalism groups (especially crosswiki ones): stewards, global sysops, SWMT. [QN] String (TranslateWiki) and documentation (MediaWiki) translation activity. | |
- | [QN] Do communities enable Automoderator, and keep it enabled? If so, for how long? | |
Are communities actively engaging with Automoderator because they believe it is an important part of their workflows? | Note: may change based on the final design/form Automoderator takes. [QN] What proportion of false positive report logs have been reviewed, and what proportion are yet to be reviewed? | |
- | Note: may change based on the final design/form Automoderator takes. [QN] What is the usage of model exploration/visualisation tools? | |
- | Note: may be expanded based on the final design/form Automoderator takes. [QN] How often is Automoderator’s configuration adjusted? | This may only be relevant when Automoderator is initially enabled and configured. After this we may not expect high activity levels. |
Are communities able to understand the impact of Automoderator on the health of their community? | [QL] UX testing of Automoderator configuration page and dashboards (if relevant) | On our first pilot wikis we may need to simply have a JSON or similar page, before Community Configuration is ready to provide a better front-end experience. |
Hypothesis #3
Hypothesis: When good faith edits are reverted by Automoderator, the editors in question are able to report false positives, and the revert actions are not detrimental to the editors’ journey, because it is clear that Automoderator is an automated tool which is not making a judgement about them individually.
Note: As editors’ experiences and journeys widely vary based on device, the following metrics where relevant should be split by platform and device.
Top level data:
- 90% of false positive reports receive a response or action from another editor.
Research questions | Evaluation method/metric(s) | Notes |
---|---|---|
Are good faith editors aware of the reverts made by Automoderator and able to report if they believe it is a false positive? | [QL/QN] What is the perception of good faith newcomers when their edit has been reverted by Automoderator? | This may be a survey, interviews, or using QuickSurveys. |
Are users who intend to submit a false positive report able to successfully submit one? | [QN] What proportion of users who have started the report filing process completed it? [QL] UX testing of the false positive reporting stream. | |
What is the effect of Automoderator in new editors’ contribution journey? | [QN] A/B experiment: Automoderator will randomly choose between taking and not taking a revert action on a newcomer (details to be defined). The treatment group will be newcomers on whom Automoderator takes a revert action, and the control group will be newcomers on whom Automoderator should have taken a revert action (based on the revert risk score) but did not, as part of the experiment, and on whom human moderators later took action. [QL] QuickSurveys or a similar short survey tool may be feasible. | Measuring retention and surveying new editors is hard, but we have a lot of experience with this at the Wikimedia Foundation in the Growth team. We will be meeting with them to learn more about the options we have for evaluating this research question. |
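The details of the A/B experiment are still to be defined, so the following is only a sketch of one possible approach, not the planned design. It assumes newcomers are bucketed by hashing a user identifier, so that the same newcomer always lands in the same group, and that retention is recorded as a simple returned/did-not-return flag. The function names, the salt, and the 50/50 split are all hypothetical.

```python
import hashlib


def assign_group(user_id, salt="automoderator-ab-sketch"):
    """Deterministically assign a newcomer to 'treatment' or 'control'.

    Hashing (salt, user_id) gives a stable, roughly even split without
    having to store the assignment. The salt and the 50/50 split are
    assumptions made for this sketch.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    return "treatment" if digest[0] % 2 == 0 else "control"


def retention_rate(returned_flags):
    """Share of newcomers in a group who returned to edit (None if the group is empty)."""
    flags = list(returned_flags)
    return sum(flags) / len(flags) if flags else None


if __name__ == "__main__":
    groups = {"treatment": 0, "control": 0}
    for user_id in range(1000):
        groups[assign_group(user_id)] += 1
    print(groups)
    print(retention_rate([True, False, True, True]))
```

Under the definition in the table, a control-group newcomer would only enter the analysis if a human moderator later took action on their edit; that filter would be applied to the control group before retention is compared.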
Guardrails
In addition to this goal-focused measurement plan, we are also planning to define 'guardrails' - metrics that we will monitor to ensure we're avoiding negative impacts of Automoderator. For example, do fewer new editors stick around because Automoderator reverts are frustrating, or do patrollers become too complacent because they put too much trust in Automoderator? These guardrails have not yet been documented, but we'll share them here once they have been.
If you have thoughts about what could go wrong with this project, and data points we could be monitoring to verify these scenarios, please let us know.
Pilot phase metrics
While the measurement plan can be helpful to understand and evaluate the impact of the project in the long term, we have identified some metrics to focus on for the pilot phase. The goal of these is to provide an overview of Automoderator's activity to the team and the community, and to monitor that nothing abnormal is happening. If you have suggestions for any other metrics that we should be tracking during the pilot phase, please leave a message on the talk page.
Indicator for | Metric(s) | Dimensions |
---|---|---|
Volume | Number of edits being reverted by Automoderator (absolute & percentage of all reverts) | Anonymous users, newcomers[1], non-newcomers[2] |
Accuracy (False positives) | Percentage of Automoderator's reverts reverted back | |
Accuracy (False negatives) | Proportion of reverts not performed by Automoderator while it is turned on | - |
Efficiency | Average time taken for Automoderator to revert an edit | - |
- | Average time taken for Automoderator's reverts to be reverted back | - |
Guardrail | Post deployment, proportion of edits reverted by performer | Automoderator, humans, and tool-assisted humans (if applicable) |
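Several of the pilot metrics above can be derived from a simple log of Automoderator's reverts. The sketch below is illustrative only: the `AutomodRevert` record, its field names, and the sample data are hypothetical, and it treats "reverted back" as a proxy for false positives, as the table does.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean
from typing import Optional


@dataclass(frozen=True)
class AutomodRevert:
    edit_time: datetime                     # when the original edit was saved
    revert_time: datetime                   # when Automoderator reverted it
    undone_time: Optional[datetime] = None  # when that revert was itself reverted, if ever


def pilot_summary(reverts, total_reverts_all_performers):
    """Summarise volume, accuracy (false positive proxy), and efficiency metrics."""
    reverts = list(reverts)
    undone = [r for r in reverts if r.undone_time is not None]
    total = total_reverts_all_performers
    return {
        "volume": len(reverts),
        "share_of_all_reverts": len(reverts) / total if total else None,
        "reverted_back_share": len(undone) / len(reverts) if reverts else None,
        "mean_seconds_to_revert": (
            mean((r.revert_time - r.edit_time).total_seconds() for r in reverts)
            if reverts else None
        ),
        "mean_seconds_until_reverted_back": (
            mean((r.undone_time - r.revert_time).total_seconds() for r in undone)
            if undone else None
        ),
    }


# Made-up sample data, for illustration only.
T0 = datetime(2024, 1, 1, 12, 0)
SAMPLE_LOG = [
    AutomodRevert(T0, T0 + timedelta(seconds=10)),
    AutomodRevert(T0, T0 + timedelta(seconds=20)),
    AutomodRevert(T0, T0 + timedelta(seconds=30)),
    AutomodRevert(T0, T0 + timedelta(seconds=60), T0 + timedelta(seconds=3660)),
]

if __name__ == "__main__":
    print(pilot_summary(SAMPLE_LOG, total_reverts_all_performers=20))
```

Splitting the same summary by the dimensions in the table (anonymous users, newcomers, non-newcomers) would only require grouping the log before calling the function.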