Edit check/Status/2025

From mediawiki.org

Edit check updates:

Paste Check

Diagram of the proposed Paste Check user experience, outlining the ways people can respond to Paste Check and what will happen as a result.

The team is returning to work on Paste Check. The current focus is clarifying what actions and instructions Paste Check will present to people.

Seeking feedback: we are particularly interested in learning what adjustments you think would be worthwhile to make to the copy within the experience. Please let us know on the talk page.

In March 2025, we began an A/B test that removes the constraint on how many Reference Checks people have the potential to see within a single edit session.

This experiment is complete and its results are finalized. In short, the change caused a statistically significant increase in the proportion of edits that include a reference and are not reverted within 48 hours, without causing undesirable changes in edit completion or other forms of disruption (e.g. blocks).

You can review the findings below and see the full analysis report here.

Peacock check becomes Tone Check, model test to start soon

Over time, the scope of the peacock model has grown to the point that the model is now detecting more generic tone issues.

This, combined with the difficulty volunteers would likely face translating "Peacock" into other languages, led the Editing and Machine Learning teams to rename "Peacock Check" to "Tone Check."

We think "Tone Check" will still be specific enough for volunteers to understand its meaning while also being open-ended enough to allow the model to evolve through future training.

Tone Check will use a model trained by volunteers. This first test will be conducted for the English, Spanish, Portuguese, French, and Japanese languages. People are invited to sign up at Edit check/Tone Check/model test before May 23.

Peacock check model test to start soon

Peacock Check will use a model trained by users. We will soon start reaching out to a few communities (listed at T388471) to ask for volunteers to test the model.

We selected the wikis based on several criteria:

  • Technical reasons:
    • Wikis that use the “variants” feature (like Chinese) — because the model has to infer across different language varieties
    • Languages that don’t space-separate words (like Chinese, Japanese) — where the results will be very dependent on the tokenizer
    • Agglutinative languages (like Turkish, Indonesian) — where the model will be very dependent on the tokenizer
    • RTL (like Arabic, Hebrew) — because of potential user experience (UX) issues
  • Projects that see relatively high volumes of newcomers, specifically in Sub-Saharan Africa, and
  • Projects that have expressed a willingness to experiment with Peacock Check.
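As an illustration of the tokenizer dependence noted above, consider a naive whitespace tokenizer (a deliberately simplified sketch, not the tokenizer the actual model uses): it yields word tokens for English but a single undivided token for a language like Japanese, which does not space-separate words.

```python
# Simplified sketch (not the production model's tokenizer): splitting on
# whitespace works for space-separated languages but fails for languages
# like Japanese, which do not put spaces between words.
def whitespace_tokenize(text: str) -> list[str]:
    """Naive tokenizer: split the input on runs of whitespace."""
    return text.split()

english = "This city is very beautiful"
japanese = "この街はとても美しい"  # same sentence, no word separators

print(whitespace_tokenize(english))   # ['This', 'city', 'is', 'very', 'beautiful']
print(whitespace_tokenize(japanese))  # ['この街はとても美しい'] - one undivided token
```

A real model would instead rely on a subword or character-level tokenizer, which is why results in these languages depend so heavily on that choice.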

A/B Test: Multiple Reference Checks

An analysis of leading indicators for the Multiple Reference Check A/B test is complete and the results are encouraging:

  • New(er) volunteers are encountering Multiple Reference Checks in enough editing sessions to draw statistically significant conclusions from
  • People shown multiple Reference Checks within an edit session are proceeding to publish edits that include ≥1 reference at relatively high rates
  • Showing people multiple Reference Checks within an edit session is not leading to increases in revert rate or blocks

You can review these results in more detail below.

For context, leading indicator analyses of this sort are meant to uncover what – if any – adjustments we will consider prioritizing before evaluating the broader impact of the feature in question.

Peacock Check

Screenshot of a work-in-progress design of Peacock Check.

In collaboration with the Machine Learning team, the Editing team has started working on a new check: Peacock Check (T368274). This check will detect the use of puffery terms and encourage the user to change them.

We are currently gathering on-wiki policies, templates used to tag non-neutral articles, and the terms (jargon) used in edit summaries for 20 wikis.

A/B Test: Multiple Reference Checks

Yesterday (25 March), an A/B test began at 12 Wikipedias that removes the constraint on how many Reference Checks people have the potential to see within a single edit.

Note: at present, a maximum of one Reference Check is shown per edit.

This experiment is an effort to learn: What – if any – changes in edit quality and completion do we observe when people have the potential to see multiple Reference Checks within a single edit?

The findings from this A/B test will be relevant for the near-term future where multiple Edit Checks of the same and/or different types (e.g. Peacock Check, Paste Check, etc.) have the potential to become activated within a single edit session.

Line chart showing the daily revert rate for new content edits where Reference Check is shown.

Curious to learn if and how the new desktop Edit Check experience might have impacted edit quality, we investigated how the number of reverted edits changed before and after the December 2024 release.

We learned:

  1. The revert rate of new content edits where Reference Check was presented decreased by 15.7% (from 20.4% pre-change to 17.2% post-change).
  2. There was an 8% increase in the proportion of new content edits that included a reference following the change (from 34.8% pre-change to 37.9% post-change).
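For clarity on how the revert-rate figure is computed, the 15.7% decrease above is a relative change between the pre- and post-change rates, not a percentage-point difference:

```python
# Worked check: the 15.7% figure is a relative (not percentage-point) change
# between the pre-change (20.4%) and post-change (17.2%) revert rates.
def relative_change_pct(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed as a percentage."""
    return (after - before) / before * 100

change = relative_change_pct(20.4, 17.2)
print(round(change, 1))  # -15.7, i.e. a 15.7% relative decrease
```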

See the full report for more details.

In December, we released a new design for the Edit Check desktop experience.

Last week, we finished an analysis that compared how – if at all – several key metrics shifted before and after this change.

The purpose of this analysis was to decide whether there were any changes to the user experience we ought to prioritize before beginning an A/B test that will evaluate the impact of allowing multiple Reference Checks to be shown within a single edit.

You can find a summary of the results below and the full report here.
