VisualEditor on mobile/VE mobile default

From mediawiki.org

This page talks about the Editing Team's work experimenting with VisualEditor being the default mobile editor on a select number of wikis.

This initiative is part of our team's larger effort to simplify contributing on mobile, described in the Foundation's 2018-2019 annual plan. More specifically, it aims to increase the likelihood that newer contributors will succeed in making quick edits on the go by presenting them, by default, with a simpler and more visual editing interface.

Watching this page is a good way to be involved with and stay up-to-date about:

  • Where this change will go into effect
  • How we will measure the impact of this change
  • When this change will happen

If there is anything about this project you want to know, ask and/or talk about, please let us know on the talk page! Our team is eager to hear what you think...

Updates

06 April 2023: A/B test results

The Editing Team finished analyzing data from an A/B test of showing visual editor as the default editing interface on mobile to newcomers. The results are summarized below.

2 October 2020: Experiment design and analysis timing

We have made some changes to the A/B test's design. Those design changes are listed and explained below. We would value knowing what you think of them.

Before that, here is a summary of where we left off in November 2019...

Where we left off

In November 2019, we shared three updates about the status of the A/B test:

  1. We identified a data quality issue that invalidated the results of the initial test
  2. We fixed the issue and planned to re-run the A/B test
  3. We planned to have test results to share in January 2020

While this data quality issue has been fixed and the test has been re-run, we have not yet analyzed the test data.

Changes in test design

The objective of the A/B test has been, and continues to be, to determine which interface, when shown as the default, causes newcomers to have a better editing experience.

In service of the above, we are making two main refinements to the A/B test's design:

  • Definition of "better": we are changing the metric used to determine which editing interface is "better" from edit completion rate to editor retention.
  • Test population: the A/B test will be limited to people who have not edited on any platform before. Said another way: the behavior of people who have already edited on desktop or mobile will not be included in this test.

The rationale for these refinements is detailed below.

Rationale for changes in test design

  • Definition of "better"
    • Our priority is the long-term health of Wikimedia projects. This "health" depends on people continuing to edit, and retention is the best metric we have for measuring how likely it is that people who start editing will come back to do it again (and again).
  • Test population
    • The objective of this test is to determine which editing interface, when presented as the default, is more likely to cause new contributors to continue editing Wikipedia. As such, the behavior of people who have edited Wikipedia before, albeit on a different platform, should not influence this decision.

Next steps

We have not yet set dates for when this revised analysis will begin or when you can expect to see results. When we do, we will post an update to this page.

In the meantime, we would value hearing what shortcomings you see in the revisions we are planning to make to the definition of "better" and to the audience of editors that are included in this analysis.

6 November 2019: Data quality update

In our last update, we noted an oddity with the A/B test data: a significantly higher number of contributors in the test were being put into the "wikitext test bucket" than into the "visual editor test bucket."

Since then, we:

  1. Uncovered a likely cause of the issue
  2. Came up with a fix for the issue and
  3. Decided next steps

...more details below.

Issue

The two test buckets (default-source and default-visual) contained significantly different numbers of contributors. This makes it difficult to accurately compare the behavior of contributors in one test bucket to the behavior of contributors in the other.
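One common way to check for this kind of imbalance is a chi-square goodness-of-fit test against the intended 50/50 split. The sketch below is illustrative only: the bucket counts are made up, and this page does not say which check the team actually ran.

```python
import math


def sample_ratio_mismatch_p_value(n_source: int, n_visual: int) -> float:
    """P-value for a chi-square goodness-of-fit test against a 50/50 split."""
    total = n_source + n_visual
    expected = total / 2
    chi2 = ((n_source - expected) ** 2 + (n_visual - expected) ** 2) / expected
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


# Hypothetical counts, not taken from the actual test data.
balanced = sample_ratio_mismatch_p_value(5000, 5000)
skewed = sample_ratio_mismatch_p_value(5300, 4700)
```

A very small p-value, as in the skewed case, suggests that the split is unlikely to be the result of chance alone and that the assignment or logging should be investigated.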

Cause

After some deeper analysis, we uncovered a likely explanation for the issue: there was a percentage of people who never had their bucket assignment recorded.

Said another way: some contributors involved in the test would attempt to make an edit, be "assigned" to a test bucket, but that bucket assignment was never recorded.

The reason for this issue has to do with when and how the software "records" which test bucket a contributor is assigned to. It turned out that this recording happened after the editor code (mobile wikitext or mobile visual editor) finished loading. This means that in situations where the editor fails to load (e.g. because of a poor network connection) or a contributor cancels their edit, no test bucket assignment is recorded. And because the mobile visual editor code is larger than the mobile wikitext editor code, we can assume that a greater number of contributors who would have been placed in the default-visual test bucket did not have their bucket assignment recorded.

Fix

To fix this issue, a contributor's bucket assignment is now recorded in the code that loads the editor, rather than in the editor code itself. This way, the time it takes for the editor (mobile wikitext or mobile VE) to load no longer affects whether a contributor's bucket assignment is recorded.
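The effect of moving the recording step can be illustrated with a small simulation. This is a sketch under stated assumptions, not the team's code: the failure rates are hypothetical, chosen only to reflect the idea that the larger visual editor bundle fails or is abandoned more often while loading.

```python
import random

# Hypothetical load-failure rates (assumption, not measured values).
FAILURE_RATE = {"default-source": 0.05, "default-visual": 0.15}


def run_test(n_contributors: int, log_before_load: bool, seed: int = 0) -> dict:
    """Count recorded bucket assignments under one logging strategy."""
    rng = random.Random(seed)
    recorded = {"default-source": 0, "default-visual": 0}
    for _ in range(n_contributors):
        bucket = rng.choice(["default-source", "default-visual"])
        loaded = rng.random() >= FAILURE_RATE[bucket]
        # Original behaviour: the bucket is logged only if the editor loads.
        # Fixed behaviour: the loader logs the bucket before the editor loads.
        if log_before_load or loaded:
            recorded[bucket] += 1
    return recorded


before_fix = run_test(100_000, log_before_load=False)
after_fix = run_test(100_000, log_before_load=True)
```

In this sketch, `before_fix` under-counts the default-visual bucket because its assignments are lost more often, while `after_fix` records every assignment and the two buckets come out roughly equal.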

Next steps

In the coming weeks, we will be re-running the A/B test with the fix described in place. This way, we can ensure we have two equally balanced test buckets to compare.

This all means you can expect to see results from this test sometime in January, 2020 [or before].

18 September 2019: Test analysis

We have a few updates to share about the A/B test...

  1. Data quality: while doing an initial check to make sure the A/B test is functioning as intended, we noticed that a significantly higher number of contributors in the test were being put into the "wikitext test bucket" than into the "visual editor test bucket." This is odd, considering that participants in an A/B test should be randomly and evenly assigned to the two test groups. We are investigating further to understand what might be causing this issue and what its implications are. We will share an update when we have more information.
  2. Results: in the next couple of weeks, we plan to start analyzing the A/B test in full. This means you can expect there to be results to review on this project page in early November, provided the data quality issues are resolved.
  3. Deciding what to do next: after analyzing the results of the A/B test, we will need to decide whether to make VisualEditor the default mobile editing interface and, if so, which criteria contributors must meet to receive it. To help guide our decision making, we drafted a set of scenarios and plans of action (listed below). If you have thoughts on the courses of action we are proposing, please let us know on the talk page.

Test scenarios

Scenario | Description | Action(s)
A | default-visual has a higher* edit completion rate than default-source | Step 1: Create a proposal to make the VisualEditor the default mobile editing interface on all wikis. Step 2: Consult wikis about the results of the A/B tests and request feedback on the proposal.
B | default-visual has a lower* edit completion rate than default-source | Step 1: Leave wikitext as default on all non-test wikis. Step 2: Run additional analysis (qualitative and quantitative) to understand what might be the cause for contributors in the A/B test using wikitext being more likely to complete their edits than contributors in the test using the VisualEditor.
C | default-visual and default-source have the same* edit completion rates | Step 1: Compare A/B test groups across other metrics (e.g. edit revert rate). Step 2: Assuming VE does not encourage lower quality edits than those in wikitext, create a proposal to make the VisualEditor the default mobile editing interface on all wikis. Step 3: Consult wikis about the results of the A/B tests and request feedback on the proposal.

*Defining "higher" / "lower" / "the same": we will depend on our statistical tests to determine whether a change in the average contributor's edit completion rate (ECR) is large enough to be considered significantly higher or lower. Specifically, we plan to use a Mann–Whitney U test or, if time allows, a multilevel model incorporating per-user and per-wiki fixed effects.
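For readers unfamiliar with the Mann–Whitney U test, the following is a minimal sketch of how it could be applied to per-contributor edit completion rates. The sample values are hypothetical, and the p-value uses a plain normal approximation without tie or continuity correction, so it is only a rough guide; a real analysis would more likely rely on an established statistics library.

```python
import math


def mann_whitney_u(x: list, y: list) -> tuple:
    """Return (U statistic for x, two-sided p-value via normal approximation).

    Ties count as half a win. No tie or continuity correction is applied.
    """
    u = 0.0
    for xi in x:
        for yi in y:
            if xi > yi:
                u += 1.0
            elif xi == yi:
                u += 0.5
    n, m = len(x), len(y)
    mean_u = n * m / 2
    sd_u = math.sqrt(n * m * (n + m + 1) / 12)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value


# Hypothetical per-contributor edit completion rates for each bucket.
ecr_visual = [0.0, 0.5, 1.0, 1.0, 0.25]
ecr_source = [0.0, 0.0, 0.5, 0.2, 0.0]
u_stat, p = mann_whitney_u(ecr_visual, ecr_source)
```

The test compares ranks rather than means, which suits edit completion rates because many contributors have a rate of exactly 0 or 1.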

8 July 2019: Test starting tomorrow

Tomorrow, 9 July, the A/B test that will trial visual editor as the default editing interface on mobile will officially begin. This means that, of the contributors to the 20 participating wikis who meet the criteria listed below, 50% will see the visual editor after tapping edit on mobile and 50% will see wikitext.

Test criteria

All contributors to the 20 participating wikis, registered and unregistered, who have made fewer than 100 total edits and have never switched editing interfaces on mobile before will be included in the A/B test.*

*More information about how the A/B test will work can be found below: A/B test information.
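The criteria and the 50/50 split described above can be sketched as follows. Everything here is illustrative: the `Contributor` fields, the `token` identifier and the hash-based assignment are assumptions, since this page does not describe how the software actually assigns buckets.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Contributor:
    token: str                  # hypothetical stable ID (user name or session token)
    total_edits: int
    has_switched_editor: bool   # has ever switched editing interfaces on mobile


def is_eligible(c: Contributor) -> bool:
    """Criteria from the text: fewer than 100 edits, never switched on mobile."""
    return c.total_edits < 100 and not c.has_switched_editor


def assign_bucket(c: Contributor) -> Optional[str]:
    """Return 'default-visual' or 'default-source', or None if not in the test."""
    if not is_eligible(c):
        return None
    # Hashing the token gives a stable, roughly even split (an assumption;
    # the real assignment mechanism is not described in the source).
    digest = hashlib.sha256(c.token.encode("utf-8")).digest()
    return "default-visual" if digest[0] % 2 == 0 else "default-source"


newcomer = Contributor(token="example-user", total_edits=3, has_switched_editor=False)
veteran = Contributor(token="other-user", total_edits=250, has_switched_editor=False)
newcomer_bucket = assign_bucket(newcomer)
veteran_bucket = assign_bucket(veteran)
```

A deterministic hash is used here so that the same contributor always lands in the same bucket; a random draw stored alongside the account would serve the same purpose.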

26 June 2019: Participating wikis and test start date

After spending the past few weeks consulting with wikis about the prospect of being a part of the "VE as default" A/B test, we now have a list of 20 wikis that will be participating. These wikis are listed in the table below.

The A/B test itself will start within the next week. We will update this page once the test begins.

Participating wikis
Azerbaijani Danish Hungarian Portuguese Swedish
Bulgarian Estonian Malay Romanian Tamil
Cantonese Finnish Malayalam Santali Thai
Croatian Greek Norwegian Serbian Urdu

23 May 2019: Inviting wikis to participate

We are in the process of coming up with an initial list of wikis to invite to participate in this experiment.

Background

Figure: The mobile wikitext editing interface

We are striving to make editing on mobile web simpler. Research leads us to think newer contributors will have more success editing on mobile using the VisualEditor, given that it presents a more structured, visual and easily understood interface than wikitext, the editor that is currently presented to contributors by default on mobile. These are qualities the Design Research team has found to be especially important on mobile, where screen space is more limited than on desktop.

Despite what we have come to think are VisualEditor's advantages, few contributors are discovering and using it. In May 2018 for example, just 1% of mobile web edit sessions (or ~20,000 of the ~2 million total sessions recorded) happened in VE.

Defaults clearly impact how contributors experience editing Wikipedia. Defaults also express unspoken assumptions; in this context, an assumption about which editing experience will help the greatest number of contributors edit successfully on a mobile device. This project is about investigating that assumption. Our goal is to learn which mobile editing interface is more likely to cause newcomers to start and continue editing Wikipedia. The key metric we will use to evaluate and compare these editing interfaces is editor retention.

What is changing?

Figure: Switching between wikitext and VE on mobile

Currently, both editing interfaces (VE and wikitext) are available on mobile for contributors to use. However, the mobile wikitext editor is shown by default to new contributors, while the visual editor is accessible via a dropdown menu in the editing toolbar.

As part of this project, we will be working with a select number of wikis to switch which editing interface is shown to new contributors, on mobile, by default. This means that, on the subset of wikis we work with, the mobile VisualEditor will be shown to new contributors when they tap "edit", and wikitext will be available via a dropdown menu in the editing toolbar.

This test will not affect your account if any of the following are true:

  • You do not use the mobile site, e.g., https://he.m.wikipedia.org or https://fa.m.wikipedia.org.
  • You have previously switched editing interfaces on the mobile site on that wiki.
    • If you want to see what the mobile visual editor looks like, without accidentally opting your account out of this test, then please visit https://test.m.wikipedia.org/wiki/Sandbox and click the pencil icon to open the editor. It will begin in the mobile wikitext editor. Then use the new pencil icon in the toolbar to switch to the mobile visual editor. When you start typing, you will see the toolbar for the mobile visual editor.

A/B test

Figure: The mobile visual editor editing interface

To understand whether VE being the default mobile editing interface creates a "better" editing experience for new contributors, we will be running an A/B test.

Currently, all contributors are presented with the wikitext editor when they first tap "edit" on the mobile website. If a contributor then switches editors using the drop-down in the editing toolbar, their new choice is remembered when they tap "edit" in the future.
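The default-plus-remembered-choice behaviour described above can be modelled in a few lines. This is a hypothetical sketch: the class and method names are invented for illustration and do not correspond to the actual MediaWiki code.

```python
from typing import Optional


class MobileEditorPreference:
    """Tracks which editor opens when a contributor taps "edit"."""

    def __init__(self, default_editor: str = "wikitext") -> None:
        self.default_editor = default_editor
        self.saved_choice: Optional[str] = None  # set only after a manual switch

    def editor_to_open(self) -> str:
        # A remembered choice always wins over the default.
        return self.saved_choice or self.default_editor

    def switch_to(self, editor: str) -> None:
        # Switching via the toolbar drop-down is remembered for future edits.
        self.saved_choice = editor


pref = MobileEditorPreference()
first = pref.editor_to_open()
pref.switch_to("visual")
later = pref.editor_to_open()
```

Under this model, changing the default only affects contributors with no saved choice, which is consistent with the test excluding anyone who has previously switched editing interfaces.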

This analysis will include contributors who have never published an edit to Wikipedia before. Half of the people included in the analysis will have been randomly selected to receive the mobile visual editor when they first tap "edit", while the remaining half will have received the mobile wikitext editor as before. Contributors who do not edit using the mobile website, or who have published an edit to Wikipedia before, will not be included in this analysis.

Once the test is live, the question for us then becomes: which of these editing experiences is a "better" default for newer contributors?

To answer this question, we first need to define what "better" means so we can compare the two test groups and, ultimately, decide whether to explore making the mobile visual editor the default mobile editing interface for more contributors on more wikis.

Considering our priority is the long term health of Wikimedia projects and this "health" depends on new people starting and continuing to edit, we will use editor retention to determine which editing interface is "better" in this context.

There are other metrics that can be helpful in assessing the experience these two editing interfaces are causing people to have. Here are some of the other measures we will consider using to compare the two test groups:

  • Edit completion rate: Do contributors in one test group complete a greater percentage of edits they start than contributors in the other test group?
  • Percent of people who make one successful edit: Is the percentage of contributors who publish at least one edit greater in one test group than the other?
  • Total number of completed edits: Do contributors in one test group complete more edits than contributors in the other test group?
  • Edit quality: Are contributors’ edits in one test group more likely to be reverted than contributors’ edits in another test group?
  • Disruption: Are contributors in one test group switching between editing interfaces more often than contributors in another test group?
  • Load time: Is one editing interface faster to load than the other?

We are not sure if these are the right measures, so we are continuing to think about them. If there is a metric you think we should be considering that is not represented above or a metric you think we should be weighing more heavily than others, please let us know on the talk page. This list is still evolving.
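Two of the measures listed above, edit completion rate and edit quality (revert rate), could be computed per test group roughly as follows. The `EditAttempt` record and the sample log are hypothetical; the real analysis works from the team's own instrumentation data, whose schema is not described on this page.

```python
from dataclasses import dataclass


@dataclass
class EditAttempt:
    bucket: str        # "default-visual" or "default-source"
    published: bool    # the contributor saved the edit
    reverted: bool     # the published edit was later reverted


def summarize(attempts: list, bucket: str) -> dict:
    """Edit completion rate and revert rate for one test bucket."""
    started = [a for a in attempts if a.bucket == bucket]
    published = [a for a in started if a.published]
    reverted = [a for a in published if a.reverted]
    return {
        "edit_completion_rate": len(published) / len(started) if started else 0.0,
        "revert_rate": len(reverted) / len(published) if published else 0.0,
    }


# Hypothetical log: four edit attempts per bucket.
log = [
    EditAttempt("default-visual", True, True),
    EditAttempt("default-visual", True, False),
    EditAttempt("default-visual", False, False),
    EditAttempt("default-visual", False, False),
    EditAttempt("default-source", True, False),
    EditAttempt("default-source", False, False),
    EditAttempt("default-source", False, False),
    EditAttempt("default-source", False, False),
]
visual = summarize(log, "default-visual")
source = summarize(log, "default-source")
```

Note that the revert rate is computed over published edits only, so a group can have a higher completion rate and a higher revert rate at the same time.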


A/B test results

The A/B test ran on 20 Wikipedias from 1 November 2019 through 26 September 2022. It included new contributors, both logged-in and logged-out, who had not edited on any platform before.

What follows are the key results from this analysis:

  • The editing interface shown as default on mobile did not significantly increase or decrease the likelihood that a person would return to publish an edit on mobile after their first mobile edit. None of the differences listed in the sub-points below were determined to be statistically significant.
    • 1.7% of people who were shown visual editor as the default editor returned to make one or more additional mobile edits 2 weeks after their first mobile edit compared to 2.0% of people shown wikitext as the default editor.
    • 2.3% of people who were shown visual editor as the default editor returned to make one or more additional mobile edits 2 months after their first mobile edit compared to 2.5% of people shown wikitext as the default editor.
    • 0.9% of registered people who were shown visual editor as the default editor returned to make one or more additional mobile edits 6 months after their first mobile edit compared to 1.0% of registered people shown wikitext as the default editor.
  • People shown visual editor as the default editor were slightly more successful publishing the edits they started and slightly less successful publishing non-reverted edits. These differences are statistically significant, but the absolute differences are small (under 1 percentage point).
    • People who were shown the visual editor as the default editor published the edits they started at a rate (3.1%) that was 11.6% higher than the rate (2.8%) for people who were shown wikitext as the default editor.
    • Excluding all reverted edits, people who were shown the visual editor as the default editor published the edits they started at a rate (1.85%) that was 2.0% lower than the rate (1.88%) for people who were shown wikitext as the default editor.
  • People shown visual editor as the default editing interface were slightly more likely to successfully publish at least 1 mobile edit. This difference is statistically significant, but the absolute difference is small (under 1 percentage point).
    • 1.3% of people who were shown visual editor as the default editor were able to successfully publish at least one mobile edit during the A/B test, compared to 0.9% of people shown wikitext as the default editor. This represents a 44% increase.
  • People shown the visual editor as the default editing interface were more likely to be reverted. The difference is statistically significant and the absolute difference is medium (under 10 percentage points).
    • Edits by people shown the visual editor as the default editing interface were reverted at a rate (40.1%) that was 26% higher than the rate (31.9%) for edits by people shown wikitext as the default editor.
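The results above mix two kinds of comparison: absolute differences (in percentage points) and relative changes (in percent). The short sketch below recomputes both from the rounded rates quoted in the list; the function name is made up for illustration.

```python
def compare_rates(treatment_pct: float, control_pct: float) -> tuple:
    """Return (absolute difference in percentage points, relative change in %)."""
    absolute = treatment_pct - control_pct
    relative = absolute / control_pct * 100
    return absolute, relative


# Rounded rates as reported above (visual editor first, wikitext second).
first_edit = compare_rates(1.3, 0.9)    # published at least one mobile edit
reverts = compare_rates(40.1, 31.9)     # revert rate
```

Here `first_edit` gives about 0.4 percentage points and about 44%, and `reverts` gives about 8.2 percentage points and about 26%, matching the figures in the list. Recomputing the edit completion comparison from the rounded rates (3.1% versus 2.8%) gives roughly 10.7% rather than the reported 11.6%, presumably because the report worked from unrounded values.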

The full A/B test report can be found here: Mobile VE as Default AB Test Analysis Report

Glossary