Growth/Personalized first day/Structured tasks/Add a link/Experiment analysis, December 2021

In May 2021, the Growth team added the "Add a Link" structured task to the "newcomer tasks" feature on the newcomer homepage. Structured tasks are ones that can be broken down into step-by-step workflows with simple steps that make sense to newcomers, and that are easy to do on mobile devices. Our hypotheses were that users would be more likely to complete Add a Link tasks than the unstructured task, and that this would result in an increase in our core metrics without decreasing edit quality.

To learn more about the impact of Add a Link, we deployed the feature in a controlled experiment. Users were randomly assigned to one of three groups: a control group that did not get access to any of the team's features (20%), a group that got access to the Growth features and where the default task assigned in the "newcomer tasks" module was Add a Link (40%), and a group that got access to the Growth features with the default task assigned being the unstructured "add links" task (40%). The experiment started on May 27, 2021 with four Wikipedias: Arabic, Bengali, Czech, and Vietnamese. Six additional Wikipedias were added on July 21 of the same year: Russian, French, Polish, Romanian, Persian, and Hungarian. Our analysis uses data from these wikis to October 14, 2021.

Summary of findings
In general, the analysis showed that the Add a Link structured task improves outcomes for newcomers. The most important points are:


 * Newcomers who get the Add a Link structured task are more likely to be "activated" (i.e. make a first article edit).
 * They are also more likely to be retained (i.e. come back and make another article edit on a different day).
 * The feature also increases edit volume (i.e. the number of edits made across the first couple of weeks), while at the same time improving edit quality (i.e. reducing the likelihood that the newcomer's edits are reverted).

Glossary

 * As of mid-March 2022, eleven wikis have the Add a Link structured task. In our experiment we analyze data from ten of them: Arabic, Bengali, Czech, Vietnamese, Russian, French, Polish, Romanian, Persian, and Hungarian.
 * Not all newcomers received the Add a Link structured task: 20% of users were randomly chosen to get the default newcomer experience, which does not have access to any of the Growth features. We refer to this group as the control group. 40% were randomly chosen to get the Growth features with the default task in the newcomer tasks module set to the Add a Link structured task, and we call that group the Add a Link group. The remaining 40% were also given the Growth features but with no access to the Add a Link structured task and the default task in the newcomer tasks module set to the unstructured link task. We refer to this group as the unstructured task group.
 * Constructive activation is defined as a newcomer making their first edit within 24 hours of registration, and that edit not being reverted within 48 hours. The baseline constructive activation rate is the rate of constructive activation for the control group.
 * Activation is defined in the same way as constructive activation, but without the non-revert requirement.
 * Constructive retention is defined as a newcomer coming back on a different day in the two weeks after constructive activation and making another edit, with said edit also not being reverted within 48 hours.
 * Retention is defined in the same way as constructive retention, but without the non-revert requirements.
 * Constructive edit volume is the overall count of edits made in a user's first two weeks, with edits that were reverted within 48 hours removed. The baseline constructive edit volume is the count for users in the control group.
 * Revert rate is the proportion of edits that were reverted within 48 hours out of all edits made. This is by definition 0% for users who made no edits, and we generally exclude these users from the analysis.
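The definitions above can be sketched in code. This is a minimal illustration, not the team's actual pipeline: the `Edit` record, the function names, and the reading of "a different day" as "more than 24 hours after registration, within the following two weeks" are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Edit:
    """Hypothetical edit record; field names are assumptions for this sketch."""
    timestamp: datetime
    reverted_within_48h: bool


ACTIVATION_WINDOW = timedelta(hours=24)
TWO_WEEKS = timedelta(days=14)


def _counts(edit: Edit, constructive: bool) -> bool:
    # For the constructive variants, edits reverted within 48 hours are ignored.
    return not (constructive and edit.reverted_within_48h)


def is_activated(registered: datetime, edits: List[Edit], constructive: bool = False) -> bool:
    # An edit within 24 hours of registration.
    end = registered + ACTIVATION_WINDOW
    return any(registered <= e.timestamp < end and _counts(e, constructive) for e in edits)


def is_retained(registered: datetime, edits: List[Edit], constructive: bool = False) -> bool:
    # Assumption: the retention window is the two weeks after the activation window.
    if not is_activated(registered, edits, constructive):
        return False
    start = registered + ACTIVATION_WINDOW
    end = start + TWO_WEEKS
    return any(start <= e.timestamp < end and _counts(e, constructive) for e in edits)


def constructive_edit_volume(registered: datetime, edits: List[Edit]) -> int:
    # Non-reverted edits made in the first two weeks after registration.
    end = registered + TWO_WEEKS
    return sum(1 for e in edits
               if registered <= e.timestamp < end and not e.reverted_within_48h)


def revert_rate(edits: List[Edit]) -> Optional[float]:
    # Users without edits are excluded from the analysis, so return None for them.
    if not edits:
        return None
    return sum(e.reverted_within_48h for e in edits) / len(edits)


# A made-up newcomer: three edits, the second of which was reverted.
registered = datetime(2021, 6, 1, 12, 0)
example_edits = [
    Edit(registered + timedelta(hours=2), False),
    Edit(registered + timedelta(days=3), True),
    Edit(registered + timedelta(days=5), False),
]
print(is_activated(registered, example_edits, constructive=True))
print(is_retained(registered, example_edits, constructive=True))
print(constructive_edit_volume(registered, example_edits))
print(revert_rate(example_edits))
```

In this made-up example the newcomer is constructively activated (a non-reverted edit two hours after registration) and constructively retained (a non-reverted edit on day five), with two constructive edits and one of three edits reverted.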

Detailed findings
Below are the specific impacts estimated from the controlled experiment. These are based on observing 130,179 new accounts on the ten wikis between May and October 2021. For more specifics on the methodology, see "Methodology" below.





Activation
In this analysis, we focus on the Article and Article talk namespaces because 1) the Add a Link task asks users to edit articles, and 2) we've seen significant positive effects on activation in related analyses.


 * Activation: newcomers who get the Add a Link structured task are 11.7% more likely to make a first article edit compared to the baseline. Across our dataset, the baseline activation rate in the Control group is 27.2%. The activation rate for users who get the Add a Link task is 30.4%, which is an 11.7% relative increase over the baseline. Users in the Unstructured task group have an activation rate of 28.6% (5.4% relative increase over the baseline), a rate that is both significantly lower than that of the Add a Link group and significantly higher than that of the Control group.
 * Constructive activation: we find a larger effect of the Add a Link structured task when it comes to non-reverted activation edits. For the Control group the baseline constructive activation rate is 20.7%. The rate in the Add a Link group is 24.2%, which is a 16.6% relative increase over the baseline. Users who get the unstructured link task have a constructive activation rate of 22.0% (5.9% relative increase over the baseline), which is again a rate that is both significantly higher than that of the Control group and significantly lower than that of the Add a Link group.

When measuring across all namespaces, we find that the Control group and Unstructured task group have almost identical activation rates. The baseline activation rate in the Control group is 37.7%, and the baseline constructive activation rate is 31.4%. The activation rate in the Add a Link group is significantly higher in both cases: overall activation rate 39.0%, which is a relative increase of 3.6%; constructive activation rate 33.1%, which is a relative increase of 5.4%.

Retention




As we did for activation, we focus on the Article and Article talk namespaces for measuring retention.


 * Retention: newcomers who get the Add a Link structured task are 13.0% more likely to return and make one or more additional article edits compared to the baseline. The baseline retention rate in the Control group is 3.6%, while the retention rate in the Add a Link group is 4.1% (for a 13.0% relative increase). For the Unstructured task group the retention rate is 3.9%, a rate that is neither significantly higher than that of the Control group nor significantly lower than that of the Add a Link group.
 * Constructive retention: we also find a larger relative increase in retention for the Add a Link structured task group when it comes to non-reverted retention edits. The baseline constructive retention rate in the Control group is 2.9%, while the rate in the Add a Link group is 3.4%, which is a 16.2% relative increase. Newcomers in the Unstructured task group have a constructive retention rate of 3.2%, which again is not significantly different from the rate in either the Control group or the Add a Link group.

We also find that retention is strongly associated with the amount of activity a newcomer has on their first day. This is expected, as it is something we have seen in previous analyses of Newcomer Tasks. When taking first-day activity into account, the above-mentioned differences in retention disappear. In our earlier analysis of Newcomer Tasks we hypothesized that the Growth features have a positive impact on retention because they have a positive impact on activation, and that this impact then carries over into the retention period. These findings from the Add a Link experiment provide further support for that hypothesis.

Edit volume


As we did for activation and retention, we focus on the Article and Article talk namespaces in our analysis of edit volume. We also limit this analysis to constructive (non-reverted) edits. This is partly due to the amount of time it takes to fit one statistical model (24–48 hours). It is also because the activation, retention, and revert rate analyses had all shown positive indicators before we started the edit volume analysis.

We find that the constructive edit volume during the first two weeks for newcomers who get the Add a Link structured task is 18.7% higher than the baseline. The baseline constructive edit volume in the Control group is 0.319 edits, whereas in the Add a Link group it is 0.378 edits. This difference of 0.059 edits is a relative increase of 18.7%. In other words:


 * 1,000 newcomers without Add a Link would make 319 constructive article edits.
 * 1,000 newcomers with Add a Link would make 378 constructive article edits.

This increase reflects both that Add a Link increases the likelihood that a newcomer makes an initial article edit and that some newcomers go on to make multiple edits.

We do not find a significant difference in edit volume between the Control group and the Unstructured task group.

Revert rate


When it comes to reverts, we again focus on the Article and Article talk namespaces because that is where Add a Link asks newcomers to edit. In addition, it does not make sense to measure reverts for users who make no edits, so this analysis is limited to users who made at least one edit in those namespaces in the first two weeks after registration.

We find that the revert rate for newcomers who get Add a Link is 11.0% lower than the baseline. Newcomers in the Control group have a baseline revert rate of 26.7%, while newcomers in the Add a Link group have a revert rate of 23.8%. This difference of -2.9 percentage points is a relative decrease of 11.0%.
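The distinction between a change in percentage points and a relative change can be made concrete with a short sketch. The helper names below are hypothetical. The inputs are the rounded rates quoted above, so the result can differ slightly in the last digit from the reported figures, which are presumably computed from unrounded estimates.

```python
def percentage_point_change(baseline: float, treatment: float) -> float:
    """Absolute difference between two rates given in percent."""
    return treatment - baseline


def relative_change(baseline: float, treatment: float) -> float:
    """Difference expressed as a percentage of the baseline rate."""
    return (treatment - baseline) / baseline * 100


# Rounded revert rates (in percent) for the Control and Add a Link groups.
control_rate = 26.7
add_a_link_rate = 23.8

pp = percentage_point_change(control_rate, add_a_link_rate)
rel = relative_change(control_rate, add_a_link_rate)
print(f"{pp:.1f} percentage points, {rel:.1f}% relative change")
```

The same two helpers apply to the activation, retention, and edit volume comparisons, with the Control group value as the baseline in each case.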

We do not find a significant difference in the revert rate between the Control group and the Unstructured task group.

Methodology
The Growth team deployed Add a Link to four Wikipedias on May 27, 2021: Arabic, Bengali, Czech, and Vietnamese. Six additional Wikipedias were added on July 21 of the same year: Russian, French, Polish, Romanian, Persian, and Hungarian. During the experiment, users were randomly assigned to one of three experiment groups: Control (20%), Add a Link (40%), and the Unstructured link task (40%). In the Control group, newcomers receive no access to any of the Growth features. Users in the Add a Link and Unstructured link task groups receive access to the Growth features, but differ in what type of tasks they have available in the Newcomer Tasks module. Users in the Add a Link group have the Add a Link structured task as the only default task, while users in the Unstructured link task group have the unstructured link task as the only default task. Users in the two groups cannot change the type of link task they have access to, but they can turn on the other tasks available (e.g. copy edits, adding references).
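A 20/40/40 random assignment of this kind can be sketched as follows. This is only an illustration of weighted random assignment; it is not the code used in the Growth features, and the group labels, function name, and seed are assumptions made for the example.

```python
import random
from collections import Counter

# Hypothetical group labels with the assignment probabilities from the text.
GROUPS = ["control", "add_a_link", "unstructured_link_task"]
WEIGHTS = [0.2, 0.4, 0.4]


def assign_group(rng: random.Random) -> str:
    """Draw one experiment group according to the 20/40/40 split."""
    return rng.choices(GROUPS, weights=WEIGHTS, k=1)[0]


# Simulate registrations with a fixed seed so the run is reproducible.
rng = random.Random(2021)
n_users = 100_000
assignment_counts = Counter(assign_group(rng) for _ in range(n_users))

for group in GROUPS:
    share = assignment_counts[group] / n_users
    print(f"{group}: {share:.1%}")
```

With a large number of simulated registrations, the observed shares land close to the configured weights, which is the property the experiment relies on.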

Users can turn the Growth features on and off in their user preferences at any point. If we find indications that they've done so, we exclude them from analysis. We also exclude known test accounts, users who registered through the API (these are mainly app registrations), bot accounts, and accounts that are autocreated.

The dataset for this analysis contains 130,179 accounts registered between the start of the experiment and October 14, 2021. Of these, 27,336 (21.0%) are in the Control group, 51,364 (39.5%) are in the Add a Link group, and 51,479 (39.5%) are in the Unstructured task group.
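As a quick sanity check, the group shares can be recomputed from the account counts given above. The dictionary keys below are just labels chosen for this sketch.

```python
# Account counts per experiment group, as reported in the text.
group_counts = {
    "Control": 27_336,
    "Add a Link": 51_364,
    "Unstructured task": 51_479,
}

total_accounts = sum(group_counts.values())

# Share of each group as a percentage of all accounts in the dataset.
group_shares = {
    group: 100 * count / total_accounts
    for group, count in group_counts.items()
}

print(f"Total accounts: {total_accounts:,}")
for group, share in group_shares.items():
    print(f"{group}: {share:.1f}%")
```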

Our analysis makes extensive use of multilevel (hierarchical) regression models, using the wiki as the grouping variable. This allows us to account for differences between the wikis in our analysis. For example, our activation models are multilevel logistic regression models, which means that they account for the inherent differences in activation rate between the wikis. We also know that editing activity follows a long-tail distribution, and therefore model the number of edits made using a zero-inflated negative binomial distribution. This model is also multilevel to allow both zero-inflation and the negative binomial distribution to vary by wiki. Lastly, our revert rate analysis uses a zero-one-inflated beta distribution. This is because revert rates calculated across a time window tend to fall into three categories: 1) the user has all of their edits reverted (one-inflation), 2) the user has none of their edits reverted (zero-inflation), and 3) the user has some of their edits reverted (resulting in a beta distribution). We again use a multilevel model so that these are calculated per wiki.
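To illustrate the three revert rate categories, the sketch below draws values from a zero-one-inflated beta distribution. It is a simulation for intuition only, not the multilevel model used in the analysis. The parameter values are made up, and the direct `p_zero`/`p_one` parameterization is one of several in use.

```python
import random
from collections import Counter


def sample_revert_rate(rng: random.Random, p_zero: float, p_one: float,
                       alpha: float, beta: float) -> float:
    """Draw one revert rate from a zero-one-inflated beta distribution."""
    u = rng.random()
    if u < p_zero:
        return 0.0  # zero-inflation: none of the user's edits reverted
    if u < p_zero + p_one:
        return 1.0  # one-inflation: all of the user's edits reverted
    return rng.betavariate(alpha, beta)  # some edits reverted


def categorize(rate: float) -> str:
    if rate == 0.0:
        return "none reverted"
    if rate == 1.0:
        return "all reverted"
    return "some reverted"


# Made-up parameters, chosen only to show the shape of the distribution.
rng = random.Random(42)
samples = [sample_revert_rate(rng, p_zero=0.6, p_one=0.15, alpha=2.0, beta=5.0)
           for _ in range(10_000)]
category_counts = Counter(categorize(r) for r in samples)

for category, count in category_counts.most_common():
    print(f"{category}: {count / len(samples):.1%}")
```

The point masses at 0 and 1 are what an ordinary beta distribution cannot represent, which is why the inflated variant is a natural fit for per-user revert rates.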