User:MWang (WMF)/Draft/sandbox

The Welcome Emails campaign was a collaboration between the Marketing, Community Relations, and Growth teams at the Wikimedia Foundation. Users who registered on the Spanish Wikipedia between 2022-05-10 and 2022-07-01 were asked to consent to receiving emails as part of filling out the Welcome Survey. Users who opted in and who were randomly assigned to a treatment group would later receive a welcome email with information about the movement and a link to the Newcomer Homepage. Users who were randomly assigned to a control group would not receive emails, regardless of whether they had consented to receive them. Similarly, users who did not consent would also not receive any emails.

Summary of findings
Our analysis found that the campaign significantly increased the likelihood that users would return to the wiki and make one or more article edits. Most of this improvement comes from users who registered on the mobile platform, where we see a significant increase in both the probability of returning to edit and the number of edits made during the first month after registration. Based on these findings, the teams are interested in further exploring the effects of sending emails to new users.

Glossary

 * A Welcome email is an email sent to users who opted in to receiving them. This email was designed by the Marketing team as described on the project page. It was not sent automatically by MediaWiki. Instead, the Growth team periodically exported a list of users who had opted in and shared it with the Marketing team, who then used their emailing tools to send the Welcome emails.
 * Constructive article activation is defined as making at least one edit to articles or article talk pages (the Main and Talk namespaces) within a certain time period (called the "activation period"), where that edit is not reverted within 48 hours. In this analysis we use two different time periods:
   * Short: the edits have to be made within 24 hours of registration.
   * Long: the edits have to be made within 1 week of registration.
 * Constructive article retention means a user who was activated makes at least one edit to articles or article talk pages within a certain time period after the activation time period (called the "retention period"). In this analysis we use two different time periods depending on what the activation time period was:
   * Short: two weeks after the initial 24 hours.
   * Long: three weeks after the initial week.
 * Productivity refers to the number of edits made to articles and article talk pages within a certain time period. In this analysis we only count non-reverted edits, and the time period always spans both the activation and retention time frames. This results in two time periods as follows:
   * Short: 15 days (24 hours + 14 days)
   * Long: one month (1 week + 3 weeks)
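
The glossary definitions can be sketched in code. The following is a minimal illustration, not the code used in the analysis: the `Edit` record, the `user_metrics` function, and the choice of half-open time intervals are assumptions made for this example, and the same "reverted within 48 hours" flag is assumed to identify non-reverted edits for all three measures. Namespace numbers 0 and 1 are MediaWiki's Main and Talk namespaces.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

ARTICLE_NAMESPACES = {0, 1}  # Main (0) and Talk (1) namespaces in MediaWiki

# (activation period, retention period) for each time span in the glossary
PERIODS = {
    "short": (timedelta(hours=24), timedelta(weeks=2)),
    "long": (timedelta(weeks=1), timedelta(weeks=3)),
}


@dataclass
class Edit:
    """Hypothetical edit record; the field names are assumptions."""
    timestamp: datetime
    namespace: int
    reverted_within_48h: bool


def user_metrics(registered, edits, span):
    """Return activation, retention, and productivity for one user."""
    activation_period, retention_period = PERIODS[span]
    activation_end = registered + activation_period
    retention_end = activation_end + retention_period

    # Only non-reverted edits to articles and article talk pages count.
    counted = [
        e for e in edits
        if e.namespace in ARTICLE_NAMESPACES and not e.reverted_within_48h
    ]

    activated = any(registered <= e.timestamp < activation_end for e in counted)
    # Retention requires prior activation plus an edit in the retention period.
    retained = activated and any(
        activation_end <= e.timestamp < retention_end for e in counted
    )
    # Productivity spans the activation and retention periods together.
    productivity = sum(registered <= e.timestamp < retention_end for e in counted)

    return {
        "activated": activated,
        "retained": retained,
        "productivity": productivity,
    }
```

Calling `user_metrics(registered, edits, "short")` and `user_metrics(registered, edits, "long")` for the same user can give different retention results, because the two spans place the boundary between the activation and retention periods at different points.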

Activation
A newcomer is activated when they make their first article (or article talk) edit within the short or long time spans, and that edit is not reverted. We find that the campaign had no effect on activation across either the short or long time spans. Table 1 gives an overview of the proportion of newcomers who were activated by making constructive article edits for both the Treatment and Control groups, and for both the short (24 hours) and long (1 week) time periods.

Retention
A newcomer is retained when they have previously been activated and they return to edit again within the retention period. In our case we only count non-reverted edits to articles and article talk pages, and use two or three week retention periods depending on the length of the activation period, as described above.

We find that the campaign increased retention, both for the short and long time periods. The effect is relatively moderate for the short time period and relatively large for the longer time period, as shown in Table 2.

Returning to edit
Being retained requires a user to also be activated. In the case of this campaign, if we only investigate retention we might miss an effect of the campaign on users who had not edited. We ran an interim analysis halfway through the campaign to determine whether it should end or continue, and found that it appeared to have a positive impact on users who had not activated (within the short time period). Based on this, we also chose to analyze the campaign's effect by comparing users who had not activated to those who had.

We find that the campaign's effect depended on whether a user registered on the desktop or mobile platform. For users who registered on the desktop website, we found no difference in the likelihood of users returning to the wiki to edit articles, while for users who registered on the mobile website we found significant increases in the probability that users returned.

Table 3 breaks these effects down in more detail. We show proportions for both users who did and did not activate. For users who activated we see a similar difference in the short time frame as we saw for overall retention, while for the long time frame we see a larger increase. We see small absolute differences for users who did not activate, but because the editing rates are very low they result in very large relative differences.
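
To illustrate why a small absolute difference can turn into a large relative difference when the baseline rate is low, here is a short sketch. The rates below are made-up numbers chosen for illustration only; they are not results from this campaign, and the function name `differences` is hypothetical.

```python
def differences(control_rate, treatment_rate):
    """Return the (absolute, relative) difference between two proportions."""
    absolute = treatment_rate - control_rate
    relative = absolute / control_rate
    return absolute, relative


# Hypothetical low baseline: 1.0% of Control users edit, 1.5% of Treatment.
low_abs, low_rel = differences(0.010, 0.015)

# Hypothetical higher baseline with the same absolute gap: 20.0% vs 20.5%.
high_abs, high_rel = differences(0.200, 0.205)

print(f"low baseline:  {low_abs:.3f} absolute, {low_rel:.1%} relative")
print(f"high baseline: {high_abs:.3f} absolute, {high_rel:.1%} relative")
```

Both hypothetical comparisons have the same absolute gap of half a percentage point, but the relative increase is about 50% at the low baseline and about 2.5% at the higher one.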

Productivity
When we say "productivity" we are referring to the number of non-reverted article and article talk edits made over a certain time period, as described above. Because we found the campaign to have stronger impacts over the long time period, we chose to focus our analysis of productivity on that time period as well.

We find results for productivity similar to those for users returning to edit: users who registered on the desktop platform saw no effect of the campaign, whereas users who registered on the mobile platform appear to have a small but significant increase in the average number of article edits made. Table 4 gives an overview, using the geometric mean to account for the large variation in the number of edits each user makes (many users make few or no edits, and a few users make many edits).
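
As a rough illustration of why a geometric mean suits skewed edit counts, the sketch below compares it with the arithmetic mean. The text above does not say how users with zero edits are handled, so the "add one, average the logarithms, subtract one" convention used here is an assumption, as are the function name `shifted_geometric_mean` and the example counts.

```python
import math


def shifted_geometric_mean(counts):
    """Geometric mean of (count + 1), minus 1, so that zeros are allowed.

    The +1 shift is an assumed convention for handling zero counts; it is
    not confirmed as the method used in this analysis.
    """
    if not counts:
        raise ValueError("counts must not be empty")
    mean_log = sum(math.log1p(c) for c in counts) / len(counts)
    return math.expm1(mean_log)


# Hypothetical edit counts: most users make few or no edits, one makes many.
edit_counts = [0, 0, 0, 1, 3, 100]

arithmetic = sum(edit_counts) / len(edit_counts)
geometric = shifted_geometric_mean(edit_counts)
print(f"arithmetic mean: {arithmetic:.2f}")
print(f"shifted geometric mean: {geometric:.2f}")
```

The single very active user dominates the arithmetic mean, while the geometric mean stays closer to what a typical user does.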

Methodology
We gathered all users who registered on the Spanish Wikipedia between the launch of the campaign on 2022-05-10 and its end on 2022-07-01. We excluded known test accounts, autocreated accounts, and accounts that were registered through the API (those are typically mobile app registrations). Our dataset contains 22,633 accounts. About 20% of the accounts (4,474 users) were randomly assigned to the Control group and were not eligible to receive emails; the remaining 18,159 accounts were in the Treatment group and would receive emails if they consented.

We used MediaWiki history for data on the editing activity of all users, as that is our authoritative source of editing data. Our approach for analyzing activation and retention is similar to that of previous Growth team analyses (e.g. the Add a Link experiment analysis), except we do not have to account for multiple wikis. Our analysis of productivity uses a Bayesian approach and accounts for zero-inflation, meaning it takes into consideration the fact that some users register and are never going to edit.
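
For readers unfamiliar with zero-inflation, the general form of a zero-inflated count model is sketched below. This is the textbook definition, not the specific model fitted here: the count distribution $f$, its parameters $\theta$, and the symbol $\pi$ are placeholders, since the text above does not state which distribution or priors were used.

```latex
\[
P(Y_i = y) =
\begin{cases}
\pi + (1 - \pi)\, f(0 \mid \theta), & y = 0, \\
(1 - \pi)\, f(y \mid \theta), & y = 1, 2, \dots
\end{cases}
\]
```

Here $Y_i$ is the number of edits made by user $i$, $\pi$ is the probability that a user belongs to the group that will never edit, and $f$ is an ordinary count distribution for the remaining users. A zero can therefore arise in two ways: from a user who was never going to edit, or from a user who could have edited but made no edits in the period.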