Growth/Analytics updates/Welcome survey editor activation rate

Summary
When we deployed the "welcome survey" to Czech and Korean Wikipedia, our main concern was that it would lead to fewer newly registered users becoming editors within the first 24 hours after registering (what we call "editor activation"). This concern is described in our experiment plan, and is why we ran the survey as a randomized A/B test over the course of a month: half of new registrations received the survey immediately after creating their account, half did not get the survey at all and returned to the context they were in before creating their account. In this update, we describe the results of that A/B test, answering the question: does having the welcome survey affect editor activation rate? We find that there does not appear to be a statistically significant difference in activation rate between the survey and control groups in either of the two Wikipedias.

In a follow-up test on Arabic Wikipedia in July and August 2019, we also find no statistically significant difference in activation rate between the survey and control groups.

Next, the Growth team will test a more heavily designed version of the survey, Variation C, against the simple Variation A tested in the experiment discussed on this page.

Background
The survey was deployed to Czech and Korean Wikipedias on November 19, 2018, shortly after 19:00 UTC. In this analysis, we use data from deployment until December 25, so as to use whole weeks. While we had one more week of data available at the time of analysis, we discarded it due to a spambot attack affecting registrations on Korean Wikipedia.

In addition to limiting accounts by date of registration, we also apply several other filters:


 * First, the survey is only shown to users who register on the given wiki, so we filter out users who already had accounts on another wiki (also known as "autocreated accounts").
 * Second, we filter out accounts created through Wikipedia's API, as those are mainly accounts created from the Wikipedia Android and iOS apps, and the survey does not run on either of those apps.
 * Lastly, we remove known test accounts created by Growth Team members.
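
The filters above can be sketched as a single predicate over account records. This is a minimal illustration rather than the queries used in the analysis: the `Account` fields, the `eligible` and `filter_accounts` names, and the half-open time window are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Account:
    """Hypothetical account record; field names are illustrative only."""
    user_id: int
    registered: datetime    # registration timestamp (UTC)
    autocreated: bool       # account already existed on another wiki
    created_via_api: bool   # created through the API (mostly app accounts)
    is_test_account: bool   # known Growth team test account


def eligible(account: Account, start: datetime, end: datetime) -> bool:
    """Return True if the account passes the date window and all three filters.

    The window is treated as half-open: start <= registered < end.
    """
    return (
        start <= account.registered < end
        and not account.autocreated
        and not account.created_via_api
        and not account.is_test_account
    )


def filter_accounts(accounts, start: datetime, end: datetime):
    """Keep only the accounts that belong in the analysis dataset."""
    return [a for a in accounts if eligible(a, start, end)]
```

The caller supplies the window boundaries, so the same predicate can be reused for a different wiki or a different deployment period.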

Results
Our dataset contains 1,617 accounts in Czech Wikipedia and 2,140 accounts in Korean Wikipedia. For each of these accounts, we use Wikipedia's edit history to calculate whether the user made at least one edit within 24 hours after registration. We only calculate this for the first 24 hours because our previous analysis revealed that users who become editors tend to make that transition quickly: only about 10% of those who ever make an edit make their first edit later than 24 hours after registration. With data on whether they edited, and whether they were shown the welcome survey or not, we can create 2x2 contingency tables for both wikis. They are shown in Tables 1–4 below. Note: in Tables 2 & 4, proportions are calculated per row. This is to make it easier to compare activation rates for the survey and control groups.
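
As a rough sketch of the calculation described above, the following shows one way to flag activation within 24 hours and tabulate it by group. The function names and the `(group, activated)` input shape are assumptions made for illustration, and `row_proportions` assumes that neither group is empty.

```python
from collections import Counter
from datetime import datetime, timedelta

ACTIVATION_WINDOW = timedelta(hours=24)


def is_activated(registered: datetime, edit_times) -> bool:
    """True if at least one edit falls within 24 hours after registration."""
    deadline = registered + ACTIVATION_WINDOW
    return any(registered <= t < deadline for t in edit_times)


def contingency_table(rows):
    """Build a 2x2 table from (group, activated) pairs.

    Returns {group: (n_activated, n_not_activated)} for both groups.
    """
    counts = Counter(rows)
    return {
        group: (counts[(group, True)], counts[(group, False)])
        for group in ("control", "survey")
    }


def row_proportions(table):
    """Convert counts to per-row proportions so group rates can be compared."""
    return {
        group: (yes / (yes + no), no / (yes + no))
        for group, (yes, no) in table.items()
    }
```

Calculating proportions per row means the first value for each group is that group's activation rate.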

The proportions shown in Tables 2 & 4 suggest conflicting trends between the survey and control groups in the two wikis. In Czech, the survey group has a slightly higher activation proportion (by 3.3pp), while in Korean it is slightly lower (by 3.6pp). But is either of these differences statistically significant?

The answer is "no". Using a two-sample test of equality in proportions, we find that neither the difference in Czech ($$\chi^2=1.61, df=1, p=0.20$$) nor the difference in Korean ($$\chi^2=2.77, df=1, p=0.095$$) is statistically significant.
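
For readers who want to reproduce this kind of test, here is a minimal standard-library sketch of a chi-squared test of equal proportions on a 2x2 table. It is not the code used for this analysis. The function names are illustrative, and the optional Yates continuity correction is included only because some tools (for example R's `prop.test`) apply it by default. Whether the original analysis applied it is not stated. The sketch assumes every expected cell count is greater than zero.

```python
import math


def chi2_p_value_df1(chi2: float) -> float:
    """Upper-tail p-value for a chi-squared statistic with 1 degree of freedom.

    With df=1 the statistic is a squared standard normal, so
    P(X > chi2) = erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


def two_proportion_chi2(a1: int, n1: int, a2: int, n2: int, yates: bool = True):
    """Test equality of two proportions: a1 of n1 versus a2 of n2 activated.

    Returns (chi-squared statistic, p-value) with 1 degree of freedom.
    """
    observed = [[a1, n1 - a1], [a2, n2 - a2]]
    row_totals = [n1, n2]
    col_totals = [a1 + a2, (n1 - a1) + (n2 - a2)]
    total = n1 + n2

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            diff = abs(observed[i][j] - expected)
            if yates:
                # Continuity correction: shrink each deviation by up to 0.5.
                diff = max(0.0, diff - 0.5)
            chi2 += diff * diff / expected
    return chi2, chi2_p_value_df1(chi2)


# Converting the statistics reported above into p-values.
for wiki, stat in (("Czech", 1.61), ("Korean", 2.77)):
    print(wiki, round(chi2_p_value_df1(stat), 3))
```

The p-values recovered this way agree with the reported ones to within rounding. Small differences are expected because the reported statistics are themselves rounded to two decimals.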

Addendum: Arabic Wikipedia
We deployed the Welcome Survey to Arabic Wikipedia on July 15, 2019, using the same A/B test configuration we used for Czech and Korean. Users who signed up on Arabic Wikipedia were randomly assigned to either a survey or control group with 50% probability. In our analysis of the results we used five whole weeks of data, meaning all registrations up until August 26, 2019. As we did for Czech and Korean, we remove auto-created accounts from our dataset because the survey is only shown to local registrations. We also remove accounts created through the API (these are primarily app accounts), and known test accounts.

Our dataset contains 9,524 accounts, of which 4,808 (50.5%) were in the control group, and 4,716 (49.5%) were in the survey group. As we did for Czech and Korean Wikipedia, we use the MediaWiki databases to count how many edits these users made in the first 24 hours after registration. From this data we can then create 2x2 contingency tables, seen in Tables 5 & 6 below. Note: in Table 5, proportions are calculated per row. This is to make it easier to compare activation rates for the survey and control groups.

There is a small difference in activation between the control and the survey group, but this difference is not statistically significant ($$\chi^2=0.47, df=1, p=0.50$$). Therefore, we find no evidence that the welcome survey negatively affects whether newcomers stay on the wiki and make contributions.