Growth/Analytics updates/Help Panel experiment plan/en

The goal of the help panel is to provide users with easier ways to get help without them having to leave the editing context. This can then enable them to complete their tasks while potentially also meeting the Wikipedia community through questions and answers at the community’s help desk. Helping users complete their tasks can lead to an increase in editor activation (the proportion of new users who make edits), and potentially also editor retention (the proportion of new users who return to edit a second time), the latter being the Growth Team's overarching goal.

To understand the help panel's impact on editor activation and retention, we propose a six-month A/B test. During that test, 50% of new registrations on target wikis will have the help panel enabled by default, and 50% will have it disabled. We are likely to be running other experiments on the target wikis at the same time, for example testing variants of our welcome survey. If those experiments require stratified sampling, we will make sure our sampling strategies are modified accordingly.
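As an illustration of the 50/50 split, the sketch below assigns each new registration to a group by hashing the user ID together with an experiment name. This is only one possible approach and is not necessarily how the deployed software assigns users; the function name assign_group and the experiment labels "help-panel" and "welcome-survey" are hypothetical. Using a different label per experiment keeps assignments in concurrent experiments approximately independent of each other, which is one simple way to avoid one experiment's groups lining up with another's.

```python
import hashlib


def assign_group(user_id: int, experiment: str = "help-panel") -> str:
    """Deterministically place a user in 'treatment' or 'control' (about 50/50).

    The experiment name acts as a salt, so the same user can land in
    different groups for different experiments.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and use its parity as the bucket.
    bucket = int.from_bytes(digest[:8], "big") % 2
    return "treatment" if bucket == 0 else "control"


if __name__ == "__main__":
    # Hypothetical user IDs, used only to show the resulting split.
    user_ids = range(10_000)
    help_panel = [assign_group(u, "help-panel") for u in user_ids]
    share = help_panel.count("treatment") / len(help_panel)
    print(f"share of users with the help panel enabled: {share:.3f}")
```

Because the assignment depends only on the user ID and the experiment name, it can be recomputed at analysis time without storing a separate lookup table.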

For more information on the questions we intend to pursue with the help panel, see this section on the project page. For more information on exactly what data we will be recording, see this EventLogging schema.

Variants
During those six months we also envision testing variants of the help panel to understand how specific interface elements affect behaviors inside the workflow of seeking help. We will keep in mind that the stronger our hypothesis that an altered interface will affect activation or retention, the more a test of that interface confounds our longer-term experiment on activation and retention.

We will need to prioritize which of the variants to test first, because we will only test one at a time. We will also not run any of the variants until about a month of the larger experiment has run without any of these smaller ones nested inside. This is so that we can learn clearly whether the help panel itself has an effect on new user activation rate.

Leading indicators and plans of action
The duration of the A/B test is six months because it is impossible to detect changes to new editor retention on mid-size wikis in less time than that (unless we drastically impact retention, but we see that as somewhat unlikely). While we wait for our results we want to be able to take action if we suspect that something is amiss. Below, we sketch out a set of scenarios based on the data described in the instrumentation strategy above. For each scenario, we propose a plan of action to take to remedy the situation.
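To illustrate why the test duration depends on the size of the effect, the sketch below uses the standard normal-approximation formula for comparing two proportions to estimate how many registrations each group would need. The retention rates passed in are purely hypothetical placeholders, not measurements from the target wikis, and the function name sample_size_per_group is our own.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p_control: float, p_treatment: float,
                          alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate registrations needed per group for a two-sided test
    of two proportions (normal approximation, equal group sizes)."""
    if p_control == p_treatment:
        raise ValueError("the two rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


if __name__ == "__main__":
    # Hypothetical rates: a small lift versus a drastic one.
    print(sample_size_per_group(0.05, 0.06))
    print(sample_size_per_group(0.05, 0.10))
```

Dividing the required group size by the number of registrations a wiki receives per month gives a rough estimate of how long the experiment has to run. A drastic change in retention needs far fewer registrations than a small one.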

Status of leading indicators after one month
The Help Panel was deployed to Czech and Korean Wikipedias on January 11, 2019. One month later we gathered data for all registrations on both wikis up until that point so that we could evaluate our leading indicators and determine whether any of the feature's behavior was concerning. In short, while the evaluation exposes some areas for improvement, we think the help panel's behavior so far is healthy and that it is not having a negative impact on the wikis.

Known test accounts were removed, as were users who turned the Help Panel on or off in their preferences because self-selection into or out of the treatment group violates the equal expectation resulting from random assignment to groups. However, as we will see below, very few users changed their preferences.
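The exclusion rules can be sketched as a simple filter. The Registration record and its field names below are hypothetical and do not reflect the actual EventLogging schema; they only show the shape of the step.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Registration:
    user_id: int
    group: str                 # "treatment" or "control", as randomly assigned
    is_test_account: bool      # known test account
    changed_preference: bool   # user turned the help panel on or off themselves


def analysis_population(registrations: Iterable[Registration]) -> List[Registration]:
    """Keep only users whose group still reflects the random assignment."""
    return [
        r for r in registrations
        if not r.is_test_account and not r.changed_preference
    ]


if __name__ == "__main__":
    # Made-up records, used only to show the filter in action.
    sample = [
        Registration(1, "treatment", False, False),
        Registration(2, "control", True, False),
        Registration(3, "control", False, True),
    ]
    print([r.user_id for r in analysis_population(sample)])
```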

There are four thresholds in the table above that are cause for concern, and the list below explains how we're thinking about them.


 * Not opening the help panel: for both wikis, the number is somewhat, but not alarmingly, higher than the threshold. We read this as healthy open rates that have room for improvement. In that vein, we have started to display the help panel in more places, so that users have more opportunities to open it. The help panel is now being displayed in reading mode in the Help, Wikipedia, and User namespaces. This work was tracked in T215664 and completed by March 6, 2019.
 * Not clicking help links: since the analysis showed that three Czech links were getting low traffic, we removed the link to more information about notability and replaced it with a link about how to add an image, as the latter was the most frequent question posted to the Help Desk and the most frequently clicked link in Korean. We also relabelled the link to the guide as "Quick tutorial", because that is the label used on Korean Wikipedia, where it is the second most frequently clicked link. This work was tracked in T217391 and completed by March 6, 2019.
 * Not asking questions: in retrospect, our expectation that 75% of users who open the help panel (and don't click links) would ask a question was likely too ambitious. We feel comfortable with the rate of questions being asked in Czech, and we are learning that perhaps the paradigm of asking public questions is not a great fit for Korean Wikipedia, given the low rate of questions there. This is one of the reasons we added "search" to the help panel, so that users would have different ways to find help that might fit their own preferences of how to find it. This work was tracked in T209301 and completed by February 25, 2019.
 * Starting the question path and not completing it: the absolute numbers for this metric are still low enough that it is hard to say with confidence whether we are notably higher than the threshold. At the time of this analysis, only about 25 people had attempted asking a question. We will therefore revisit this indicator at a later stage.
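To illustrate why a sample of about 25 people is too small to compare against a threshold with confidence, the sketch below computes a Wilson score interval for a proportion. The sample size of 25 comes from the analysis above, but the count of 10 users who abandoned the question path is a hypothetical placeholder, not a measured value.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def wilson_interval(successes: int, n: int,
                    confidence: float = 0.95) -> Tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # 25 attempts is from the analysis; 10 abandonments is hypothetical.
    low, high = wilson_interval(10, 25)
    print(f"95% interval: {low:.2f} to {high:.2f}")
```

With so few attempts the interval spans more than 30 percentage points, so the observed rate could sit well above or well below a threshold anywhere in that range.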