User:MPopov (WMF)/Android/Notifications

Introduction

In October 2018, the Android team released an update to the Wikipedia app which added support for Notifications as part of the annual plan to improve in-app editing. Starting with version 2.7.262, users could interact with welcome, milestone, and thanks notification types inside the app and in the Android OS:

Screenshots demonstrating the Notifications feature on Android

Impact Goals

With this release we were targeting:

  • 10% increase in 7-day retention rate
    • a new editor is 7-day-retained if they make an edit (either an article edit or a Wikidata description edit) in the first week after creating their account and then edit again in the second week
  • 10% increase in edit rate (edits per user)
    • the total number of article edits and Wikidata description edits made specifically with the Wikipedia Android app in the user's first 30 days after creating their account

Our hypothesis was that new Android editors who receive notifications on their device, welcoming them to Wikipedia and congratulating them on reaching editing milestones, would be encouraged to edit more. Among the 13 targeted languages (cf. Data Collection section), we observed the following:

  • The average 30-day edit rate increased by 92% (from an average of 2.69 edits/user to an average of 5.17 edits/user)
  • The average 7-day retention rate increased by 69.7% (from an average of 6.1% to an average of 10.3%)

While the two metrics changed differently from wiki to wiki (refer to Results section for details), the observed overall change appears to be positive. As mentioned in the Discussion section, this is purely correlational: because the release was not a randomized controlled trial, we cannot draw conclusions about any causal relationship between the release of notifications on Android and the changes observed among new editors.

Data Collection

Using a combination of edit history in the Data Lake, EventLogging-based tracking of interactions with notifications, and data related to Echo notifications, we were able to see how engagement with notifications correlates with editing activity and new editor retention. We focused specifically on article edits made on the following Wikipedias:

  • Arabic
  • Bangla
  • Chinese
  • Finnish
  • Hebrew
  • Hindi
  • Italian
  • Marathi
  • Persian
  • Portuguese
  • Russian
  • Swedish
  • Tamil

and Wikidata description edits made in those languages. Refer to Measuring Impact for more information on how these target wikis were chosen.

Results

Visual comparison of edit & retention rates before and after the Notifications (and Nav Update) release at the end of Oct 2018. The thick black line is the average across languages.

We compared edit rate and retention rate among new Android editors between July 2018 and February 2019. Both Wikipedia article edits and Wikidata description edits counted towards these two metrics. Since we relied on the February 2019 snapshot of editing history in the Data Lake, we applied the following exclusions so that every included user had a complete observation window:

  • For the edit rate – since it relies on a 30-day window after the user's registration – we excluded users who registered their account after 29 January 2019.
  • For the retention rate – since it relies on a 14-day window after the user's registration – we excluded users who registered their account after 14 February 2019.
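The two metrics and their exclusion rules can be sketched in a few lines of Python. This is only an illustration of the definitions given on this page, not the queries actually run against the Data Lake. The function names and the toy data are hypothetical, and two denominators are assumptions: edit rate is taken as edits per user in the cohort, and retention rate as the share of first-week editors who edit again in the second week.

```python
from datetime import date, datetime, timedelta

# Registration cutoffs quoted in the list above.
EDIT_RATE_CUTOFF = date(2019, 1, 29)
RETENTION_CUTOFF = date(2019, 2, 14)


def edits_in_first_30_days(registered, edit_times):
    """Count edits made in the 30 days after registration."""
    end = registered + timedelta(days=30)
    return sum(1 for t in edit_times if registered <= t < end)


def edited_in_window(registered, edit_times, start_day, end_day):
    """True if any edit falls in [registered + start_day, registered + end_day)."""
    start = registered + timedelta(days=start_day)
    end = registered + timedelta(days=end_day)
    return any(start <= t < end for t in edit_times)


def summarize(users):
    """users: iterable of (registration datetime, list of edit datetimes)."""
    users = list(users)

    # Edit rate: keep only users whose 30-day window is fully observed.
    edit_cohort = [(r, e) for r, e in users if r.date() <= EDIT_RATE_CUTOFF]
    total_edits = sum(edits_in_first_30_days(r, e) for r, e in edit_cohort)
    edit_rate = total_edits / len(edit_cohort) if edit_cohort else None

    # Retention: users with a fully observed 14-day window who edited in week 1
    # (assumed denominator); retained if they edit again in week 2.
    week1 = [(r, e) for r, e in users
             if r.date() <= RETENTION_CUTOFF and edited_in_window(r, e, 0, 7)]
    retained = sum(1 for r, e in week1 if edited_in_window(r, e, 7, 14))
    retention_rate = retained / len(week1) if week1 else None

    return {"edit_rate": edit_rate, "retention_rate": retention_rate}


# Toy data, invented for illustration only.
toy_users = [
    (datetime(2018, 11, 1), [datetime(2018, 11, 2), datetime(2018, 11, 9),
                             datetime(2018, 11, 20)]),
    (datetime(2018, 11, 5), [datetime(2018, 11, 6)]),
    (datetime(2019, 2, 10), [datetime(2019, 2, 11), datetime(2019, 2, 19)]),
    (datetime(2019, 2, 20), [datetime(2019, 2, 21)]),
]
result = summarize(toy_users)
```

In the toy data, the user who registered on 10 February counts towards retention but not towards edit rate, and the user who registered on 20 February is excluded from both.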

The before/after values are presented in the table below and the figure above. Among these target languages:

  • The average 30-day edit rate increased by 92% (from an average of 2.69 edits/user to an average of 5.17 edits/user)
  • The average 7-day retention rate increased by 69.7% (from an average of 6.1% to an average of 10.3%)

Comparison of edit & retention rates before and after the Notifications (and Nav Update) release at the end of Oct 2018

| Language | 30-day edit rate: Before | 30-day edit rate: After | 30-day edit rate: % change | 7-day retention rate: Before | 7-day retention rate: After | 7-day retention rate: % change |
|---|---|---|---|---|---|---|
| Arabic | 6.92 | 14.24 | +106% | 8.62% | 14.29% | +69.3% |
| Bangla | 2.5 | 2.91 | +16.8% | 1.96% | 5.00% | +136% |
| Chinese | 1.72 | 4.29 | +149% | 4.00% | 11.40% | +168% |
| English | 2.54 | 4.82 | +89.8% | 9.57% | 6.80% | -16.7% |
| Finnish | 1.79 | 2.31 | +29.4% | 0.00% | 12.50% | n/a |
| German | 2.82 | 3.35 | +19.4% | 9.09% | 11.54% | +34.2% |
| Hebrew | 2.85 | 2.58 | -9.12% | 21.05% | 5.88% | -64.3% |
| Hindi | 2 | 2.04 | +2.49% | 7.14% | 3.85% | -28.2% |
| Italian | 3.78 | 3.75 | -0.601% | 16.67% | 10.53% | -29.1% |
| Marathi | 2.23 | 4.94 | +121% | 0.00% | 0.00% | n/a |
| Persian | 2.63 | 3.81 | +45.4% | 7.69% | 12.28% | +64.3% |
| Portuguese | 3.96 | 9.68 | +144% | 9.09% | 12.50% | +43.7% |
| Russian | 2.06 | 4.96 | +140% | 2.70% | 11.22% | +257% |
| Swedish | 0.91 | 5.78 | +531% | 0.00% | 25.00% | n/a |
| Tamil | 1.67 | 5.5 | +229% | 0.00% | 9.52% | n/a |

English and German included for reference
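The "% change" columns are relative changes, which is why no value is reported where the "before" rate is 0.00%: the change is undefined when the baseline is zero. A minimal sketch of that calculation follows; the function name is hypothetical. The reported percentages were presumably computed from unrounded values, so recomputing them from the rounded table entries can differ slightly.

```python
def percent_change(before, after):
    """Relative change from `before` to `after` in percent.

    Returns None when the baseline is zero, because the ratio is undefined.
    """
    if before == 0:
        return None
    return (after - before) / before * 100.0


# Overall 30-day edit rate quoted above: 2.69 -> 5.17 edits/user (about +92%).
overall_edit_rate_change = percent_change(2.69, 5.17)

# A retention rate going from 0.00% to 12.50% has no defined percent change.
zero_baseline_change = percent_change(0.0, 12.5)
```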

As an interesting extra result, we also found that there were editors who were interacting with very old notifications, including ones from 2013.

Discussion

Funnel of Android users becoming new editors through the Wikipedia Android app

Because we did not roll this feature out as a randomized controlled trial, it is hard to infer causal relationships between notifications and edit & retention rates. We can compare the observed editing activity and retention rates before and after the release, but we cannot attribute the changes to notifications with certainty. Even comparing editing activity between users who interacted with notifications and those who did not would not let us infer a causal relationship. If X is notification usage and Y is a metric, there could be an unobserved confounding variable Z (e.g. the user's predisposition to be an active contributor) that affects both X and Y.
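The confounding argument can be illustrated with a small simulation. In the sketch below, every probability is invented for illustration, and Y depends only on Z, never on X. Users with X still show a much higher rate of Y in a naive comparison, and the gap largely disappears once users are compared within the same level of Z.

```python
import random


def simulate(n=200_000, seed=42):
    """Simulate (z, x, y) triples where Z drives both X and Y, and X has no effect on Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.3                    # Z: predisposed to contribute (made-up 30%)
        x = rng.random() < (0.6 if z else 0.1)    # X: uses notifications, depends on Z only
        y = rng.random() < (0.4 if z else 0.05)   # Y: retained, depends on Z only
        rows.append((z, x, y))
    return rows


def rate_of_y(rows, keep):
    """Share of rows with Y among the rows selected by keep(z, x)."""
    selected = [y for z, x, y in rows if keep(z, x)]
    return sum(selected) / len(selected)


rows = simulate()

# Naive comparison: notification users vs. everyone else.
naive_gap = rate_of_y(rows, lambda z, x: x) - rate_of_y(rows, lambda z, x: not x)

# Comparison within each level of Z: the apparent effect of X mostly vanishes.
gap_predisposed = (rate_of_y(rows, lambda z, x: z and x)
                   - rate_of_y(rows, lambda z, x: z and not x))
gap_not_predisposed = (rate_of_y(rows, lambda z, x: (not z) and x)
                       - rate_of_y(rows, lambda z, x: (not z) and not x))
```

In real data Z is unobserved, so the within-Z comparison is not available. That is the reason randomized assignment of X is needed.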

For example, people who use their phone more and/or have a prior history of editing wikis may be more likely both to become active contributors and to use notifications. That said, a controlled experiment would not have been easy either: given the rate of new Android editors on these wikis and the low engagement with notifications, any experiment would need to run for a very long time before we had enough data to perform hypothesis testing with adequate power (see funnel visualization). Furthermore, the release of notifications was combined with the release of the navigation update, so it is difficult to attribute the changes to one or the other.
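To give a rough sense of scale for the power concern, the sketch below applies the standard normal-approximation sample-size formula for comparing two proportions. It is not a calculation from the original analysis. The baseline (6.1%) and the targeted 10% relative increase come from this page; the significance level (0.05, two-sided), the power (0.8), and the function name are assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate users needed per group to detect p1 vs. p2 (two-sided z-test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_power * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


baseline = 0.061            # average 7-day retention before the release
target = baseline * 1.10    # the targeted 10% relative increase

n_for_target = sample_size_per_group(baseline, target)

# For comparison: the much larger change that was actually observed (to 10.3%).
n_for_observed = sample_size_per_group(baseline, 0.103)
```

Under these assumptions, detecting the targeted 10% relative increase requires tens of thousands of new editors per group, while a change as large as the one observed would need far fewer.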

The recommendation going forward is that we should strive to roll out new features as A/B tests in order to reliably assess their potential impact.
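One common way to implement such a rollout is deterministic, hash-based bucketing, so that a user always lands in the same group without any stored assignment. The sketch below is a generic illustration and not how the Wikipedia Android app assigns users. The experiment name, function name, and 50/50 split are hypothetical.

```python
import hashlib


def assign_group(user_id, experiment="android-notifications", treatment_share=0.5):
    """Deterministically assign a user to 'treatment' or 'control'."""
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Map the first 32 bits of the hash to a value in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "treatment" if bucket < treatment_share else "control"


# Example: assign 10,000 hypothetical user IDs and count the treatment group.
groups = [assign_group(user_id) for user_id in range(10_000)]
treatment_count = groups.count("treatment")
```

Including the experiment name in the hash input keeps assignments independent across experiments.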