Wikimedia Apps/Team/iOS/Analysis

From mediawiki.org

Data Sources

Webrequest

The webrequest datasets contain logs of all the hits to the WMF's servers (specifically, the Varnish servers). This includes requests for page HTML, images, CSS, and JavaScript, as well as requests to the API. For privacy reasons, this data is purged after 90 days. We use the webrequest table and its aggregated tables for app analysis. Specifically:

  • mobile apps uniques: Counts how many distinct Android and iOS Wikipedia app installs accessed Wikimedia sites during a given day or month, determined as the number of app UUIDs appearing in the webrequest table (via the X-Analytics header or, formerly, the query part of the URL).
  • mobile apps session metrics: Contains aggregate stats about pageview sessions on the Android and iOS Wikipedia mobile apps, updated weekly. Please note the caveats of this dataset, as the metrics may not be what you expect.
  • Pageview_hourly
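
As a rough illustration of the "mobile apps uniques" idea, the sketch below counts distinct app install IDs per day from a small in-memory list. The record layout, the IDs, and the function name `daily_uniques` are hypothetical; the real dataset is computed from the webrequest table, not from Python lists.

```python
from collections import defaultdict

def daily_uniques(requests):
    """Count distinct app install IDs per day.

    `requests` is an iterable of (day, app_install_id) pairs.
    This layout is an assumption made for the example only.
    """
    ids_by_day = defaultdict(set)
    for day, app_install_id in requests:
        if app_install_id is None:
            # Requests without an install ID cannot be attributed to a device.
            continue
        ids_by_day[day].add(app_install_id)
    return {day: len(ids) for day, ids in ids_by_day.items()}

# Hypothetical sample records, for illustration only.
sample = [
    ("2018-05-01", "uuid-a"),
    ("2018-05-01", "uuid-a"),  # same install, second request
    ("2018-05-01", "uuid-b"),
    ("2018-05-01", None),      # request carrying no install ID
    ("2018-05-02", "uuid-a"),
]
uniques = daily_uniques(sample)
```

Requests without an install ID are skipped here, which is why a count of this kind can only reflect installs that actually send the ID.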

Please note that starting from v5.0, webrequest data doesn't contain a user's appInstallID if they choose not to send usage reports to the Wikipedia iOS app[1], regardless of whether they agree to send analytics data to Apple at the system level.[2] This also means that metrics based on this data may be biased towards app versions older than 5.0. Please filter accordingly.

Since the iOS app uses the API for most transactions, it also makes sense to use the ApiAction data, which is logged directly from MediaWiki when it responds to API requests. The benefits of that data include:[3]

  • It includes internal API requests not routed through the Varnish servers.
  • Data on API requests is not mixed in with data on regular web requests.
  • It contains detailed data on the content of POST API requests (they are logged in webrequest, but without the request bodies).

Event logging schema

We use EventLogging to collect metrics about how users interact with the app. For privacy reasons, event logging data is purged after 90 days unless it is whitelisted.[4] Starting from v5.0, we only collect data from users who agree to share their usage reports with the Wikipedia app. This means that metrics based on event logging data may be biased towards app versions older than 5.0. Please filter accordingly.

Currently, the Wikipedia iOS app is sending data to the following event logging schemas: MobileWikiAppDailyStats, MobileWikiAppCreateAccount, MobileWikiAppEdit, MobileWikiAppProtectedEditAttempt, MobileWikiAppToCInteraction, MobileWikiAppSearch, MobileWikiAppShareAFact, MobileWikiAppSavedPages, MobileWikiAppLogin. Some schemas are not used in the current version: MobileWikiAppNavMenu, MobileWikiAppArticleSuggestions. Please note that the schema revisions iOS is using are different from those Android is using. Check the schema discussion pages and the results of the audit for more details.

From v5.8.2, we started to implement new event logging schemas. See T192819 for more information.

Sampling: Starting from v5.8.1, all events in the iOS app are sampled at 100%.

Related bugs:

iTunes Connect and App Annie

iTunes Connect provides various metrics to measure the app's performance in the App Store, such as impressions, product page views and app units. It also provides some usage metrics from users who agree to share their diagnostics and usage information with Apple, such as sessions, active devices and retention. Within iTunes Connect, App Analytics displays data from devices running iOS 8 or tvOS 9, or later, and only when a certain number of data points are available; Sales and Trends data is recorded when a customer initiates a transaction on the App Store. The units reported by App Analytics and by Sales and Trends differ for the following reasons[5]:

  • App Analytics only displays data from devices running iOS 8 or tvOS 9, or later. Sales and Trends displays all sales data from devices running iOS, tvOS, and macOS.
  • App Analytics data is based on Coordinated Universal Time (UTC). By default, Sales and Trends data is shown in Coordinated Universal Time (UTC), but users can change the time zone to Pacific Standard Time (PST).

Links to documentation of iTunes Connect:

App Annie is a third-party app market data provider. It is integrated with iTunes Connect and provides essentially the same metrics as iTunes Connect and the App Store, but with some estimation and better visualization. We've seen some discrepancies between the numbers provided by iTunes Connect and App Annie. Some can be explained by the fact that App Annie's data is based on Pacific Time[6].

Piwik

Matomo, formerly Piwik, is a free and open source web analytics application that runs on a PHP/MySQL webserver. The Wikipedia iOS app has used Piwik to collect user interaction events since 2015. However, our Piwik server cannot handle the amount of traffic from the iOS Wikipedia app and has failed several times[7]. We are planning to stop using Piwik in the future.

Link to the dashboard: https://piwik.wikimedia.org/index.php?module=CoreHome&action=index&idSite=3&period=day&date=yesterday

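Please note that each event has only a 1-in-10 chance of actually being sent to the Piwik server. Because sampling happens at the event level rather than the user level, a user's sequence of events is rarely captured in full, which prevents us from analyzing user conversion.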
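
To see why event-level sampling hurts conversion analysis, consider a funnel that needs several events from the same user. The sketch below assumes each event is sampled independently at the 1:10 rate; independence is an assumption made for illustration, not something this page confirms about the app's implementation.

```python
SAMPLING_RATE = 0.1  # the 1:10 rate quoted in the text

def prob_all_events_kept(num_events, rate=SAMPLING_RATE):
    """Probability that all `num_events` events of one user are sent,
    assuming every event is sampled independently (an assumption)."""
    return rate ** num_events

# Chance of observing a complete funnel of 1, 2 or 3 steps for one user.
for steps in (1, 2, 3):
    print(steps, prob_all_events_kept(steps))
```

Under that assumption, a complete two-step funnel would be observed for roughly 1 user in 100, and a three-step funnel for roughly 1 in 1000.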

Others

  • The Reading Infrastructure team set up several reading lists tables in the wikishared database (schema).

Metrics

Among all the metrics listed below, the following are the core metrics for the iOS Wikipedia app that WMF has been tracking on a regular basis since around 2015, reported in the Readers team's quarterly metrics meetings and (partially) in the Audiences department's monthly readers metrics update: daily pageviews, daily installs (aka app units), 7-day retention, ratings, pageviews per session, session length and sessions per user.

Browse and Install

| Metric | Definition | Source | Dimension[8] | Notes |
| Impression | The number of times (or the number of unique devices) the app was viewed in the Featured, Categories, Top Charts, and Search sections of the App Store. Includes Product Page Views. Based on devices running iOS 8 or tvOS 9, or later. | iTunes Connect | device, platform version, region, territory, source type, app referrer, web referrer, campaign | |
| Product Page Views | The number of times (or the number of unique devices) the app’s product page has been viewed on a device using iOS 8 or tvOS 9, or later. Includes views on the App Store and within apps that use the StoreKit API to load our app's product page. | iTunes Connect; App Annie | device, platform version, region, territory, source type, app referrer, web referrer, campaign | It is called store views on App Annie. App Annie's data is delayed and not the same as iTunes Connect, but the discrepancy is not very big. |
| App units (aka Downloads or Installs) | The number of first-time purchases of the app made on the App Store on devices running iOS 8 or tvOS 9, or later. The number is based on Apple ID. For the number of units including earlier platform versions and desktop, check iTunes Connect "Sales and Trends". | iTunes Connect; App Annie | device, platform version, region, territory, source type, app referrer, web referrer, campaign | As mentioned before, the units reported by iTunes Connect "Sales and Trends" differ from those in "App Analytics" because of platform and time zone differences.[5] When reporting download numbers, we should treat downloads from desktop carefully, since they are most likely the result of the Volume Purchase Program or other unusual behaviors[9] and do not reveal the "real" number of downloads. |

About App Annie's data:

- The units reported by iTunes Connect "Sales and Trends" (when the time zone is set to PST) are equal to the sum of App Annie's downloads and promotions under the "Downloads" tab.

- The downloads reported by App Annie under the "Store Views" tab are equal to the units reported by iTunes Connect "Sales and Trends" (when the time zone is set to PST) excluding desktop units.

- The promotions under the "Downloads" tab are downloads with a promo code, most likely from the Volume Purchase Program.[9]

| Updates | The number of users who have the app installed, and have installed an updated version. | iTunes Connect; App Annie | device, territory, app version | The number of updates reported by iTunes Connect "Sales and Trends" (when the time zone is set to PST) is equal to App Annie's updates under the "Downloads" tab. |
| Re-downloads | The number of times users re-download the app. | iTunes Connect | device, territory, app version | |

Useful ratio:

  • Conversion Rate: The ratio of app units to impressions or product page views. Depending on your purpose, either the count of views or the count of unique devices can be the denominator.
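
A minimal sketch of the conversion rate calculation with both possible denominators. All figures are made up for illustration, and `conversion_rate` is a hypothetical helper, not part of any reporting tool mentioned on this page.

```python
def conversion_rate(app_units, views):
    """Return app units divided by views, or None when there were no views."""
    if views == 0:
        return None
    return app_units / views

# Hypothetical daily totals, for illustration only.
app_units = 1_200
impressions = 40_000
product_page_views = 6_000

rate_from_impressions = conversion_rate(app_units, impressions)
rate_from_page_views = conversion_rate(app_units, product_page_views)
```

The two rates answer different questions, so a report should state which denominator was used.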

Usage

| Metric | Definition | Source | Dimension[8] | Notes |
| Opt-In Rate | The proportion of users of Wikipedia app who have agreed to share their diagnostics and usage information with Apple. Based on devices running iOS 8 or tvOS 9, or later. | iTunes Connect | | Only 30 days' rolling average is available. |
| Installations | The total number of times the app has been installed on devices with iOS 8 or tvOS 9, or later. Re-downloads on the same device, downloads to multiple devices sharing the same Apple ID, and Family Sharing installations are included. | iTunes Connect | app version, device, platform version, region, territory, source type, app referrer, web referrer, campaign | Based on users who agree to share their diagnostics and usage information with Apple. |
| Sessions | iTunes: The number of times the app has been used for at least two seconds on devices with iOS 8 or tvOS 9, or later. If the app is in the background and is later used again, that counts as another session. | iTunes Connect; App Annie | app version, device, platform version, region, territory, source type, app referrer, web referrer, campaign | Based on users who agree to share their diagnostics and usage information with Apple. App Annie extrapolates this metric with opt-in rate. |

Mobile apps session metrics also counts the number of sessions per user. However, this count is based on pageview timestamps and doesn't take other actions in the app into account. See the code for how it is calculated.

| Daily Active Devices | iTunes: The number of devices with at least one session during the day. Based on devices running iOS 8 or tvOS 9, or later. Based on users who agree to share their diagnostics and usage information with Apple. App Annie extrapolates this metric with opt-in rate. | iTunes Connect; App Annie | app version, device, platform version, region, territory, source type, app referrer, web referrer, campaign | Before March 2016, we reported this metric with mobile_apps_uniques_daily. This is no longer valid since the release of v5.0 (see T130432). |

Counting unique IDs in MobileWikiAppDailyStats can also tell us the daily active devices. However, this number seems to be too small (see T193917).

| 30 Day Active Devices | iTunes: The number of active devices with at least one session during the previous 30 days. Based on devices running iOS 8 or tvOS 9, or later. Based on users who agree to share their diagnostics and usage information with Apple. App Annie extrapolates this metric with opt-in rate. | iTunes Connect; App Annie | app version, device, platform version, region, territory, source type, app referrer, web referrer, campaign | As with daily active devices, we can also get this number from mobile_apps_uniques_monthly and MobileWikiAppDailyStats after the related issues are fixed (T193917). |
| Crashes | The total number of crashes on devices running iOS 8 or tvOS 9, or later. Get detailed crash logs and crash reports in Xcode, such as unique totals for each type of crash and how many users experienced it. | iTunes Connect | app version, device, platform version | Based on users who agree to share their diagnostics and usage information with Apple. |
| Retention | The percentage of users that first installed the app on a given day 0 and used it again on day 0+X. For example, if our app was first downloaded by customers on 100 devices on May 1st, and seven days later (on May 8th) 20 devices are still active with at least one session, then retention on May 8th is 20% (or 20 active devices out of 100). | iTunes Connect; or compute it with MobileWikiAppDailyStats | app version, device, region, source type, app referrer, web referrer, campaign | Retention rate on iTunes Connect is based on users who agree to share their diagnostics and usage information with Apple; retention rate computed with MobileWikiAppDailyStats (schema implementation: T126693) is based on users who agree to share their usage data with the Wikipedia app. We are investigating some related bugs (T189356, T189359), so the retention rate computed with MobileWikiAppDailyStats should be treated carefully. |
| Rating | A rating is purely 1 to 5 stars given by the user. A review is where the user gives the app 1 to 5 stars and also leaves a text comment. So all reviews contain a rating, but not all ratings are reviews. | App Store; App Annie | app version, country, ratings | On the Reviews page App Annie only calculates the ratings that were left as part of a review. |
| Daily Pageviews | The number of pageviews from the iOS app during the day. | Pageview_hourly | | There was a bug in the release of v5.3.2 which inflated the number of pageviews. It was fixed in the release of v5.4. See https://phabricator.wikimedia.org/T154735 |
| Session length | Currently, session length is calculated as the difference between the first and last pageview timestamp in a session. | mobile apps session metrics | | Based on users who agree to share their usage data with the Wikipedia app and with at least two pageviews. This session length should be less than or equal to the actual session length on the client side. |
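
The retention definition in the table can be sketched as follows, reusing its own example (100 devices installed on May 1st, 20 of them active on May 8th). The data structures, device IDs, the year, and the function name `retention` are assumptions made for the example; this is not the query used against MobileWikiAppDailyStats.

```python
from datetime import date, timedelta

def retention(install_dates, active_days, cohort_day, offset_days):
    """Share of devices installed on `cohort_day` that were active
    exactly `offset_days` later.

    install_dates: dict mapping device ID -> install date
    active_days:   dict mapping device ID -> set of dates with >= 1 session
    Both layouts are hypothetical.
    """
    cohort = [d for d, installed in install_dates.items() if installed == cohort_day]
    if not cohort:
        return None  # no installs on that day, retention is undefined
    target = cohort_day + timedelta(days=offset_days)
    retained = sum(1 for d in cohort if target in active_days.get(d, set()))
    return retained / len(cohort)

# The example from the table; the year is arbitrary.
may_1 = date(2018, 5, 1)
install_dates = {f"device-{i}": may_1 for i in range(100)}
active_days = {f"device-{i}": {may_1 + timedelta(days=7)} for i in range(20)}

rate = retention(install_dates, active_days, may_1, 7)  # 20 of 100 devices
```

Note that the denominator is the install cohort, so the result depends on which users are included in it (for example, only those who share usage data).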

Useful ratio:

  • Daily Active Devices / 30 Day Active Devices: The ratio of Daily Active Devices to 30 Day (roughly monthly) Active Devices indicates the engagement level of the active user base. It reveals how much of our monthly active user base checks in on a daily basis.
  • Sessions per Active Device: The ratio of Daily Sessions to Daily Active Devices shows how many sessions an average active device launches each day. We also report sessions per user in mobile apps session metrics. However, that figure is based on users who have at least one pageview (so the number of users counted is smaller than the number reported by mobile_apps_uniques), and a session is defined as a sequence of pageviews from the same app ID with no gap of inactivity exceeding 30 minutes.
  • Pageviews per session: Reported in mobile apps session metrics. This is based on users who have at least one pageview, and a session is defined as a sequence of pageviews from the same app ID with no gap of inactivity exceeding 30 minutes.
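
The pageview-based session definition used above (same app ID, no more than 30 minutes of inactivity) can be sketched like this for a single app ID. It is an illustrative approximation, not the production code; the function name, the sample timestamps, and the choice to treat a gap of exactly 30 minutes as the same session are assumptions.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)

def sessionize(timestamps, gap=SESSION_GAP):
    """Group one app ID's pageview timestamps into sessions.

    A new session starts whenever the gap since the previous
    pageview is longer than `gap`.
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions

# Hypothetical pageviews from a single app ID.
views = [
    datetime(2018, 5, 1, 10, 0),
    datetime(2018, 5, 1, 10, 5),
    datetime(2018, 5, 1, 10, 20),
    datetime(2018, 5, 1, 11, 30),  # 70 minutes later: new session
    datetime(2018, 5, 1, 11, 31),
]
sessions = sessionize(views)

# Session length: last pageview minus first pageview in each session.
session_lengths = [s[-1] - s[0] for s in sessions]
pageviews_per_session = len(views) / len(sessions)
```

Because the length is measured between the first and last pageview, a session with a single pageview has length zero, and time spent reading the last page is never counted.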

See also

Reference