Wikimedia Apps/Team/iOS/Analysis

Webrequest
The webrequest datasets contain logs of all the hits to the WMF's servers (specifically, the Varnish servers). This includes requests for page HTML, images, CSS, and JavaScript, as well as requests to the API. For privacy reasons, this data is purged after 90 days. We use the webrequest table and its aggregated tables for app analysis. Specifically:
 * mobile apps uniques: Counts how many different Android and iOS Wikipedia mobile app installs accessed Wikimedia sites during the given day or month, determined as the number of app uuids appearing in the webrequest table (via the X-Analytics header or, formerly, the query part of the URL).
 * mobile apps session metrics: Contains aggregate stats about pageview sessions on the Android and iOS Wikipedia mobile apps, updated weekly. Please note the caveats of this dataset, as the metrics may not be what you expect.
 * Pageview_hourly
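
As a rough illustration of how the uniques count described above can be derived, the following sketch counts distinct install IDs across a handful of requests. It is not the production job: the header key name wmfuuid, the "key=value;key=value" header layout, and the sample requests are assumptions made for this example; appInstallID is the URL parameter name mentioned on this page.

```python
from urllib.parse import parse_qs


def parse_x_analytics(header):
    """Parse a 'k1=v1;k2=v2' string into a dict (layout assumed for illustration)."""
    pairs = (item.split("=", 1) for item in header.split(";") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def app_install_id(x_analytics, uri_query):
    """Return the install ID from the header, falling back to the URL query."""
    uuid = parse_x_analytics(x_analytics).get("wmfuuid")  # key name is an assumption
    if uuid:
        return uuid
    values = parse_qs(uri_query.lstrip("?")).get("appInstallID")
    return values[0] if values else None


def count_unique_installs(requests):
    """Count distinct install IDs across (x_analytics, uri_query) pairs."""
    ids = {app_install_id(header, query) for header, query in requests}
    ids.discard(None)  # requests without an install ID cannot be counted
    return len(ids)


# Hypothetical sample data, not real log lines.
sample_requests = [
    ("wmfuuid=aaa;pageview=1", ""),
    ("wmfuuid=aaa", ""),
    ("pageview=1", "?action=query&appInstallID=bbb"),
    ("pageview=1", ""),  # no install ID available
]
unique_installs = count_unique_installs(sample_requests)
```

Requests that carry no install ID are dropped from the count, which is why any opt-out from sending the ID lowers the reported uniques.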

Please note that starting from v5.0, webrequest data doesn't contain a user's appInstallID if they choose not to send usage reports to the Wikipedia iOS app, regardless of whether they agree to send analytics data to Apple at the system level. This also means that metrics based on this data may be biased towards app versions older than 5.0. Please filter accordingly.

Since the iOS app uses the API for most transactions, it also makes sense to use the ApiAction data, which is logged directly from MediaWiki when it responds to API requests. The benefits of that data include:


 * Includes internal API requests not routed through the Varnish servers
 * Data on API requests is not mixed in with data on regular web requests.
 * Contains detailed data on the content of POST API requests (they are logged in webrequests, but without the request bodies).

Event logging schema
We use EventLogging to collect metrics about how users interact with the app. For privacy reasons, event logging data is purged after 90 days unless it is whitelisted. Starting from v5.0, we only collect data from users who agree to share their usage reports with the Wikipedia app. This means that metrics based on event logging data may be biased towards app versions older than 5.0. Please filter accordingly.

Currently, the Wikipedia iOS app is sending data to the following event logging schemas: MobileWikiAppDailyStats, MobileWikiAppCreateAccount, MobileWikiAppEdit, MobileWikiAppProtectedEditAttempt, MobileWikiAppToCInteraction, MobileWikiAppSearch, MobileWikiAppShareAFact, MobileWikiAppSavedPages, MobileWikiAppLogin. Some schemas are not used in the current version: MobileWikiAppNavMenu, MobileWikiAppArticleSuggestions. Please note that the schema revisions used by iOS differ from those used by Android. Check the schema discussion pages and the results of the audit for more details.

From v5.8.2, we started to implement new event logging schemas. See T192819 for more information.

Sampling: Starting from v5.8.1, all events on the iOS app are sampled at 100%.

Related bugs:
 * MobileWikiAppDailyStats: T189356, T189359
 * MobileWikiAppSearch: T192520

iTunes Connect and App Annie
iTunes Connect provides various metrics to measure the app's performance in the App Store, such as impressions, product page views and app units. It also provides some usage metrics from users who agree to share their diagnostics and usage information with Apple, such as sessions, active devices and retention. Within iTunes Connect, App Analytics displays data from devices running iOS 8 or tvOS 9, or later, and that data is displayed only when a certain number of data points are available; Sales and Trends data is recorded when a customer initiates a transaction on the App Store. There are some differences in the units reported by App Analytics and Sales and Trends, for the following reasons:
 * App Analytics only displays data from devices running iOS 8 or tvOS 9, or later. Sales and Trends displays all sales data from devices running iOS, tvOS, and macOS.
 * App Analytics data is based on Coordinated Universal Time (UTC). By default, Sales and Trends data is shown in Coordinated Universal Time (UTC), but users can change the time zone to Pacific Standard Time (PST).

Links to iTunes Connect documentation:
 * iTunes Connect - Measure App Performance
 * iTunes Connect - Sales and Trends Guide
 * iTunes Connect - App Analytics Guide

App Annie is a third-party app market data provider. It is integrated with iTunes Connect and provides essentially the same metrics as iTunes Connect and the App Store, but with some estimation and better visualization. We've seen some discrepancies between the numbers provided by iTunes Connect and App Annie. Some of them can be explained by the fact that App Annie's data is based on Pacific Time.

Piwik
Matomo, formerly Piwik, is a free and open source web analytics application that runs on a PHP/MySQL webserver. The Wikipedia iOS app has used Piwik to collect user interaction events since 2015. However, our Piwik server cannot handle the amount of traffic from the iOS Wikipedia app and has failed several times. We are planning to stop using Piwik in the future.

Link to the dashboard: https://piwik.wikimedia.org/index.php?module=CoreHome&action=index&idSite=3&period=day&date=yesterday

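Please note that the probability of an event actually being sent to the Piwik server is 1 in 10. Because sampling happens at the event level rather than the user level, a given user's events are recorded only partially, which prevents us from analyzing user conversion.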
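
To make the sampling caveat above concrete, the sketch below contrasts the two approaches. With independent event-level sampling at 1 in 10, the chance that every step of a multi-step funnel from the same user is recorded shrinks geometrically. A user-level sample, shown here by hashing an install ID, keeps either all or none of a user's events. The hashing scheme and function names are hypothetical and only illustrate the idea; they do not describe how the app or the Piwik server actually samples.

```python
import hashlib

SAMPLE_RATE = 10  # keep 1 in 10, matching the rate quoted above


def all_steps_recorded_probability(steps, keep_probability=1 / SAMPLE_RATE):
    """Chance that all `steps` events of one user survive independent event-level sampling."""
    return keep_probability ** steps


def in_user_sample(install_id, rate=SAMPLE_RATE):
    """Deterministic user-level sampling: the same ID always gets the same answer."""
    digest = hashlib.sha256(install_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % rate == 0


# A two-step funnel (for example, view then save) is fully recorded for only
# about 1% of users under event-level sampling.
two_step_probability = all_steps_recorded_probability(2)

# Under user-level sampling, roughly 1 in 10 hypothetical users is kept,
# and every event of a kept user would be retained.
sampled_users = [f"id-{i}" for i in range(10000) if in_user_sample(f"id-{i}")]
```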

Others

 * The Reading Infrastructure team set up several reading list tables in the wikishared database (schema).

Metrics
Among all the metrics listed below, the following are the core metrics for the iOS Wikipedia app that WMF has been tracking on a regular basis since around 2015, reported in the Readers team's quarterly metrics meetings and (partially) in the Audiences department's monthly readers metrics update: daily pageviews, daily installs (also known as app units), 7-day retention, ratings, pageviews per session, session length and sessions per user.

Browse and Install
Useful ratio:
 * Conversion Rate: The ratio of app units to impressions or product page views. Depending on your purpose, either the count of views or the count of unique devices can be the denominator.
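
A minimal sketch of the conversion rate calculation follows. The function name and all numbers are made up for illustration; the point is only that the result depends on which denominator is chosen.

```python
def conversion_rate(app_units, denominator):
    """Ratio of app units to the chosen denominator (views or unique devices)."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return app_units / denominator


# Hypothetical daily figures, not real App Store data.
app_units = 50
product_page_views = 1000
product_page_unique_devices = 800

rate_by_views = conversion_rate(app_units, product_page_views)
rate_by_devices = conversion_rate(app_units, product_page_unique_devices)
```

Because one device can generate several views, the view-based rate is never higher than the device-based rate for the same period.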

Usage
Useful ratios:
 * Daily Active Devices / 30 Day Active Devices: The ratio of Daily Active Devices to 30 Day (roughly monthly) Active Devices indicates the engagement level of the active user base. This ratio reveals how much of our monthly active user base checks in on a daily basis.
 * Sessions per Active Device: The ratio of Daily Sessions to Daily Active Devices is used to understand the number of sessions an average active device launches each day. We also report sessions per user in mobile apps session metrics. However, that metric is based on users who have at least one pageview (thus the number of users is less than the number reported by mobile_apps_uniques), and a session is defined as a sequence of pageviews from the same app ID that does not exceed 30 minutes of inactivity.
 * Pageviews per session: Reported in mobile apps session metrics. This is based on users who have at least one pageview, and a session is defined as a sequence of pageviews from the same app ID that does not exceed 30 minutes of inactivity.
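
The session definition above can be sketched as follows. This is an illustration of the 30-minute inactivity rule, not the code behind mobile apps session metrics; the function names and sample pageviews are hypothetical, and treating a gap of exactly 30 minutes as the same session is an assumption.

```python
from collections import defaultdict
from datetime import datetime, timedelta

INACTIVITY = timedelta(minutes=30)


def sessionize(pageviews):
    """Group (app_id, timestamp) pageviews into per-app lists of sessions."""
    by_app = defaultdict(list)
    for app_id, timestamp in pageviews:
        by_app[app_id].append(timestamp)

    sessions = {}
    for app_id, stamps in by_app.items():
        stamps.sort()
        current = [stamps[0]]
        app_sessions = [current]
        for timestamp in stamps[1:]:
            # A gap longer than 30 minutes starts a new session (assumption:
            # a gap of exactly 30 minutes continues the current one).
            if timestamp - current[-1] > INACTIVITY:
                current = [timestamp]
                app_sessions.append(current)
            else:
                current.append(timestamp)
        sessions[app_id] = app_sessions
    return sessions


# Hypothetical pageviews for two app IDs.
start = datetime(2018, 5, 1, 9, 0)
views = [
    ("a", start),
    ("a", start + timedelta(minutes=10)),
    ("a", start + timedelta(minutes=50)),  # 40-minute gap: new session
    ("b", start + timedelta(minutes=5)),
]

sessions = sessionize(views)
total_sessions = sum(len(s) for s in sessions.values())
sessions_per_user = total_sessions / len(sessions)
pageviews_per_session = len(views) / total_sessions
```

Only app IDs with at least one pageview appear in the result, which mirrors why the user count here is smaller than the one reported by mobile_apps_uniques.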