Wikimedia Apps/App Analytics

Background
Apps use a combination of API-driven metrics (page views), ad-hoc EventLogging-based user events and ad-hoc Piwik-based usage tracking. Because the EventLogging interface is oriented around tracking web-based events and creating custom queries and dashboards, the apps teams have tried other solutions (Piwik, Appsee) for our specific needs.

Data In
However, there is now general consensus that we should work towards client libraries that sit in front of EventLogging, which serves as the data collection and storage layer. Android is already the closest to this, but a unified definition of how app analytics should work would help both teams work towards a uniform understanding of our users and reduce the complexity of analysis and testing.

Data Out
The other shortcomings identified by the apps teams are on the data modeling and querying side. This project will not directly address those; a parallel effort by Reading PM and Data Analysis will be needed to set up worksheets, dashboards or other retrieval and presentation of cross-app usage data.

Project Goals

 * Create a client side analytics layer for Android and iOS
 * Send user events to EventLogging
 * Track and send offline events
 * Be smart about bandwidth and battery usage
 * Respect privacy and default to anonymity
 * Make it easy to add events and support new features without significant schema or EventLogging overhead
 * Provide a consistent sampling regime and test user pooling processes across clients
 * Make app usage analytics testable without significant data analyst work
 * Use consistent names and data definitions across apps

Pageview-like events:
It is not scalable to track all basic consumption in the EventLogging layer. However, being able to join this information is needed for some of the funnel and usage analyses we would like to do. For specific tests and features, when needed, how can these be tracked via EventLogging? Should this tracking be part of the client analytics layer or a separate interface/path?
 * page views
 * page previews
 * search queries

Content anonymity:

 * iOS tracks the domain but not the article for article actions (view, save, share, preview, etc.) to add an additional layer of content consumption anonymity. Should that change?
 * Android defaults users to being tracked, while iOS defaults to not tracked (and asks users explicitly during install). This makes apples-to-apples comparisons more complex. Should that change?

Minimize migration path for Android

 * Android already uses EventLogging and has significant instrumentation. Although this project would ideally treat the situation as a blank slate and define a "best" solution for apps generically, we should try to minimize changes needed in Android.

Passing timestamps for events

 * One of the core differences between app analytics and web analytics is that app events may occur while the user is offline. Additionally, users are sensitive to battery and data usage by apps (it's easy to track which apps use up your battery or bandwidth, to a level not generally possible in browsers). EventLogging assumes that an event will be dispatched as soon as it occurs, and places an automatic timestamp on each event as it's logged. We will need to work with Analytics to determine how asynchronous events can best be passed to EventLogging.
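
One possible shape for this, as a minimal sketch rather than a proposed design: the client stamps each event with the device time when it happens, holds it in a queue, and sends batches only when the app reports that it is online. Everything named here is hypothetical (`OfflineEventQueue`, the `client_dt` and `event_id` fields, the `send_batch` callback that stands in for whatever transport is agreed with Analytics). A real client would also persist the queue to disk and account for device clock skew.

```python
import time
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List


@dataclass
class QueuedEvent:
    schema: str                  # hypothetical schema name
    payload: Dict[str, object]
    client_ts: float             # seconds since the epoch, read from the device clock
    event_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class OfflineEventQueue:
    """Holds events in memory until the app decides it is a good time to send."""

    def __init__(self,
                 send_batch: Callable[[List[dict]], bool],
                 clock: Callable[[], float] = time.time,
                 max_batch: int = 20) -> None:
        self._send_batch = send_batch   # returns True if the batch was accepted
        self._clock = clock
        self._max_batch = max_batch
        self._pending: List[QueuedEvent] = []

    def record(self, schema: str, payload: Dict[str, object]) -> None:
        # The timestamp is taken when the event occurs, not when it is sent.
        self._pending.append(QueuedEvent(schema, dict(payload), self._clock()))

    def pending_count(self) -> int:
        return len(self._pending)

    def flush(self, online: bool) -> int:
        """Send queued events in batches; return how many were sent."""
        if not online:
            return 0
        sent = 0
        while self._pending:
            batch = self._pending[: self._max_batch]
            if not self._send_batch([self._serialize(e) for e in batch]):
                break  # keep the unsent events for the next attempt
            del self._pending[: len(batch)]
            sent += len(batch)
        return sent

    @staticmethod
    def _serialize(event: QueuedEvent) -> dict:
        occurred = datetime.fromtimestamp(event.client_ts, tz=timezone.utc)
        return {
            "schema": event.schema,
            "event_id": event.event_id,           # lets the server drop duplicates
            "client_dt": occurred.isoformat(),    # assumed field for the device time
            "event": event.payload,
        }
```

Batching the sends is what would let the client be careful about battery and bandwidth: the app can call `flush` when it already has the radio awake, rather than on every event.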

A/B testing capabilities

 * Android has done some limited A/B testing, and on the web, various teams (including Reading Web and Discovery) have been doing A/B tests for some time. Ideally the app analytics layer would be aware of variant tests and able to handle the pooling and tracking of users for variants. However, that could also be a follow-on to a basic analytics layer definition.
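
To illustrate what consistent pooling across clients could look like, here is a small sketch that assigns a user to a test variant by hashing an anonymous install ID together with the test name. The function name, the parameters and the idea of an install ID are assumptions made for illustration, not something either app does today. Because the assignment is a pure function of its inputs, Android and iOS implementations of the same rule would place the same ID in the same bucket.

```python
import hashlib
from typing import Optional, Sequence


def assign_variant(install_id: str,
                   test_name: str,
                   variants: Sequence[str],
                   sample_rate: float = 1.0) -> Optional[str]:
    """Return the variant for this install, or None if it is not sampled in.

    install_id is assumed to be a random, anonymous per-install identifier.
    """
    if not variants:
        raise ValueError("at least one variant is required")
    digest = hashlib.sha256(f"{test_name}:{install_id}".encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a fraction in [0, 1).
    fraction = int.from_bytes(digest[:8], "big") / 2 ** 64
    if fraction >= sample_rate:
        return None  # outside the sampled pool for this test
    # Spread the sampled users evenly over the variants.
    index = int(fraction / sample_rate * len(variants))
    return variants[min(index, len(variants) - 1)]
```

Including the test name in the hash means that a user's bucket in one test does not determine their bucket in another.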

Per language and per-wiki testing capabilities

 * In addition to arbitrary test pools, we often need to test or roll out features on a per-wiki basis. Being able to limit test pools to users of certain languages and wikis would be extremely valuable in these situations.
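
A per-language or per-wiki limit could be expressed as a simple eligibility check that runs before any pooling. The sketch below is only an illustration: `RolloutScope` and `is_eligible` are hypothetical names, and it assumes that the wiki is identified by its domain and that the language code is the first label of that domain (for example `de` in `de.wikipedia.org`).

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class RolloutScope:
    # An empty set means "no restriction" for that dimension.
    languages: FrozenSet[str] = frozenset()
    wikis: FrozenSet[str] = frozenset()


def is_eligible(scope: RolloutScope, wiki_domain: str) -> bool:
    """Check whether a user of the given wiki falls inside the scope."""
    domain = wiki_domain.lower()
    language = domain.split(".", 1)[0]  # assumed: first label is the language code
    if scope.wikis and domain not in scope.wikis:
        return False
    if scope.languages and language not in scope.languages:
        return False
    return True
```

A client would call `is_eligible` first and only then assign the user to a test pool, so that users of other wikis never enter the test.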