Analytics/Archive/Editor Engagement Vital Signs

= Goals =

The Editor Engagement program is a top strategic priority for the Foundation and has many individual initiatives. We do not, however, have a dashboard that provides consistent, explorable, and timely data on this program, which makes it difficult and time-consuming for implementors to understand the impact of their changes. A consistent dashboard would also help normalize results across products and other aspects of the program which are difficult to compare at this time.

The goal of this project is to implement such a dashboard. Research and Data will provide the definitions and SQL queries for the actual metrics.

= Detailed Tracking Links =

Development (Mingle)

Research (Trello)

= Users =

= Prioritized Use Cases =

High Priority

 * 1) As a Product Manager, I want a well-designed dashboard that I can use to explore the metrics listed in the New Users section
 * 2) As a Product Manager, Researcher and Analytics Developer (and just about everybody else), I need documentation about how these metrics are calculated
 * 3) As a Product Manager, I need these metrics calculated across all Wikis (this list is accessible via this API call: http://en.wikipedia.org/w/api.php?action=sitematrix)
 * 4) As a Researcher, I need to create a new table for reverts (as they happen)
 * 5) As a Product Manager, I need global aggregates of these metrics
 * 6) As a Product Manager and Researcher, I want historical data to be maintained in a format that can be queried easily (Warehouse)
 * 7) As a Product Manager, I need the data to be in tabular and graphed formats and I want to be able to download the data as a CSV or TSV
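As an illustration of use case 3, the wiki list could be pulled from the sitematrix API call given above and flattened into database names. This is only a sketch: the function names, the User-Agent string, and the `format=json` parameter are choices made here, and the response layout (numbered language groups with a `site` list, a `specials` list, and a `closed` key on closed wikis) is an assumption that should be checked against a live response.

```python
import json
import urllib.parse
import urllib.request

# URL taken from the use case above; everything else here is illustrative.
SITEMATRIX_URL = "http://en.wikipedia.org/w/api.php"


def fetch_sitematrix(api_url=SITEMATRIX_URL):
    """Fetch the raw sitematrix response as a dict (performs a network call)."""
    query = urllib.parse.urlencode({"action": "sitematrix", "format": "json"})
    request = urllib.request.Request(
        api_url + "?" + query,
        headers={"User-Agent": "vital-signs-sketch/0.1"},  # hypothetical agent name
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def list_wiki_dbnames(sitematrix_response, include_closed=False):
    """Flatten a sitematrix response into a sorted list of database names.

    Assumed layout: response["sitematrix"] maps numeric keys to language
    groups (each with a "site" list), plus a "specials" list and a "count".
    """
    matrix = sitematrix_response["sitematrix"]
    dbnames = []
    for key, group in matrix.items():
        if key == "count":
            continue
        sites = group if key == "specials" else group.get("site", [])
        for site in sites:
            if "closed" in site and not include_closed:
                continue
            dbnames.append(site["dbname"])
    return sorted(dbnames)


# Hand-written sample in the assumed layout, so no network access is needed.
sample_response = {
    "sitematrix": {
        "count": 3,
        "0": {"code": "en", "name": "English",
              "site": [{"dbname": "enwiki", "code": "wiki"}]},
        "specials": [
            {"dbname": "commonswiki", "code": "commons"},
            {"dbname": "oldwiki", "code": "old", "closed": ""},
        ],
    }
}
open_wikis = list_wiki_dbnames(sample_response)
all_wikis = list_wiki_dbnames(sample_response, include_closed=True)
```

Keeping the parsing separate from the fetch makes it possible to run each metric query per database name and to test the flattening step without touching the API.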

Later

 * 1) As a Product Manager I want to be able to compare arbitrary data series in a graph
 * 2) As a Manager I want some graphs to be available to WMF only
 * 3) As a Product Manager, I want geographic and other arbitrary breakdowns (such as mobile/desktop)
 * 4) As a Product Manager, I want League Tables (such as daily top-X contributors by namespace)
 * 5) As a Product Manager, I want trends, moving averages, projections and other UI candy

Non functional requirements

 * 1) All dashboards should be updated daily.
 * 2) Capacity planning should use the sizing data from the Size and update rate section below, doubled to allow for future growth. Preferably, the data should be stored on a system that allows storage and I/O capacity to be added dynamically.
 * 3) Dashboards should render within 2 seconds, not including query time.
 * 4) Once we have signoff on the basic issues described below, all issues with dashboards should be addressed within 2 "business" days of the problem reports.
 * 5) Data should be retained indefinitely.

= Metrics =

The minimum granularity for this data should be daily with monthly rollups.

Acquisition

 * Newly registered users

Activation

 * Live users (pending instrumentation)
 * New editors

Productivity

 * Productive new users

Retention

 * Surviving new users

Community

 * Editors
 * Active editors
 * Very active editors
 * Anonymous editors
 * Bots

Content

 * Edits
 * Uploads
 * Page creations
 * Total pages

Curation

 * Reverts
 * Deletions
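The daily granularity with monthly rollups called for at the top of this section could be sketched as follows for additive metrics such as edits or uploads. The function name and the sample figures are made up for illustration. Summation is only valid for plain counts; metrics that count distinct users (for example active editors) cannot be rolled up by adding daily values and would need to be recomputed over the month.

```python
from collections import defaultdict
from datetime import date


def monthly_rollup(daily_counts):
    """Sum additive daily counts into (year, month) buckets.

    daily_counts maps datetime.date -> int. Returns a dict keyed by
    (year, month), ordered chronologically.
    """
    totals = defaultdict(int)
    for day, count in daily_counts.items():
        totals[(day.year, day.month)] += count
    return dict(sorted(totals.items()))


# Made-up daily edit counts spanning a month boundary.
sample_daily_edits = {
    date(2013, 1, 30): 10,
    date(2013, 1, 31): 12,
    date(2013, 2, 1): 7,
}
rollup = monthly_rollup(sample_daily_edits)
```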

= Design =

We need to pull in a designer. Dario feels that a global dashboard with a project selector for narrowing the view to a specific project is a good model.

= Implementation Details =

Size and update rate

For the metrics in phase 1, the following tables will need to be used.

These sizes and update rates are from enwiki. They should be doubled to account for all of the wikis, and then doubled again to allow for growth.
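The sizing rule above amounts to a factor of four over the enwiki figures. A minimal sketch of the arithmetic, with a hypothetical function name and a placeholder input rather than a real measurement:

```python
def planned_capacity(enwiki_value, all_wikis_factor=2, growth_factor=2):
    """Scale an enwiki size or update rate per the rule above.

    Doubles once to cover all wikis and once more for growth.
    """
    return enwiki_value * all_wikis_factor * growth_factor


# Placeholder figure: 100 units measured on enwiki (not a real measurement).
estimate = planned_capacity(100)
```

Keeping the two factors as separate parameters makes it easy to revisit either assumption independently once actual cross-wiki numbers are available.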