Analytics/Epics/Pageview API

= Goals =

Wikipedians need a reliable and accurate API for querying article page views. This epic describes the steps needed to build such an API. Initially, it focuses on the underlying infrastructure (e.g. Kafka and Hadoop) that must be built for this purpose. The epic is not yet complete and will be expanded with front-end requirements as the back-end work progresses.

= Detailed Tracking Links =

TBD

= Users =

= Prioritized Use Cases =

High Priority

 * 1) As a Wikipedian, I need an API that allows me to query various page view stats.
 * 2) As a Reader, I want any PII (IP address, user agent, etc.) to be removed from my page view information.
 * 3) As a Product Owner, I want page views to be geocoded at the country level.
 * 4) As a Product Owner (and on behalf of many other stakeholders), I want raw logs to be deleted within 90 days.
 * 5) As a Product Owner, I want page views to conform to a community-reviewed definition.
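
Use cases 2 to 4 could be sketched roughly as follows. This is only an illustration under stated assumptions: the epic does not define a log schema, so the field names (`ip`, `user_agent`, `country`) and the `country_lookup` callable are hypothetical, and the real pipeline may look quite different.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical field names; the epic does not define a log schema.
PII_FIELDS = {"ip", "user_agent"}
RAW_LOG_RETENTION = timedelta(days=90)  # limit taken from use case 4


def sanitize(record, country_lookup):
    """Return a copy of record with PII removed and a country code attached.

    country_lookup is an assumed callable that maps an IP address to a
    country code; the geocoding happens before the IP is dropped.
    """
    country = country_lookup(record["ip"])
    clean = {k: v for k, v in record.items() if k not in PII_FIELDS}
    clean["country"] = country
    return clean


def is_expired(record_time, now):
    """True if a raw log record has reached the 90-day retention limit."""
    return now - record_time >= RAW_LOG_RETENTION
```

The country is resolved first because the IP address is the only input to geocoding; once the sanitized copy exists, the raw record can be deleted at any point before the retention limit.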

Non-functional requirements

 * 1) Data should be updated daily and reported at hourly granularity.
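
To make "hourly granularity" concrete, here is a minimal sketch of bucketing page views by hour. It assumes records that already have PII removed and that carry hypothetical `article`, `country`, and `timestamp` fields; none of these names come from the epic.

```python
from collections import Counter
from datetime import datetime


def hourly_counts(views):
    """Count views per (article, country, hour) bucket.

    Each view is assumed to be a dict with hypothetical keys
    "article", "country", and "timestamp" (a datetime).
    """
    counts = Counter()
    for view in views:
        # Truncate the timestamp to the start of its hour.
        hour = view["timestamp"].replace(minute=0, second=0, microsecond=0)
        counts[(view["article"], view["country"], hour)] += 1
    return counts
```

A daily job could run this over the previous day's records, which would satisfy both the daily update and the hourly granularity.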

= Additional information =

We have done some planning with tech-ops, which is documented here: List of tasks for backend work