User:KZimmerman (WMF)/BUOD/status

= Better Use of Data: Status Updates =

''A more reliable, efficient, and accessible means of collecting, interpreting, and sharing data''

Program page

Activity type: Programmatic activities

Teams contributing to the program: All Audiences teams, with particular focus by product managers and product analysts, and in partnership with Analytics Engineering.

== Goals, Outcomes, and Outputs for Better Use of Data ==

* Goal: Make the use of quantitative data for decision making and communication a more effective and integral part of our department's systems and processes.
* Completing this program will result in:
** More evidence-based decision making at the feature team level
** A better check on key indicators at the system level
** More cost-effective analysis and sharing of data

== Outcome 1: Assess and communicate needs ==

The Technology team, and particularly Analytics Engineering, will have a clear understanding of the data collection, storage, analysis, and communication needs of the Audiences department, and the two departments will have improved mutual understanding of which teams will work on these areas in the future.

=== Output 1.1: Data consumer gap analysis ===

* Identify the set of challenges that impede the Audiences department from data-driven decision making.
* List the specific needs of the Audiences team for data collection and deliverable creation, written as requirements for technology changes and additions.
* Include needs around instrumentation, controlled experiments, data access, and visualization capabilities.
* Identify any needed process improvements.
* Epic Task - 1.1

Status: Complete for FY 2018-19.

Audiences Data Review - March Check-In Presentation

Better Use of Data Requirements

BUOD Prioritization Plan
=== Output 1.2: Reporting technology evaluation ===

* Assemble a document with a deep dive into the visualization capabilities needed by Audiences product managers and product analysts, evaluating different technology options and their pros and cons.
* Epic Task - 1.2

Status: Complete for FY 2018-19.

Per program page: Better Use of Data memo discussed between Audiences management and Analytics Engineering, figuring out division of labor, timelines, etc. Audiences is standardizing for FY 2018-19 on Superset, Jupyter notebooks (SWAP/PAWS), and Turnilo.

Decision made in program monthly meeting: stay with Superset and Turnilo for FY 2018-19. At the same time, experimenting with another tool for exploration is okay.

== Outcome 2: Define responsibilities ==

The human processes that are critical to data-driven decision making will have clearly defined owners and participants, helping ensure that all measurement priorities are accomplished efficiently and without confusion. This includes clear roles and responsibilities for the reporting of program metrics, as well as the cross-team stewardship of data policies.

=== Output 2.1: Measurement expectations ===

* Audiences management will work with each product team to agree on the specific metrics they should report on, so that all high-level health metrics and granular project metrics are tracked and surfaced to stakeholders.
* To support this, Audiences will develop and deploy a training curriculum on data best practices.
* Management will set expectations for how Audiences teams should use data for reporting and decision making.
* Epic Task

Status: Complete for FY 2018-19.

Marshall completed an initial review of annual plan metrics and needs with PMs. See Audiences Data Review May 2018.

Jon Katz and Kate Zimmerman completed training with product managers and analysts as of January 2019.

Part I: Intro and using data to inform strategy: Video and Slides

Part II: Setting a metric and working with product analytics: Video and Slides
=== Output 2.2: Data stewardship ===

Responsibility for data-related policies and decisions is currently distributed and unclear, causing delays and conflict in measurement processes. Using a DACI model, Audiences will identify roles and/or create working groups to own responsibility for the following data policies and decision areas:

* Definitions: ensuring consistency around the specific definitions of our most important metrics.
* Usage: ensuring that data is documented, labeled, described, and stored such that it can be used by those who need it.
* Quality: ensuring that our most widely used datasets are of consistent quality for their multiple uses.
* Governance: ensuring that data elements are accessible by the appropriate people.
* Privacy: ensuring that data is collected and used in ways that comply with our policies.
* Epic Task

Status: In progress.

Data Dictionary MediaWiki Page [DRAFT]


 * Goal: Audiences core metrics and FY 2018-19 annual plan metrics defined
 * In progress: T215976

Instrumentation DACI completed.

== Outcome 3: Data collection ==

Reduced cost of collecting data on program metrics and on the feature usage that supports those metrics. New features and products will have proper instrumentation from their initiation, and the data we use will be more trustworthy and have fewer caveats when analyzed and communicated.

=== Output 3.1: Instrumentation ===

* Initiate and proceed with a cross-departmental working group that makes concerted improvements to our EventLogging instrumentation workflow.
* This group will address challenges around front-end instrumentation, data storage, quality control processes, and ease of use.
* Teams will be able to measure new features more quickly, independently, and reliably.
* Epic Task
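One part of a trustworthy instrumentation workflow is validating events against a schema before they enter storage. The sketch below is a hand-rolled, purely illustrative check; the schema format, field names, and event shapes are assumptions for this example, not the actual EventLogging schema registry or its validation code.

```python
# Illustrative sketch of a schema check an instrumentation pipeline might
# apply before accepting an event. All field names here are hypothetical.

def validate_event(event, schema):
    """Return a list of problems; an empty list means the event passes."""
    problems = []
    for field, expected_type in schema["required"].items():
        if field not in event:
            problems.append(f"missing required field: {field}")
        elif not isinstance(event[field], expected_type):
            problems.append(f"wrong type for {field}: {type(event[field]).__name__}")
    return problems

# Hypothetical schema for a page-interaction event.
SCHEMA = {"required": {"wiki": str, "session_id": str, "action": str, "timestamp": int}}

good = {"wiki": "enwiki", "session_id": "abc123", "action": "click", "timestamp": 1550000000}
bad = {"wiki": "enwiki", "action": 42}

print(validate_event(good, SCHEMA))  # []
print(validate_event(bad, SCHEMA))   # missing session_id/timestamp, wrong type for action
```

Rejecting (or flagging) malformed events at ingestion time is what lets downstream analysis carry fewer caveats.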

Status: In progress.

Jason conducting an Audiences teams instrumentation survey [In progress] [T215435]

Define cross-schema event stitching approach [Backlog] [T205569]

Add guards for session stitching [Backlog] [T210648]
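To illustrate the session-stitching idea above: events from one user are grouped into a session until an inactivity guard trips. This is a minimal sketch under assumed conventions (a 30-minute timeout, `(user, timestamp)` tuples), not the team's actual implementation.

```python
# Minimal session stitching with an inactivity guard: events more than
# 30 minutes apart start a new session. Timeout and event shape are
# illustrative assumptions.

SESSION_TIMEOUT = 30 * 60  # seconds of inactivity before a new session

def stitch_sessions(events):
    """Group (user, timestamp) events, sorted by timestamp, into sessions.

    Returns {user: [[ts, ...], ...]}, one inner list per session.
    """
    sessions = {}
    last_seen = {}
    for user, ts in events:
        user_sessions = sessions.setdefault(user, [])
        if user not in last_seen or ts - last_seen[user] > SESSION_TIMEOUT:
            user_sessions.append([])  # guard tripped: start a new session
        user_sessions[-1].append(ts)
        last_seen[user] = ts
    return sessions

events = [("u1", 0), ("u1", 600), ("u2", 700), ("u1", 5000)]
print(stitch_sessions(events))  # u1 splits into two sessions: 5000 - 600 > 1800
```

Guards like this matter because a single "session" that silently spans hours of inactivity inflates engagement metrics.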

Instrumentation DACI completed.

Project Page
=== Output 3.2: Controlled experiment (A/B test) capabilities ===

* Initiate and proceed with a cross-departmental working group that makes concerted improvements to our ability to make scientific product decisions through controlled experiments.
* The group will iteratively standardize the technology tools, scientific methods, and guidelines by which we can run experiments, so that experiments become increasingly common in our decision making.
* This group may evolve from or deliberately overlap with the instrumentation group, because controlled experiments rely on instrumentation capabilities.
* Epic Task
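As a concrete example of the scientific methods such a group would standardize, a controlled experiment on a binary outcome is often evaluated with a two-proportion z-test. The numbers below are invented, and this is one standard statistical method, not the working group's chosen tooling.

```python
# Two-proportion z-test for an A/B experiment on conversion counts.
# Control and treatment figures are hypothetical.
import math

def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """z statistic for H0: the two groups have equal conversion rates."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / se

# Hypothetical experiment: control converts 500/10,000; treatment 580/10,000.
z = two_proportion_z(500, 10_000, 580, 10_000)
print(round(z, 2))  # |z| > 1.96 would be significant at the 5% level
```

Standardizing on one such test (and on when to apply it) is what makes experiment results comparable across teams.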

Status: In progress.

Jason working on an Audiences teams controlled experiments (A/B testing) survey [In progress] [T215436]

== Outcome 4: Deliverable creation ==

Program metrics will be more easily generated, maintained, and communicated out to stakeholders through changes in both technology and process. Product decision makers will be able to independently explore data about their products. Stakeholders will have confidence that reports reflect the information they need to know.

=== Output 4.1: Report stewardship ===

* Designate a steward or working group to:
** Organize legacy reports
** Create and enforce guidelines for the organization of future reports
** Create and maintain a reporting portal where decision-makers know they can find the reports relevant to program metrics.
* Epic Task

Status: In progress.

Currently hiring a Senior Data Analyst who will own stewardship responsibilities for reporting.

Legacy Reports: Marshall compiled a list of available dashboards and other reports he was able to find on-wiki. Legacy data reports review doc.

Megan Neisler collecting available Annual Plan metric reports and adding them to the MediaWiki report page [On hold pending clarification of priorities with PMs] [T215476]

Reporting portal

Project page
=== Output 4.2: Reporting technology ===

* Implement the reporting technology recommendations from "Outcome 1: Assess and communicate needs", such that:
** Different Audiences roles have technology appropriate to their skill levels for generating reports
** Reports can be updated regularly
** Reports can be collected into accessible portals for consumption by the broader organization
* Epic Task

Status: In progress.


 * Decision from Output 1.1 and 1.2 to continue with Superset, Jupyter notebooks (SWAP/PAWS), and Turnilo for FY 2018-19.
 * Technology maintenance: Analytics Engineering created a staging environment to test updates to Superset (T212243) and has ongoing tasks to regularly update Superset (T211706).


=== Output 4.3: Wiki segmentation ===

* Instead of implementing programs that attempt to affect all wikis at the same time, it is common for a given Audiences program to focus on groups of wikis, such as mid-size wikis or large wikis.
* Given that we focus our work on groups of wikis, we should be able to report out using those groupings.
* Output: evolving sets of segmentations that classify different wikis into groupings relevant to the Audiences department's work.
* These segmentations will be used to align strategic planning, program focus, and reporting on Audiences department impact.
* Epic task

Status: In progress.

Phase 1: Create a spreadsheet of data that can be sorted [Complete]

Phase 2: Recommend a standard set of key dimensions with standard classes for each [Backlog]

Phase 3: Use unsupervised learning to cluster the wikis into meaningful groups [Triage] [T203034]
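The Phase 2 idea of standard classes along key dimensions can be sketched as a simple classifier: each wiki is assigned to a size class along one dimension. The dimension (active editors), thresholds, and wiki figures below are all invented for illustration; the real recommendation would come out of the backlogged task.

```python
# Toy wiki segmentation along one hypothetical dimension (active editors),
# with invented thresholds and figures, purely for illustration.

def size_class(active_editors):
    """Assign a wiki to a standard size class by its active-editor count."""
    if active_editors >= 10_000:
        return "large"
    if active_editors >= 500:
        return "mid-size"
    return "small"

# Hypothetical figures, not real wiki statistics.
wikis = {"wiki_a": 40_000, "wiki_b": 3_000, "wiki_c": 120}

segments = {}
for wiki, editors in wikis.items():
    segments.setdefault(size_class(editors), []).append(wiki)

print(segments)  # {'large': ['wiki_a'], 'mid-size': ['wiki_b'], 'small': ['wiki_c']}
```

Phase 3 would replace the hand-picked thresholds with clusters learned from the data itself, so that the groupings reflect how wikis actually differ rather than arbitrary cutoffs.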