User:KZimmerman (WMF)/BUOD/status

Better Use of Data
A more reliable, efficient and accessible means of collecting, interpreting and sharing data

Program page

Activity type: Programmatic activities

Teams contributing to the program: All Audiences teams, with particular focus by product managers and product analysts, and in partnership with Analytics Engineering.

Goals, Outcomes, and Outputs for Better Use of Data
Annual Plan FY18-19 topline goals: Knowledge as a Service/Foundational Strength – evolve our systems and structures

How does your program affect the annual plan topline goal? ''There is widespread desire among both the teams and their stakeholders to have more data, and to make better use of quantitative data in decision making and communication. This program will make improvements to our systems and structures for the effective collection, storage, analysis and sharing of data a top level goal across the Audiences department.''

Program Goal: ''This program’s goal is to make the use of quantitative data for decision making and communication a more effective and integral part of our department’s systems and processes. Completing this program will result in more evidence based decision making at a feature team level, a better check on key indicators at the system level and much more cost effective analysis and sharing of data.''

{| class="wikitable" ! colspan="2" |

Outcome 1: Assess and communicate needs

 * colspan="2" |The Technology team, and particularly Analytics Engineering, will have a clear understanding of the data collection, storage, analysis and communication needs of the Audiences department, and the two departments will have improved mutual understanding of which teams will work on these areas in the future.
 * colspan="2" |Output 1.1: Data consumer gap analysis
 * Identify the set of challenges that impede the Audiences department from data-driven decision making. List the specific needs of the Audiences team for data collection and deliverable creation, written as requirements for technology changes and additions.  Include needs around instrumentation, controlled experiments, data access, and visualization capabilities.  In addition to technology changes, also identify any needed process improvements.
 * Epic Task - 1.1

Status: Complete[?]

Marshall Presentation: Audiences Data Review - March Check-In Presentation

BUOD Memo [?]:

BUOD Prioritization Plan


 * Output 1.2: Reporting technology evaluation
 * Assemble a document with a deep dive into the visualization capabilities needed by Audiences product managers and product analysts, evaluating different technology options and their pros and cons.
 * Epic Task - 1.2

Status: Complete[?]

Per program page: Better Use of Data memo discussed between Audiences management and Analytics Engineering - figuring out division of labor, timelines, etc. Audiences is standardizing for FY 2018-2019 on Superset, Jupyter notebooks (SWAP/PAWS), and Turnilo.

Decision made in program monthly meeting: This was to be the final thing that MM was supposed to work on. TN: we either need to figure out who can cover this, and where, or proceed in pieces with help from MM over the coming months.

Josh: was this informational or a decision point?

MM: identify shortcomings and which solutions address them

Josh: a better use of our time would be to document what our needs are at a use-case level

JK does not want to stay in limbo much longer given pressure on him from the Analysts

AB requests using a tool that is maintained

The Marshall Memo includes an evaluation of reporting tools

TN: is there anyone other than Marshall who has time and expertise to make a recommendation?

MM: more about time than expertise

JK: possibly Neil

TN: DECISION stay with Superset and Turnilo for this fiscal year

At the same time, experimenting with another tool for exploration is okay !

Outcome 2: Define responsibilities
!
 * colspan="2" |The human processes that are critical to data-driven decision making will have clearly defined owners and participants, helping ensure that all measurement priorities are accomplished efficiently and without confusion. This includes clear roles and responsibilities for the reporting of program metrics, as well as the cross-team stewardship of data policies.
 * Output 2.1: Measurement expectations
 * Audiences management will work with each product team to agree on the specific metrics that they should be reporting on, so that all high-level health metrics and granular project metrics are tracked and surfaced to stakeholders. To support this, Audiences will develop and deploy a training curriculum on data best practices, and management will set expectations of how Audiences teams should use data for reporting and decision making.
 * Epic Task

Status:

Marshall completed an initial review of annual plan metrics and needs with PMs. See Audiences Data Review May 2018
 * Output 2.2: Data stewardship
 * Responsibility for data-related policies and decisions is currently distributed and unclear, causing delays and conflict in measurement processes. Using a DACI model, Audiences will identify roles and/or create working groups to own responsibility for the following data policies and decision areas:


 * Definitions: ensuring consistency around the specific definitions of our most important metrics.
 * Usage: ensuring that data is documented, labeled, described, and stored such that it can be used by those that need it.
 * Quality: ensuring that our most widely used datasets are of consistent quality for their multiple uses.
 * Governance: ensuring that data elements are accessible by the appropriate people.
 * Privacy: ensuring that data is collected and used in ways that are compliant with our policies.
 * Epic Task

Status: In progress.

Tilman is building a data dictionary that defines the Audiences core metrics and the 2018/19 annual plan metrics. [T215976]

Data Dictionary MediaWiki Page [DRAFT]

Instrumentation DACI completed. !

Outcome 3: Data collection
!
 * Reduced cost of collecting data on program metrics and on the feature usage that supports those metrics. New features and products will have proper instrumentation from their initiation, and the data we use will be more trustworthy and have fewer caveats when analyzed and communicated.
 * Output 3.1: Instrumentation
 * Initiate and proceed with a cross-departmental working group that makes concerted improvements to our EventLogging instrumentation workflow. This group will address challenges around front-end instrumentation, data storage, quality control processes, and ease of use.  Teams will be able to measure new features more quickly, independently, and reliably.
 * Epic Task

Status: In progress.

Jason is conducting an Audiences teams instrumentation survey [In progress] [T215435]

Define cross-schema event stitching approach [Backlog] [T205569]

Add guards for session stitching [Backlog] [T210648]

Instrumentation DACI completed.

Project Page
 * Output 3.2: Controlled experiment (A/B test) capabilities
 * Initiate and proceed with a cross-departmental working group that makes concerted improvements to our ability to make scientific product decisions through controlled experiments. The group will iteratively standardize the technology tools, scientific methods, and guidelines by which we can run experiments, leading to experiments becoming increasingly more common in our decision making.

This group may evolve from or overlap deliberately with the instrumentation group, because controlled experiments rely on instrumentation capabilities.
 * Epic Task

Status: In progress.

Jason is conducting an Audiences teams controlled experiments (A/B testing) survey [In progress] [T215436] !

Outcome 4: Deliverable creation
!
 * Program metrics will be more easily generated, maintained, and communicated out to stakeholders through both changes in technology and process. Product decision makers will be able to independently explore data about their products. Stakeholders will have confidence that reports reflect the information they need to know.
 * Output 4.1: Report stewardship
 * Designate a steward or working group to organize legacy reports, create and enforce guidelines for organization of future reports, and create and maintain a reporting portal where decision-makers know they can find the reports relevant to program metrics.
 * Epic Task

Status:

Legacy Reports: Marshall compiled a list of available dashboards and other available reports he was able to find on wiki. Legacy data reports review doc.

Megan is collecting the available Annual Plan metric reports and adding them to the MediaWiki report page [On hold pending clarification of priorities with PMs] [T215476]

Reporting portal

Project page
 * Output 4.2: Reporting technology
 * Implement reporting technology recommendations from “Outcome 1: Assess and communicate needs”, such that different Audiences roles have the appropriate technology for their skill levels in order to generate reports, reports can be updated regularly, and can be collected into accessible portals for consumption by the broader organization.
 * Epic Task

Status: [?] The decision from Outputs 1.1 and 1.2 is to continue with Superset and Turnilo for this fiscal year.
 * Output 4.3: Wiki segmentation
 * Instead of implementing programs that attempt to affect all wikis at the same time, it is common for a given Audiences program to focus on just one group of wikis, such as mid-size wikis or large wikis. Given that we focus our work on groups of wikis, we should be able to report out using those groupings. The output here is an evolving set of segmentations that classify different wikis into groupings relevant for the Audiences department's work. These will be used to align strategic planning, program focus, and reporting on Audiences department impact, making it possible to report out using the same groupings as we use in our daily work.
 * Epic task.

Status: In progress.

Phase 1: Create spreadsheet of data that can be sorted [Complete].

Phase 2: Recommend a standard set of key dimensions with standard classes for each [Backlog]

Phase 3: Use unsupervised learning to try to cluster the wikis into meaningful groups [T203034] [Triage]
 * }
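Phase 3 of the wiki segmentation work mentions unsupervised learning for grouping wikis. The sketch below is one possible illustration of that idea, not the approach chosen by the team: a small k-means clustering over log-scaled metrics. The wiki names, the two metrics (monthly active editors and monthly pageviews), the numbers, and the helper names `log_features`, `kmeans`, and `segment_wikis` are all hypothetical and invented for this example.

```python
import math

# Hypothetical per-wiki metrics: (monthly active editors, monthly pageviews).
# Names and numbers are invented for illustration only.
WIKIS = {
    "wiki_a": (120_000, 8_000_000_000),
    "wiki_b": (20_000, 1_000_000_000),
    "wiki_c": (900, 40_000_000),
    "wiki_d": (1_100, 55_000_000),
    "wiki_e": (15, 300_000),
    "wiki_f": (25, 500_000),
}


def log_features(metrics):
    """Log-scale each metric so very large wikis do not dominate distances."""
    return tuple(math.log10(value + 1) for value in metrics)


def kmeans(points, k, iterations=50):
    """Plain k-means with a deterministic start.

    Initial centroids are taken at evenly spaced positions in the list of
    points sorted by the sum of their features (an assumption made here so
    the result is reproducible without a random seed).
    """
    ordered = sorted(points, key=sum)
    n = len(ordered)
    centroids = [ordered[i * (n - 1) // max(k - 1, 1)] for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(point, centroids[c]))
            for point in points
        ]
        # Recompute each centroid as the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return assignment


def segment_wikis(wikis, k):
    """Return a list of wiki-name groups, one sorted list per cluster."""
    names = list(wikis)
    labels = kmeans([log_features(wikis[name]) for name in names], k)
    groups = {}
    for name, label in zip(names, labels):
        groups.setdefault(label, []).append(name)
    return [sorted(group) for group in groups.values()]


if __name__ == "__main__":
    for group in sorted(segment_wikis(WIKIS, 3)):
        print(group)
```

With real data, the choice of metrics, the scaling, and the number of clusters would all need review against the Phase 2 dimensions before any grouping is used for reporting.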