Product Analytics

About Us
Nurturing data-informed decision-making in Product since 2018-02-01.

Our Mission & Values
We deliver quantitatively based user insights to inform decision-making within the Foundation and the Wikimedia Movement in order to support Wikimedia’s strategic direction toward service and equity.

We strive to provide guidance, insights, and data that are:

Our Work

 * Empowering others to make data-informed decisions through education and self-service analytics tools
 * Helping others set and track goals that are achievable and measurable
 * Helping set up Wikimedia products to collect useful data without harming user privacy
 * Ensuring that data collected is high-quality
 * Extracting insights from the Foundation's data repositories
 * Building dashboards and reports for tracking success and health metrics
 * Designing and analyzing experiments (A/B tests)
 * Doing ad-hoc analyses and machine learning projects
 * Developing tools and software for working with data, in collaboration with Analytics Engineering and Product Analytics Infrastructure
 * Helping others work with teams like Analytics Engineering, Security, and Legal to address data-related issues

Product Team Support
Each analyst is a point person for a team, project, or program. Our goals are to maintain context and domain knowledge while also allowing for flexibility in analyst work assignments.

Teams that do not currently have an assigned point person are encouraged to submit requests through Phabricator. Depending on the team's capacity and organizational needs, we may also accept requests from others in the Wikimedia Foundation. The team reserves "10 percent time" to work on professional development.

Who's on the team?
Listed alphabetically by first name within each section

Leadership

 * Kate Zimmerman, Director of Data Science
   * Product Owner for Better Use of Data program
   * Ask me about: Collaborating with Product Analytics, using data to inform product and business decisions, experiment design, decision science, applied stats
 * Mikhail Popov, Data Science Manager
   * Ask me about: Collaborating with Product Analytics, R, data visualization, search & traffic logs, querying product data, statistical models, Bayesian methodology, machine learning, Better Use of Data program, Event Platform, Metrics Platform

Team Members

 * Connie Chen, Sr. Data Scientist
   * Ask me about:
 * Irene Florez, Data Scientist III
   * Ask me about:
 * Jennifer Wang, Staff Data Scientist
   * Ask me about: AHT/Comm tech metrics
 * Maya Kampurath, Analyst III
   * Ask me about:
 * Megan Neisler, Data Scientist III
   * Ask me about: R, data visualization, reader metrics, technical writing
 * Morten Warncke-Wang, Staff Data Scientist
   * Ask me about: R, machine learning, spatial (geographic) models, article quality, editor/editing/newcomer metrics, prior research on Wikipedia, and perhaps also time-series modeling (forecasting)
 * Neil Shah-Quinn, Sr. Data Scientist
   * Ask me about: Python for data analysis, SWAP, editor metrics, new editor research
 * Shay Nowick, Sr. Data Scientist
   * Ask me about: Mobile metrics, Pydata and Jupyter Notebooks, cohort analysis

Honorary Members

 * Jason Linehan, Staff Software Engineer
   * Tech Lead for Metrics Platform
   * Ask me about: programming languages other than R, analytics infrastructure, randomness
 * Lauren de Lench, Sr. Technical Program Manager
   * Ask me about: team process, meetings, coordinating cross-team projects

Submitting Requests
If you'd like to request data, analysis, or advice, create a task in Phabricator or send an email to product-analytics@wikimedia.org.

Requests are reviewed by Product Analytics and inform the direction and priorities of data projects. A team member will follow up about whether we’ll be able to work on your request.

Some questions may be suited to office hours; see Product Analytics Office Hours for more information and a link to book appointments.

Provide the following information to help us prioritize and respond to your request appropriately:
 * Name for main point of contact and contact preference
   * We use Phabricator to track our work and provide progress updates. Please let us know if you would like us to follow up by other methods (e.g. email).
 * What teams or departments is this for?
 * What are the details of your request? Include relevant timelines or deadlines
   * Are you asking about a specific metric? Is there a date after which the analysis will no longer be useful? Please provide any timeline/relevant deadlines, requested formats, examples, links to documentation, or other relevant information that would help us understand your request.
 * How will you use this data or analysis?
   * This helps us understand the context and priority. What are your goals? Will you take different actions depending on the direction of the data? Do you want to share data publicly? Do you want to include data in a narrative or message (e.g. for PR, audience engagement, or fundraising)?
 * Is this request urgent or time sensitive?
   * We try to reply to “Urgent” requests immediately and “Time sensitive” requests by the end of the workday. All other requests will be prioritized during our weekly triage.

Note: We use Phabricator to track our work, and by default tickets are publicly visible. If any part of your request is sensitive and should be kept confidential, let us know.

Office Hours
Analysts host weekly office hours. See Product Analytics Office Hours for details, the calendar, and a link to schedule an appointment.

Data FAQs
See meta:Research:FAQ

How to contact us

 * Contact information for team members is available on their user pages (linked above).
 * Group mailing list: product-analytics@wikimedia.org

Data references, best practices, and reports

 * Comparison datasets
 * Data Dictionary (documents data sources, such as those available in Superset and Turnilo)
 * Data Glossary (definitions for core metrics)
 * A/B Testing
 * Data Products (various deliverables such as reports, analyses, and datasets)
 * Analytics Infrastructure
 * ETL repository (e.g. Oozie workflows)
 * Experiment Platform draft
 * Tips and Tricks
 * Querying JSON-containing data (notes from Mikhail Popov on how to query JSON data with Presto)
 * Analysis gotchas (notes from Isaac Johnson on common gotchas when analyzing the MediaWiki landscape)
 * Logistic regression, multilevel models, and t-tests (a simulation study inspired by experiments in improving Wikipedia editing experience, and demonstrating multiple methodologies for analyzing data)
 * Simulation study of statistical methods for comparing groups (examples and informal evaluations of various statistical significance tests for comparing observations generated from different distributions and families)
 * Using log transformations in linear regression models (notes from Mikhail Popov)
 * Caching in R (notes & best practices from Mikhail Popov)
 * Query style guide
 * Reporting Guidelines
 * Dashboarding Guidelines

Documentation for tools we use

 * Phabricator (managing requests and tracking work)
 * Superset (WMF internal dashboards and reports)
 * Obtaining access to Superset/Turnilo, with explanation of LDAP/Developer Account terminology
 * Turnilo (WMF internal tool for pivoting and exploring data)
 * Event Platform (various event stream distribution and processing systems we employ at WMF)
 * Matomo/Piwik (JavaScript tracking client used for wikimediafoundation.org and other smaller-scale sites)

Team references

 * Product Analytics Team norms
 * Our repository of scheduled jobs
 * Working with Product Analytics
 * Chore Wheel
 * Phabricator board
 * Onboarding notes for new team members
 * Offboarding
 * Movement metrics
 * Data access guidelines
 * Fun