Product Analytics

About Us
Nurturing data-informed decision-making in Product since 2018-02-01.

Our Mission & Values
We deliver quantitatively-based user insights to inform decision-making in support of Wikimedia’s strategic direction toward service and equity.

We strive to provide guidance, insights, and data that are: Ethical • Trusted • Impactful • Accessible • Inclusive • Inspired

What We Do
Product Analytics contributes to the Wikimedia Movement through our work with Product teams and departments across the Foundation.

Our responsibilities include:
 * Empowering others to make data-informed decisions through education and self-service analytics tools
 * Helping set and track goals that are achievable and measurable
 * Ensuring that Wikimedia products collect useful, high-quality data without harming user privacy
 * Extracting insights through ad-hoc analyses and machine learning projects
 * Building dashboards and reports for tracking success and health metrics
 * Designing and analyzing experiments (A/B tests)
 * Developing tools and software for working with data, in collaboration with Analytics Engineering and Product teams
 * Addressing data-related issues in collaboration with teams like Analytics Engineering, Security, and Legal

Product Team Support
Each analyst is a point person for a team, project, or program. Our goals are to maintain context and domain knowledge while also allowing for flexibility in analyst work assignments. For more information about how we work with Product teams, see Working with Product Analytics.

Teams that do not currently have an assigned point person are encouraged to submit requests through Phabricator. Depending on our capacity and organizational needs, we may also accept requests from others in the Wikimedia Foundation. The team reserves "10 percent time" to work on professional development.

Who is on the team
Listed alphabetically by first name within each section

Leadership

 * Kate Zimmerman, Director of Data Science
   * Ask me about: Collaborating with Product Analytics, using data to inform product and business decisions, experiment design, decision science, applied stats
 * Mikhail Popov, Data Science Manager
   * Ask me about: Collaborating with Product Analytics, R, data visualization, search & traffic logs, querying product data, statistical models, Bayesian methodology, machine learning, Better Use of Data program, Event Platform, Metrics Platform

Team Members

 * Connie Chen, Sr. Data Scientist
   * Ask me about:
 * Irene Florez, Data Scientist III
   * Ask me about:
 * Jennifer Wang, Staff Data Scientist
   * Ask me about: AHT/Comm tech metrics
 * Maya Kampurath, Analyst III
   * Ask me about:
 * Megan Neisler, Sr. Data Scientist
   * Ask me about: R, data visualization, reader metrics, technical writing
 * Morten Warncke-Wang, Staff Data Scientist
   * Ask me about: R, machine learning, spatial (geographic) models, article quality, editor/editing/newcomer metrics, prior research on Wikipedia, and perhaps also time-series modeling (forecasting)
 * Neil Shah-Quinn, Sr. Data Scientist
   * Ask me about: Python for data analysis, SWAP, editor metrics, new editor research
 * Shay Nowick, Sr. Data Scientist
   * Ask me about: Mobile metrics, Pydata and Jupyter Notebooks, cohort analysis

Honorary Members

 * Jason Linehan, Staff Software Engineer
   * Tech Lead for Metrics Platform
   * Ask me about: programming languages other than R, analytics infrastructure, randomness
 * Lauren de Lench, Sr. Technical Program Manager
   * Ask me about: team process, meetings, coordinating cross-team projects

Submitting Requests
If you'd like to request data, analysis, or advice, create a task in Phabricator or send an email to product-analytics@wikimedia.org.

Requests are reviewed by Product Analytics and inform the direction and priorities of data projects. A team member will follow up about whether we’ll be able to work on your request.

Some questions may be suited to consultation hours; see Product Analytics Consultation Hours for more information and a link to book appointments.

Provide the following information to help us prioritize and respond to your request appropriately:
 * Name for main point of contact and contact preference
   * We use Phabricator to track our work and provide progress updates. Please let us know if you would like us to follow up by other methods (e.g. email).
 * What teams or departments is this for?
   * This helps us understand who will be using the analysis.
 * What are your goals? How will you use this data or analysis?
   * This helps us understand the context and priority. What decisions do you need data to inform? Will you take different actions depending on the direction of the data? Do you want to share data publicly? Do you want to include data in a narrative or message (e.g. for PR, audience engagement, or fundraising)?
 * What are the details of your request? Include relevant timelines or deadlines.
   * Is there a date after which the analysis will no longer be useful? Please provide any timeline/relevant deadlines, requested formats, examples, links to documentation, or other information that would help us understand your request.
 * Is this request urgent or time sensitive?
   * We try to reply to “Urgent” requests immediately and “Time sensitive” requests by the end of the workday. All other requests will be prioritized during our weekly triage.

Note: We use Phabricator to track our work, and by default tickets are publicly visible. If any part of your request is sensitive and should be kept confidential, let us know.

Consultation Hours
Analysts host weekly consultation hours. See Product Analytics Consultation Hours for details, to view the calendar, or to schedule an appointment.

Data FAQs
See meta:Research:FAQ

How to contact us

 * Contact information for team members is available on their user pages (linked above).
 * Group mailing list: product-analytics@wikimedia.org

Data references and reports

 * Comparison datasets
 * Data Dictionary (documents data sources, such as those available in Superset and Turnilo)
 * Data Glossary (definitions for core metrics)
 * A/B Testing
 * Data Products (various deliverables such as reports, analyses, and datasets)
 * Movement metrics
 * Our repository of scheduled jobs
 * ETL repository (e.g. Oozie workflows)
 * Experiment Platform draft

Guidelines and best practices

 * Data access guidelines
 * Query style guide
 * Reporting Guidelines
 * Dashboarding Guidelines
 * Tips and Tricks
 * Querying JSON-containing data (notes from Mikhail Popov on how to query JSON data with Presto)
 * Analysis gotchas (notes from Isaac Johnson on common gotchas when analyzing the MediaWiki landscape)
 * Logistic regression, multilevel models, and t-tests (a simulation study inspired by experiments in improving the Wikipedia editing experience, demonstrating multiple methodologies for analyzing data)
 * Simulation study of statistical methods for comparing groups (examples and informal evaluations of various statistical significance tests for comparing observations generated from different distributions and families)
 * Using log transformations in linear regression models (notes from Mikhail Popov)
 * Caching in R (notes & best practices from Mikhail Popov)

Documentation for tools we use

 * Phabricator (managing requests and tracking work)
 * Superset (WMF internal dashboards and reports)
 * Obtaining access to Superset/Turnilo, with explanation of LDAP/Developer Account terminology
 * Turnilo (WMF internal tool for pivoting and exploring data)
 * Event Platform (various event stream distribution and processing systems we employ at WMF)
 * Matomo/Piwik (JavaScript tracking client used for wikimediafoundation.org and other smaller-scale sites)
 * Google Search Console access

Team references

 * Product Analytics Mission and Values
 * Product Analytics Team norms
 * Working with Product Analytics
 * Chore Wheel
 * Phabricator board
 * Onboarding notes for new team members
 * Offboarding
 * Fun