Discovery Analysis

The analysis group within Discovery manages the Discovery Dashboards, provides analytics support to other teams when they do A/B tests, and performs ad-hoc analyses of Discovery-domain data for product managers and executive staff.

Current work
The Analysis Team leads the Discovery Department in experimental design, data collection, and data analysis.

Past analyses

 * Analysis: What kind of things are people doing with WDQS? (original ticket completed in Oct 2015)
 * Analysis of JavaScript support on Wikipedia Portal
 * Presentation on Browsers Geography and JavaScript Support on Wikipedia Portal
 * Analysis: Referrers of the 10% of traffic to Wikipedia Portal that is referred by something other than a search engine
 * Analysis: Assessment of Portal update and its impact on search rate post-deployment
 * Analysis of clickthrough rates, section usage, and language preferences of Wikipedia Portal visitors
 * Analysis of language detection via Accept-Language Header A/B test, language switching A/B test, and second language switching A/B test
 * Analysis of first Portal A/B test
 * Analysis of query features and zero results rate using variable importance and random forest classification
 * Analysis of Cirrus Search TextCat A/B test: language detection on English, French, Spanish, Italian, and German Wikipedias
 * Analysis: Who are our WDQS users and where are they from? (original ticket completed in Oct 2016)
 * Analysis: Comparing Google referred demographics - pageviews with query data in referrer header vs without
 * Analysis: From Zero To Hero 2: Electric Boogaloo - or - how does stripping out question marks improve search
 * Analysis: Second BM25 Test Analysis completed
 * Analysis: Quick check on test for adding app links to portal page
 * Analysis: Dashboard update for the Wikipedia portal page by adding in the ability to display statistics by country or region
 * Analysis: Recap of recently completed work on new ReportUpdater and more

For more information about how our data analysis team is working on the Wikipedia Portal project, please refer to Wikipedia Portal experiments. For more information about how the team is assessing whether users are satisfied with the results when searching, please refer to user satisfaction research.

Selected contributions

 * urltools package for elegantly handling and parsing URLs from within R
 * reconstructr package for session reconstruction and analysis in R
 * BCDA R package for Bayesian analysis of 2x2 contingency tables (e.g. clickthroughs & abandonments in A/B tests) and the testr R package for Bayesian analysis of A/B tests
 * R wrappers for Wikimedia Foundation's APIs:
   * pageviews, an R API wrapper for page view data, from entire projects to per-article levels of granularity through the RESTful API
   * WikipediR, an R API wrapper for MediaWiki, optimized for the WMF MediaWiki instances, such as Wikipedia
   * WikidataR, an R API wrapper for Wikidata
   * WikidataQueryServiceR, an R API wrapper for the Wikidata Query Service SPARQL endpoint