Discovery Analysis

The analysis group within Discovery manages the Discovery Dashboards, provides analytics support to other teams when they do A/B tests, and performs ad-hoc analyses of Discovery-domain data for product managers and executive staff.

Current work
The Analysis Team leads the Discovery Department in experimental design, data collection, and data analysis. Because one of our two analysts has left, capacity is reduced: this quarter the team will not set its own goals and will instead allocate its time as required to achieve the goals for Discovery listed in the other subsections.

Past analyses

 * Analysis: What kind of things are people doing with WDQS? (original ticket completed in Oct 2015)
 * Analysis of JavaScript support on Wikipedia Portal
 * Presentation on Browsers Geography and JavaScript Support on Wikipedia Portal
 * Analysis: Referrers of the 10% of traffic to Wikipedia Portal that is referred by something other than a search engine
 * Analysis: Assessment of Portal update and its impact on search rate post-deployment
 * Analysis of clickthrough rates, section usage, and language preferences of Wikipedia Portal visitors
 * Analysis of language detection via Accept-Language Header A/B test, language switching A/B test, and second language switching A/B test
 * Analysis of first Portal A/B test
 * Analysis of query features and zero results rate using variable importance and random forest classification
 * Analysis of Cirrus Search TextCat A/B test: language detection on English, French, Spanish, Italian, and German Wikipedias
 * Analysis: Who are our WDQS users and where are they from? (original ticket completed in Oct 2016)
 * Analysis: Comparing Google referred demographics - pageviews with query data in referrer header vs without

For more information about how our data analysis team is working on the Wikipedia Portal project, please refer to Wikipedia Portal experiments. For more information about how the team is assessing whether users are satisfied with the results of their searches, please refer to user satisfaction research.

Selected contributions

 * urltools package for elegantly handling and parsing URLs from within R
 * BCDA R package for Bayesian analysis of 2x2 contingency tables (e.g. clickthroughs & abandonments in A/B tests) and the testr R package for Bayesian analysis of A/B tests
 * reconstructr package for session reconstruction and analysis in R
 * WikipediR, an R API wrapper for MediaWiki, optimised for the Wikimedia Foundation MediaWiki instances, such as Wikipedia
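
To illustrate the kind of Bayesian analysis of a 2x2 contingency table mentioned above (clickthroughs and abandonments in an A/B test), here is a minimal sketch in Python. It is not the BCDA or testr API and does not reproduce their models. It assumes independent Beta(1, 1) priors on each group's clickthrough rate and estimates, by simulation, the posterior probability that the test group's rate exceeds the control's. The function name `prob_b_beats_a` and all counts are hypothetical.

```python
import random


def prob_b_beats_a(clicks_a, abandons_a, clicks_b, abandons_b,
                   draws=20000, seed=0):
    """Estimate P(rate_B > rate_A) by sampling from each group's posterior.

    Assumption (not taken from BCDA): independent Beta(1, 1) priors, so each
    posterior is Beta(1 + clickthroughs, 1 + abandonments).
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    wins = 0
    for _ in range(draws):
        rate_a = rng.betavariate(1 + clicks_a, 1 + abandons_a)
        rate_b = rng.betavariate(1 + clicks_b, 1 + abandons_b)
        if rate_b > rate_a:
            wins += 1
    return wins / draws


# Hypothetical 2x2 table: rows are groups, columns are
# (clickthroughs, abandonments).
table = {"control": (120, 880), "test": (160, 840)}
p = prob_b_beats_a(*table["control"], *table["test"])
print(f"Estimated P(test clickthrough rate > control): {p:.3f}")
```

A value near 1 would suggest the test group's clickthrough rate is higher, and a value near 0.5 would suggest the data do not distinguish the two groups. A full analysis would typically also report an interval for the difference or ratio of the rates.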