Discovery/Status updates/2017-09-18


This is the weekly update for the week starting 2017-09-18.


  • The explore similar language links A/B test has been completed and analyzed. Unfortunately, as the report documents, we recorded only one clickthrough to an article written in a different language (which was displayed in the new language links). We will not be going forward with this feature.
    • However, users who want the language link script added to their logged-in account can follow the instructions here.
    • The full explore similar script (which displays related articles, categories and language links) can also be enabled for logged-in users; see this link for instructions.
  • This latest A/B test (as noted directly above) effectively closes out the additional features that the Discovery Department was exploring for possible addition to the search engine results page (SERP). Additional details can be read here, overall A/B testing details can be found here, and self-guided testing instructions are here.



  • After successfully testing and deploying the machine learning to rank model on English Wikipedia (task T175772), this week we deployed a new test to 18 other wikis that have >1% of traffic (task T175771).
  • For the relevance survey, Erik developed backend infrastructure to support many queries and many results per query (task T174387). The third run of the test was turned off this week (task T175047); analysis will be detailed in task T174106.
  • The Chinese wiki was re-indexed (task T173464), allowing multi-hyphen tokens to be enabled in production (task T172653)
  • The Hebrew language wikis were also re-indexed (task T167058) and the HebMorph plugin was also deployed (task T167057)
  • We updated Vagrant to include new language plugins (Polish, Ukrainian, Chinese and Hebrew) (task T164367)
  • After an exhaustive investigation, we've resolved the recent load spikes on the Elasticsearch cluster in eqiad (task T169498)
  • Jan has nearly finished the first Selenium test re-written from Ruby to Node.js and has learned a lot in the process. This first test will help to pave the way forward for the rest of the tests that will need to be re-written (task T174103)
  • We've completed testing for adding support for interleaved search results (task T150032) and are currently wrapping up the analysis of the test (task T171215)
  • Fixed an issue with mixed versions of the LTR plugin being deployed on elastic1020 (task T175951)
  • Erik created a few bash scripts to send from terbium when reindexing the default namespaces, which were moved from general to content indices (task T176397); this will take effect when we reindex the wikis again (task T147505)
  • The second running of the explore similar A/B test for language links was completed on Thursday (task T175649) and analysis is complete (task T175650); the report is online.


  • Chelsy finalized her work on a (mostly) automated and parameterized report template for the Search Platform team's A/B tests (task T131795)
  • Chelsy also completed an additional breakdown of API usage (internal vs. external) on the metrics dashboard (task T172452)
  • Chelsy also finalized a new method for keeping data that isn't in a dashboard for longer, by adding reports into golden (/srv/published-datasets/discovery) (task T172453)
  • Mikhail created a dashboard to track the prevalence of sister project search results on fulltext search result pages on desktop, broken up by language. For example, it turns out that nearly 80% of fulltext searches show sister projects on enwiki.


  • Jan has been working on updating the Wikipedia portal to adjust the languages used for Chinese translations (task T171647)


  • Gehel freed up some VM space on Horizon by deleting 4 unused maps-team instances (task T175998)
  • The map service has been upgraded to Node.js 6.11 (task T171707)
  • Map traffic has been enabled for active/active service (serving map tiles from both data centers) (task T162362)