Wikimedia Discovery

The Discovery department of Wikimedia Engineering is building the anonymous path of discovery to a trusted and relevant source of knowledge. You can find all of our data and key performance indicators on our data portal.

Search
Discovery is responsible for maintaining and enhancing the various Search features and APIs for MediaWiki. This includes the CirrusSearch extension, which relies on Elasticsearch, the search backend used at Wikimedia.

Current (FY 2015-16 Q1) Goals [(link)]:
 * Zero results rate cut in half, from approximately 25% to approximately 12.5%.
 * No decrease in user clickthrough rate from search results.
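As a rough illustration of the first goal, the zero results rate is the fraction of search queries that return no results. The sketch below assumes a plain list of per-query result counts as input; the function names and the sample data are hypothetical and are not taken from the team's actual dashboards.

```python
def zero_results_rate(result_counts):
    """Return the fraction of queries that produced zero results."""
    if not result_counts:
        raise ValueError("need at least one query to compute a rate")
    zero = sum(1 for n in result_counts if n == 0)
    return zero / len(result_counts)


def meets_goal(rate, baseline=0.25):
    """True when the rate is at most half of the baseline (25% -> 12.5%)."""
    return rate <= baseline / 2


# Hypothetical sample: result counts for eight queries, one with no results.
sample = [12, 0, 3, 7, 1, 40, 5, 9]
print(zero_results_rate(sample), meets_goal(zero_results_rate(sample)))
```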

Current work by this team is tracked on this Phabricator [workboard].

Ops support: User:GLavagetto_(WMF)

Maps
Discovery is about finding and navigating to content, and maps are one way for users to do that. To provide better maps, the OpenStreetMap/Production maps cluster project aims to make OpenStreetMap available on all Wikimedia projects, at a scale sufficient for widespread use. The main sub-project page is Maps.

Current (FY 2015-16 Q1) Goals [(link)]:
 * Wikimedia Maps Tile Server is deployed and usable from within our cluster.
 * Define metrics and KPIs for the service.
 * Display metrics and KPIs on the Discovery Department dashboards.

Current work by this team is tracked on this Phabricator [workboard].

Ops support: User:JCrespo_(WMF) and [User:Akosiaris]

Wikidata Query Service (WDQS)
Searching structured data on Wikidata is also part of Discovery, so we are building the Wikidata query service. It provides a SPARQL API through which tools can access Wikidata.
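A minimal sketch of how a tool might prepare a request to a SPARQL API such as this one. The endpoint URL and the `format` parameter are assumptions for illustration (the service may be exposed at a different address), and the example query simply asks for a few items that are instances of a given class. The code only builds the request URL and does not contact any server.

```python
from urllib.parse import urlencode

# Assumption: the public SPARQL endpoint; adjust to wherever the service is deployed.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Example query: five items whose "instance of" (P31) value is Q146.
EXAMPLE_QUERY = """
SELECT ?item WHERE {
  ?item wdt:P31 wd:Q146 .
}
LIMIT 5
""".strip()


def build_wdqs_url(query, endpoint=WDQS_ENDPOINT):
    """Return a GET URL carrying the SPARQL query as the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})


print(build_wdqs_url(EXAMPLE_QUERY))
```

A client would then fetch this URL with any HTTP library and read the bindings from the JSON response.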

Current (FY 2015-16 Q1) Goals [(link)]:
 * Wikidata Query Service is deployed and usable from within our cluster.
 * Wikidata Query Service keeps up with the Wikidata update stream.
 * Define metrics and KPIs for the service.
 * Display metrics and KPIs on the Discovery Department dashboards.

Current work by this team is tracked on this Phabricator [workboard].

Ops support: User:GLavagetto_(WMF)

Wikimedia.org portal
Many users discover Wikimedia via [], so Discovery will be looking at how to improve the user experience on that page. Here is an [initial analysis].

APIs
API:Search and discovery lists the search APIs available and in development.
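For orientation, here is a small sketch of a full-text search request against the MediaWiki action API using `list=search`. The choice of English Wikipedia as the target wiki and the helper name `build_search_url` are assumptions for illustration; the code only constructs the URL and makes no network call.

```python
from urllib.parse import urlencode

# Assumption: English Wikipedia is used as the example wiki.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_search_url(term, limit=10, endpoint=API_ENDPOINT):
    """Return an action API URL that runs a full-text search for `term`."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,   # the search string
        "srlimit": limit,   # maximum number of results to return
        "format": "json",
    }
    return endpoint + "?" + urlencode(params)


print(build_search_url("zero results", limit=5))
```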

Members

 * Wes Moran, Vice President, Head of Discovery
 * Tomasz Finc, Director of Discovery Engineering
 * Dan Garry, Lead Product Manager
 * Moiz Syed, Design Manager
 * Oliver Keyes, Data Analyst
 * Yuri Astrakhan, Senior Software Engineer
 * Erik Bernhardson, Software Engineer
 * Stas Malyshev, Senior Performance Engineer
 * Max Semenik, Software Engineer
 * Kevin Smith, Agile Coach
 * David Causse, Software Engineer
 * Trey Jones, Software Engineer
 * Mikhail Popov, Data Scientist

Roles and responsibilities
Roles and responsibilities for team members other than developers can be found here. The short form is:


 * VP: "Strategic Vision"
 * Director: "Managing People and Coordinating w/Engineering"
 * Product Manager: "Product Vision and Story Prioritization"
 * UX Design: "UX Design and Vision, and leading UI engineers"
 * Engineering Team Lead: "Architecture and Code Quality"
 * Engineer:
 * Agile Coach: "Facilitation and Process Improvement"

We are hiring. See https://wikimediafoundation.org/wiki/Work_with_us

Mailing list
Wikimedia-search

Twitter
https://twitter.com/WMF_Discovery

Meetup groups

 * San Francisco
   * Directly relevant
     * Bay Area NLP (Natural Language Processing, not Neuro-Linguistic Programming)
     * San Francisco text
     * Elasticsearch San Francisco
   * Indirectly related (these sorts of Meetup groups attract smart, enthusiastic people who like to spend their free time learning and solving problems)
     * Silicon Valley Java user group
     * San Francisco PHP
     * Bay Area Haskell users group
     * Scala study group
     * SF JavaScript
     * Oakland advanced Scala study group

Upcoming events

 * WikiConference USA
   * Washington DC, October 9-11, 2015
   * http://wikiconferenceusa.org/wiki/2015/Main_Page
   * At least one team member will attend. No presentations planned.

Past events

 * State of the Map US 2015
   * June 6-8 in New York.
   * An annual conference for all OpenStreetMap users. http://stateofthemap.us/
   * Yuri & Max attended
 * OpenAir 2015
   * June 4 in San Francisco.
   * https://openair2015.com/
   * "OpenAir is the premier conference that focuses on creating engineering solutions to the challenges of matching. The brightest minds in the industry will come together to tackle such issues as search and discovery, trust, internationalization, mobile, identity, and infrastructure."
 * Hackathon 2015
   * Lyon, France, May.
 * Wikimania 2015 (July 15-19 in Mexico City)
   * Presentation (video) "Are we failing our users when they search Wikipedia?" by Dan and Moiz
   * http://wikimania2015.wikimedia.org/wiki/Main_Page
 * Smart Data Conference 2015 (August 18-20 in San Jose, CA)
   * http://smartdata2015.dataversity.net/
   * Presentation: https://www.blazegraph.com/whitepapers/TUE_0830_Haase_Peter_Norton_Barry_Vrandecic_Denny_COLOR_8067.pdf

Elasticsearch cluster

 * How Elasticsearch breaks: Part 1, Part 2
 * Notes on unbreaking and optimizing Elasticsearch

Meeting minutes

 * Weekly checkins
   * Discovery/Checkin meeting minutes
 * Retrospectives
   * Team retrospective 2015-05-06
   * Lyon Hackathon
   * Team retrospective 2015-06-16
   * Team retrospective 2015-07-13
   * Team retrospective 2015-08-10
 * Quarterly reviews
   * Q4 2014-15 (2015-07-07)

Deployers
A useful reference for who can deploy code. It's nice to know whom to bug if you need something: