Wikimedia Discovery

From MediaWiki.org

The Discovery Department of Wikimedia Engineering has the mission to make the wealth of knowledge and content in the Wikimedia projects easily discoverable. The projects detailed below focus on creating and supporting new forms of discovery.

Projects

Search

Discovery is responsible for maintaining and enhancing the various Search features and APIs for MediaWiki. This includes the CirrusSearch extension, which relies on Elasticsearch, the search backend used at the Wikimedia Foundation to support Wikimedia projects.

Learn more about Search and the current work of the team.

Current work by this team is tracked on this Phabricator workboard.

Search Analytics Dashboard - Public dashboard to monitor and analyze the impact of our efforts.

Current Goals (FY 2016-17 Q3)

Wikipedia.org portal

Many people discover Wikipedia via https://www.wikipedia.org/ (roughly 1.5-2% of our total page views). The Discovery team is looking at how to improve the user experience for these visitors. Here is an initial analysis from the Discovery team.

Learn more about the work around the Wikipedia.org portal project.

Current work by this team is tracked on this Phabricator workboard and a listing of upcoming A/B tests can be found here.

Portal Analytics Dashboard - Public dashboard to monitor and analyze the impact of our efforts.

External Search Traffic - A metric on external search engines that looks very broadly at where our requests are coming from.

Current Goals (FY 2016-17 Q3)

  • No dedicated goals this quarter.

Maps

Discovery is about finding and navigating to content, and one way for users to do that is via maps. To provide better maps the team is working to make OpenStreetMap tiles available on all Wikimedia projects. The technical challenge is doing so at a scale sufficient for their widespread usage.

Learn more about the Maps project.

Work by this team is tracked on this Phabricator workboard

The team's roadmap can be viewed here; it was finalized in Nov 2016 for FY 2016/2017.

Maps Analytics Dashboard - Public dashboard to monitor and analyze the impact of our efforts.

Current Goals (FY 2016-17 Q2)

  • Increase maps and graphs usage on Wikipedia
  • Enable shareable Geoshapes and Tabular data storage on Commons

Wikidata Query Service (WDQS)

Searching structured data on Wikidata is also part of Discovery, so we are building the Wikidata query service. It provides an API through which tools can access Wikidata.
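As a rough illustration of how a tool might talk to such an API, here is a minimal Python sketch. It assumes the public SPARQL endpoint at https://query.wikidata.org/sparql and its `query` and `format` parameters, none of which are spelled out on this page. The helper names `build_wdqs_url` and `extract_rows` and the example query are hypothetical and exist only for this sketch.

```python
import json
from urllib.parse import urlencode

# Assumed public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query: five items that are an instance of (P31) house cat (Q146).
EXAMPLE_QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 5
"""


def build_wdqs_url(sparql):
    """Return a GET URL asking the endpoint to run `sparql` and reply in JSON."""
    params = {"query": sparql.strip(), "format": "json"}
    return WDQS_ENDPOINT + "?" + urlencode(params)


def extract_rows(response_text):
    """Flatten a SPARQL JSON result into a list of {variable: value} dicts."""
    payload = json.loads(response_text)
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in payload["results"]["bindings"]
    ]
```

A tool could fetch `build_wdqs_url(EXAMPLE_QUERY)` with any HTTP client and pass the response body to `extract_rows`. Keeping URL construction and result parsing separate from the network call makes both easy to test offline.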

Learn more about the Wikidata query service.

Current work by this team is tracked on this Phabricator workboard

Weekly deployments of WDQS are documented on wikitech:Deployments.

WDQS Analytics Dashboard - Public dashboard to monitor and analyze the impact of our efforts.

Current Goals (FY 2016-17 Q3)

  • No dedicated goals this quarter.

Analysis

The analysis group within Discovery manages the Discovery Dashboard and analyzes A/B tests and other data.

Learn more about the Discovery analysis team, and find more information on how they do their analysis and its impact (on Meta).

Current work by the analysis team is tracked on this Phabricator workboard

Current Goals (FY 2016-17 Q3)

  • This quarter, instead of having team-specific goals, the analysis team will be supporting the other teams' goals.

APIs

Application Programming Interfaces (APIs) provide developers ways to interact with the MediaWiki software.

API:Search and discovery lists the search APIs available and in development.
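To make this concrete, the following Python sketch builds a full-text search request against the MediaWiki action API and pulls page titles out of the JSON reply. It assumes the standard `action=query&list=search` module with its `srsearch` and `srlimit` parameters, and it uses https://www.mediawiki.org/w/api.php as an example endpoint. The helper names `build_search_url` and `extract_titles` are hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumed example endpoint; other MediaWiki wikis expose api.php in the same way.
API_ENDPOINT = "https://www.mediawiki.org/w/api.php"


def build_search_url(term, limit=5):
    """Return a GET URL for a full-text search via the action API."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,
        "srlimit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


def extract_titles(response_text):
    """Return the page titles from a list=search JSON response."""
    payload = json.loads(response_text)
    # A reply without hits may lack these keys, so fall back to empty values.
    hits = payload.get("query", {}).get("search", [])
    return [hit["title"] for hit in hits]
```

A client would fetch `build_search_url("CirrusSearch")` with any HTTP library and hand the response body to `extract_titles`.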

API Analytics Dashboard - Public dashboard to monitor and analyze the impact of our efforts.

Other

For general questions about the work of the Discovery department, please see the FAQ.

An overview of Discovery's narrative and roadmap for FY 2016/17 (July 2016 - June 2017)

For any questions about the term "Knowledge Engine" please refer to this FAQ.

You can find all of our data and key performance indicators on our data portal.

The team

Below is a list of sub-teams in the Discovery Department. This list was last updated on 20th January 2017.

Each sub-team lists the names and team roles of anyone who spends a significant amount of time on a project, so some names are duplicated across teams. Team roles are not job titles; job titles are listed on the staff and contractors page and may or may not match a person's team role.

These lists are only intended to roughly convey who is working on what; no guarantees are made that the list is accurate to any particular level of detail. If you have questions, please contact Dan Garry.

Search: Backend

Search: Frontend

Wikipedia Portal

Maps

Wikidata Query Service

Analysis

Cross-team support

Communications

See Updates below for Discovery weekly status updates.

Mailing lists

Discovery - A public mailing list about Wikimedia Discovery projects. Examples of topics would include:

  • Announcements, including major upcoming initiatives, completed major releases, quarterly or annual plans, requests for feedback or input
  • Technical discussions and brainstorming regarding our work:
    • Search, Elastic, Cirrus, the Relevance Forge, and other relevant subjects
    • The portal and associated work
    • Our dashboards or related analysis
    • Note that there is a separate list for maps (below)
  • Departmental news, such as changes to team structure, significant changes to team process, or changes in how we use Phabricator or other tools like Gerrit

Maps - Discussion and development coordinating the integration of OpenStreetMap and other free map sources into Wikimedia projects.

IRC channels

#wikimedia-discovery

#wikimedia-interactive - for discussing all Interactive Wikimedia projects: maps, graphs, etc.

Twitter

https://twitter.com/WMF_Discovery

Meetup groups

Conferences, gatherings, and other events

Upcoming events

  • csv,conf,v3 - A community conference for data makers everywhere - May 2-3 2017, Portland, OR
  • Discovery offsite - May 15-17, 2017
  • Hackathon - May 19 - 20, 2017

Past events

Updates

Weekly Discovery status updates

See Discovery/Status updates for the archive of past Discovery updates

This is the weekly update for the week starting 2017-01-02

Highlights

  • Secondary search results are now possible over the API! This is currently used to suggest pages from other language projects that might be relevant for users. Please review the examples for more information. (T142795)
  • Added cool new to the Metrics dashboard that highlight the selected date range
  • Added a full geographic breakdown to the Portal dashboard to show traffic and clickthrough rates on the Wikipedia.org portal across all countries
  • Added the final results of the successful long-term test of adding app and app page links to the wikipedia.org portal page
  • OSM now allows users who have a Wikimedia account to log in to OSM and edit the map data; a new guide was also added [1].
  • Extensions can now define search keywords using the CirrusSearchAddQueryFeatures hook.

Search

  • Secondary search results are now possible over the API! This is currently used to suggest pages from other language projects that might be relevant for users. Please review the examples for more information. (T142795)
  • Fixed a typo in the search preferences page (T154532)
  • Added support for extensions to register keywords with the search system (T152517)
  • Migrated some keywords to Extension:GeoData to test keyword registration support (T152730)

Analysis

Portal

  • Updated the stats on Wikipedia.org to reflect a couple of wikis that recently reached major milestones: Maithili Wikipedia now has over 10,000 articles and Hebrew Wikipedia now has over 200,000 articles.
  • Closed out a long-term test for adding app and app page links to the wikipedia.org portal page

Interactive

Meeting minutes

Wikimedia Discovery/Meetings

Quarterly reviews

Data Analysis

The data access and analysis guidelines used by the Discovery team around data sources, or by other teams around Discovery data sources, are documented on Meta.

Deployers

A useful reference for who can deploy code, so you know whom to ask if you need something:

Person | MediaWiki Deployer | Elasticsearch Deployer | Maps Deployer | Graphoid Deployer | Portals Deployer
Deskana
dcausse YesY
ebernhardsen YesY YesY
jan_drewniak YesY
jgirault YesY
MaxSem YesY YesY
SMalyshev
yurik YesY YesY YesY
gehel YesY
^d YesY YesY

Code

The Discovery team supports the following code:

Repository | Phabricator/Diffusion | GitHub mirror | Active?
CirrusSearch extension | https://phabricator.wikimedia.org/diffusion/ECIR/ | wikimedia/mediawiki-extensions-CirrusSearch |
Elastica extension | https://phabricator.wikimedia.org/diffusion/EELA/ | wikimedia/mediawiki-extensions-Elastica |
GeoData extension | https://phabricator.wikimedia.org/diffusion/EGDA/ | wikimedia/mediawiki-extensions-GeoData |
Wikidata Query Service | https://phabricator.wikimedia.org/diffusion/WDQR/ | wikimedia/wikidata-query-rdf |
Wikidata Query Service GUI | https://phabricator.wikimedia.org/diffusion/WDQG/ | wikimedia/wikidata-query-gui |
WDQS deployment | https://phabricator.wikimedia.org/diffusion/WDQD/ | wikimedia/wikidata-query-deploy |
WDQS GUI deployment | | wikimedia/wikidata-query-gui-deploy |
Wikimedia Portals | https://phabricator.wikimedia.org/diffusion/WPOR/ | wikimedia/portals |
PHP textcat | https://phabricator.wikimedia.org/diffusion/WTEX/ | wikimedia/wikimedia-textcat |
Relevance Forge | | wikimedia/wikimedia-discovery-relevanceForge |
Discernatron | | wikimedia/wikimedia-discovery-discernatron |
Discovery Analytics | https://phabricator.wikimedia.org/diffusion/WDAN/ | wikimedia/wikimedia-discovery-analytics |