Discovery/Status updates/Next

This is the weekly update for the week starting 2018-01-08.


  • The Latvian and Arabic Wikipedias enabled mapframe on their respective wikis (completed by community volunteers) [1] [2]



  • David tuned the Wikidata fulltext search similarity parameters [3]
  • Stas fixed an issue where the IDs option in forceSearchIndex.php was broken [4]
  • Trey finished his initial analysis of phonetic algorithms available in Elasticsearch [5]


  • The portal's stats and translations were updated on January 8, 2018 via our mostly automated process [6] [7]
  • Jan updated "liguru" to "lìgure" on the Wikipedia portal [8]


  • Something goes here

Events and News

  • Something goes here

Other Noteworthy Stuff

  • Something goes here

Did you know?

  • Last October, the President of Kazakhstan announced that the country would switch from the Cyrillic to the Latin alphabet.[9] As a result, over the course of about 100 years, the writing system for Kazakh will have changed from Arabic to Latin to Cyrillic and back to Latin. Several Turkic languages[10] spoken in former Soviet Republics have gone through similar shifts, including Azerbaijani,[11] Turkmen,[12] and Uzbek,[13] with the most recent shift to Latin beginning for some in the 1990s. Thus some speakers of those languages have lived through all three changes to their official writing system.


  • Something goes here

FY 2017-18 Q3 (Jan-Mar) goals

This status was last updated 2018-01-08.


Current Goals (FY 2017-18 Q3)

  • Objective 1: Implement advanced methodologies such as “learning to rank” machine learning techniques and signals to improve search result relevance across language Wikipedias.
    • Create and test advanced parser features
    • Evaluate and build new features for machine learning pipeline (T162279)
    • Begin to build relationships with external information retrieval researchers
    • Category search (keywords for sub-category searching)
  • Objective 2: Improve support for multiple languages by researching and deploying new language analyzers as they make sense to individual language wikis.
    • Continue to investigate morphological libraries for Elasticsearch plugins.
      • Implement Serbian, investigate Slovak
    • Improve search by using fuzzy (phonetic) language matching.
    • Continue general language support.
      • Investigate language analyzer config options
  • Objective 3: Investigate how to expand and scale Wikidata Query Service to improve its ability to power features on-wiki for readers
    • Acquire and productionize six new servers for WDQS (see T178548)
    • Set up individual internal and external service endpoints with enhanced features for expert users
  • Address technical debt:
    • Elasticsearch 5.6/Logstash 5.6/Kibana 5.6 (ELK stack)
    • Maintain APIs
    • Translation extension

Structured Data on Commons

Current Goals (FY 2017-18 Q3)

  • Objective 1: Commons search will be extended via CirrusSearch, Elasticsearch, and Wikidata Query Service to support searching based on structured data elements describing media.
    • Search for file captions, including multilinguality (there will be multilingual file captions, there might be file summaries, and there might be additional related functionality implied by designs when received); also general design for search on FE
  • Objective 2: Advanced search capabilities (e.g., Wikidata Query Service, SPARQL queries) will be updated to support the more specific media search filters and the relationships to the topics they represent
    • Upgrade and re-implement full-text search on Elasticsearch on Wikidata
    • Investigate using MCR with Wikidata