Discovery/Status updates/2018-05-07

From mediawiki.org

This is the weekly update for the week starting 2018-05-07.

Highlights

  • Map internationalization has launched everywhere, and embedded maps (mapframe) are now live on 276 Wikipedias. [1]
  • "Hello, my name is _____" is an in-depth blog post by Trey, published earlier this week, in which he details the irony that searching for names is not always as straightforward as you might think. [2]

Discussions

Search

  • Erik updated a script that was filling the logs with 500 errors. [3]
  • Erik also did a lot of research to evaluate the impact of adding ~2700 new shards to the production cluster (a PDF attached to the last comment in the ticket contains more information). [4] There is also a follow-up ticket for the next steps. [5]
  • Trey worked on the analysis config for the new Slovak stemmer. The config was deployed this week, but the plugin still needs to be deployed and the wikis re-indexed. [6]
  • Stas and others worked on looking up entities by external identifiers. The work is done for now, but it needs a re-index to be fully ready. [7]
  • David worked on externalizing the parsing logic from SimpleKeywordFeature and FullTextQueryStringQueryBuilder and it was pushed into production in April 2018 [8]


Other Noteworthy Stuff

  • Trey's most recent updates to transliteration on the Crimean Tatar Wikipedia are live; after a year of part-time 10% project work, the transliteration infrastructure for Crimean Tatar is done and the accuracy is in the high 90% range. [9]

Did you know?

  • The English word “dove”, as the past tense of “dive”, is one of the rare cases where a conjugation has become more irregular over time. The verb “dive” picked up the strong conjugation [10] by analogy with other strong verbs, particularly “drive/drove”. [11] Going in the more typical direction of regularization, Swedish strong verbs slowly lost some of their distinctive plural forms. [12] The change started in the 16th century, and was still in progress as late as the 1940s. From the search perspective, regular forms are easier to deal with—so, way to go Swedish!
