Discovery/Status updates/2019-02-18

This is the weekly update for the week starting 2019-02-18.

Search

 * A new Korean language analyzer has been configured for Korean-language wikis; however, it won't be activated until we finish the ongoing upgrade to Elasticsearch 6.
 * SDC wanted to know if we could add an 'inlabel' search keyword. After lots of discussion, it was merged into the new WikibaseCirrusSearch extension, which has yet to be merged into the beta cluster.
 * Erik and the team worked on how to measure mutation latency across the newly split Elasticsearch clusters and decided that the default timeout of 30 seconds was good.
 * Mathew and Gehel worked on testing the spicerack elasticsearch module, with quite a few patches that are linked in the ticket.
 * Gehel worked on setting up CI for search/glent (a Maven project) with the same options that we use for search/extra.
 * A link-breaking typo was found in the automatically generated API documentation for action=query&prop=cirrusbuilddoc, and Erik fixed it by correcting the API docs for cirrusbuilddoc.
 * As we now have different APT components to differentiate the Elasticsearch versions, we needed to create a new component for the new version, and Gehel took care of it.
 * David prepared a Debian package with search plugins compatible with Elasticsearch 5.6.14, which Gehel merged.
 * David also did quite a bit of work to fix and add integration tests for several language analyzers.
 * Erik worked on updating the ttmserver for Elasticsearch 6 and removed Elasticsearch 2.x compatibility.

Did you know?
Grammatical gender often confuses speakers of English and other languages without a similar system. “Why is a bridge feminine in German (Brücke) and masculine in Spanish and French (puente & pont)?” they ask, though usually without links to Wiktionary.

Grammatical gender is really just a system of noun classes where there are two or three classes, and most things classified as male or female end up in different classes. Other languages have noun classes based on whether or not the nouns are animate, whether they are human or animal, their shape, and sometimes just arbitrary groupings; languages can have nearly two dozen noun classes, like some of the Niger–Congo languages!

Now hold on while we veer off on a brief tangent: diminutives are words that convey a smaller, lesser, or more intimate sense of their root form. They are common in American nicknames, often showing up as a -y or -ie ending (Billy vs. Bill, Peggy vs. Peg, Bobbie vs. Roberta). Sometimes diminutives, especially when applied to small cute things, can become the main or only form of a word, as with English baby from babe, or kitty from kit.

Diminutives and grammatical gender collide in German Mädchen (“girl”), which is historically from Magd (cognate with English “maid”) plus the diminutive suffix -chen; all diminutives formed with -chen have neuter gender in German. Over time, Mädchen became the predominant term for a girl, despite the fact that the word is grammatically “neuter”.

--
 * View all open tickets related to Discovery.
 * Looking to get involved? See tasks marked as Easy or volunteer needed