Discovery/Status updates/2019-08-26

This is the weekly update for the week starting 2019-08-26.

Search

 * An older bug caused unpredictable behavior depending on the order of Special:Search parameters. We had worked on it previously; this week David added a new patch that introduces morelikethis, a non-greedy version of the morelike keyword, and it was deployed on the train
 * David and Tgr fixed the Vagrant wikibase cirrus role, which was not working, and updated Cirrus to index P1 and P2 as statements
 * Cloudelastic JVMs were suffering from erratic GC behavior that slowed down the whole cluster and therefore slowed consumption of the production MW JobQueues; Mathew and Gehel added alerts for this
 * David discovered that create_timestamp was not present on production index mappings for some wikis and fixed it
 * Several folks worked on an issue where the elasticsearch systemd unit sets PrivateTmp=true, which prevents jstack, jmap, etc. from connecting to the JVM
 * A review of the logs turned up Elasticsearch OOM errors in MW Vagrant; these were fixed by increasing Xmx to 512m
 * Tgr found a bug where CirrusSearch on Vagrant throws "mapper_parsing_exception: analyzer [aa_plain] not found for field [plain]" on provision, and David fixed it with a patch to always enable WBCS
 * We needed to normalize deepcat inputs, as deepcat was found to be case-sensitive on the first letter of the category name
 * Icinga reported read timeout errors for some checks on the cloudelastic cluster; after some team discussion, we added the option separator for elastic shard size alerts
 * David found an issue where EventBusMonologHandler was producing malformed UTF-8 characters, possibly because they were incorrectly encoded, which caused the send to be aborted; this is now fixed by normalizing the request param name
 * The team wrote several patches adjusting the mjolnir bulk_daemon to import glent swift uploads as desired
 * We found many correctable memory errors (EDAC) on elastic1029 that needed reviewing. The original issue seems to have gone away, but more work from SRE will be needed to get the server working properly (a new ticket will be created)
 * Stas and Igor worked on an error where a ConcurrentModificationException is thrown on a non-grouping query with aggregates in SELECT
 * There was a request to update Blazegraph where a normalized exception was happening with a particular query; Stas and Igor collaborated on it, adding support for uncertainVars in ServiceNode and fixing an NME on a variable bound both by LabelService and by some other clause
 * There was also a bug where HAVING in a named subquery results in a “non-aggregate variable in select expression” error; Igor and Stas collaborated further to fix it
 * More Blazegraph fixes: (1) SELECT * on a query with no variables and a property path results in NotMaterializedException; (2) UnsupportedOperationException on a property path in EXISTS
 * A bug was discovered in the search results page where Commons images weren't showing up anymore (on all wikis other than enwiki); David found the issue and fixed it
 * The Discernatron tool for labeling Wikipedia search results for relevance testing had started returning a '502' error; Erik restarted the container and it's working again
 * David worked on extracting interfaces and base classes from SearchResultSet and SearchResult, so that search engines can control them
 * As part of our support for the Structured Data on Commons work: hascaption (including hascaption:*) currently returns all files that ever had a caption, even if that caption has been removed via reversion or edit. This needs to change so that when the indexing occurs (and data is removed), hascaption/inlabel/incaption reflect those changes
 * David worked on adding a debugging API to dump the explanation of the completion suggester scores
 * David also added support for OR in the hastemplate keyword using | (pipe)
 * The team worked on (and finished) migrating WDQS to the new logging pipeline
 * A bug was filed where subpageof will sometimes display results which are not subpages of the page that we limited the search to (it should indicate that the result matched against a redirect)
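
The deepcat item above (case sensitivity on the first letter of the category name) can be illustrated with a minimal sketch. This is not the actual patch: the function name normalize_category_title is hypothetical, and the sketch assumes the usual MediaWiki title conventions, where underscores are equivalent to spaces and the first letter is capitalized.

```python
def normalize_category_title(raw: str) -> str:
    """Hypothetical helper: normalize a category name before using it in a deepcat query.

    Assumes common MediaWiki title conventions (not taken from the real patch):
    underscores are treated as spaces, whitespace is collapsed, and the first
    letter is upper-cased while the rest of the title is left unchanged.
    """
    # Treat underscores as spaces and collapse runs of whitespace.
    title = " ".join(raw.replace("_", " ").split())
    if not title:
        return title
    # Upper-case only the first character; the remainder stays case-sensitive.
    return title[0].upper() + title[1:]
```

With this kind of normalization, deepcat:physics and deepcat:Physics would resolve to the same category, while differences in case after the first letter would still be preserved.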

--
 * View all open tickets related to Discovery.
 * Looking to get involved? See tasks marked as Easy or volunteer needed