User:Cmcmahon(WMF)/Search bugs draft

From mediawiki.org

The state of search bugs

For a description of search architecture, see https://wikitech.wikimedia.org/wiki/Search

There are four distinct components in Bugzilla for search:

  • "MediaWiki > Search" 116 open
  • "MediaWiki Extensions > Lucene Search" 40 open
  • "MediaWiki Extensions > MWSearch" 8 open
  • "Wikimedia > lucene-search-2" 39 open

There is a certain amount of noise in these open issues:

"MediaWiki > Search"

First, as with any list, there are open issues that have been overtaken by history. For example, https://bugzilla.wikimedia.org/show_bug.cgi?id=33322 should have been closed in late 2011 but never was.


Another example is https://bugzilla.wikimedia.org/show_bug.cgi?id=33020. It seems to have been reproducible a year ago but may no longer be. In any case, this is unlikely to be a bug in Search; it is more likely a bug in a skin, a browser-specific bug, or possibly a configuration problem. There is simply not enough information to know where to assign it other than Search.

  • Other open issues in "MediaWiki > Search" are less trivially noisy. For example, it is hard to know how to treat https://bugzilla.wikimedia.org/show_bug.cgi?id=34255. This issue:
    • addresses MediaWiki in the absence of Lucene, so it affects only third-party wikis, not Wikipedia
    • seems to indicate that the root problem may be with MySQL and not MediaWiki

Other examples include bugs filed in an incorrect category. For example, https://bugzilla.wikimedia.org/show_bug.cgi?id=24414 should probably not be in "MediaWiki > Search" because it is not only tied to Wikipedia-specific search, but even shows different behavior in different Wikipedia skins (it was filed initially for Vector and later moved to Search). This bug has been open for almost three years.

"MediaWiki Extensions > MWSearch"

Of eight open bugs, seven are enhancement requests.

The one non-enhancement bug under MWSearch, https://bugzilla.wikimedia.org/show_bug.cgi?id=43544, addresses an underlying, general problem that causes a subset of a different bug, https://bugzilla.wikimedia.org/show_bug.cgi?id=42423#c17, filed under lucene-search-2.

"MediaWiki Extensions > Lucene Search"

The bugs in this component seem to be the most consistently well categorized. However, many of them may be tied together by common Lucene configuration, so that fixing any one may affect the behavior of others in the list, both defects and enhancements.

"Wikimedia > lucene-search-2"

Again, some bugs, such as https://bugzilla.wikimedia.org/show_bug.cgi?id=43158, were left open and should have been closed.

Valid bugs in this component seem to have a strong overlap with the "Lucene Search" component.

  • Possible management changes for Search bugs:
    • Eliminate the "MWSearch" component: move the seven enhancement requests to other components, and move the one other bug to an appropriate component or comment thread.
    • Merge the "Lucene Search" and "lucene-search-2" components into a single component for ease of use.
    • Pull Lucene-specific bugs from "Search" to leave a clean component of bugs for wikis that do not use Lucene.
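
As a rough illustration of the second proposal only, the sketch below uses the open counts listed at the top of this page to show what merging the two Lucene components would look like numerically. The merged component name and the helper function are hypothetical, and the sketch assumes no duplicates are closed during the merge. It does not model the "MWSearch" or "Search" proposals, because this draft does not say how many bugs would move or where they would go.

```python
# Open-bug counts per Bugzilla component, as listed in this draft.
open_bugs = {
    "MediaWiki > Search": 116,
    "MediaWiki Extensions > Lucene Search": 40,
    "MediaWiki Extensions > MWSearch": 8,
    "Wikimedia > lucene-search-2": 39,
}

# Hypothetical name for the merged component; the draft does not propose one.
MERGED_LUCENE = "Lucene (merged)"


def merge_components(counts, sources, target):
    """Return a new dict with the `sources` components folded into `target`."""
    merged = {name: n for name, n in counts.items() if name not in sources}
    merged[target] = merged.get(target, 0) + sum(counts[name] for name in sources)
    return merged


after_merge = merge_components(
    open_bugs,
    ["MediaWiki Extensions > Lucene Search", "Wikimedia > lucene-search-2"],
    MERGED_LUCENE,
)

total_before = sum(open_bugs.values())
total_after = sum(after_merge.values())

for name, count in sorted(after_merge.items()):
    print(f"{name}: {count} open")
print(f"total open before: {total_before}, after: {total_after}")
```

Merging alone changes only where bugs are filed, not how many are open; any reduction would come from closing stale or duplicate reports during the move.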