User:Smalyshev (WMF)/Suggester

From mediawiki.org

To improve prefix search and bring all of its variants under one roof, we refactored the search engine implementation so that the various prefix searches now share one code path and all go through the SearchEngine class.

The changes are as follows:

SearchEngine gets the following new API functions:

  • public function completionSearch( $search ) - implements prefix completion search. Returns SearchSuggestionSet.
  • public function completionSearchWithVariants( $search ) - implements prefix completion search including variants handling. Returns SearchSuggestionSet.
  • public function defaultPrefixSearch( $search ) - basic prefix search without fuzzy matching or similar features, for use in scenarios such as searching special pages. Returns Title[].

An implementation does not have to implement all three methods differently; they can all share the same code if needed.

The default implementation still supports the PrefixSearchBackend hook, but we plan to deprecate it, and the CirrusSearch implementation no longer uses it. Instead, there is a protected function, completionSearchBackend( $search ), which implementations (including CirrusSearch) should override to provide search results.
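The relationship between the three public methods and the protected completionSearchBackend( $search ) can be sketched roughly as follows. This is a simplified stand-in, not the actual MediaWiki code: SketchSearchEngine, ArrayBackedEngine and the $limit property are hypothetical names, results are plain strings rather than SearchSuggestionSet or Title objects, and variants handling is left out entirely.

```php
<?php
// Simplified stand-in for the base class. It models only how the three
// public methods can share one protected backend method.
abstract class SketchSearchEngine {
	/** @var int Maximum number of results (hypothetical default). */
	protected $limit = 10;

	public function completionSearch( $search ) {
		$search = trim( $search );
		if ( $search === '' ) {
			// Assumption of this sketch: empty input yields no results.
			return [];
		}
		return array_slice( $this->completionSearchBackend( $search ), 0, $this->limit );
	}

	// Variants handling is omitted here, so this simply reuses
	// completionSearch().
	public function completionSearchWithVariants( $search ) {
		return $this->completionSearch( $search );
	}

	// Same code path again; a real engine may differ here.
	public function defaultPrefixSearch( $search ) {
		return $this->completionSearch( $search );
	}

	// The one method a concrete engine has to provide.
	abstract protected function completionSearchBackend( $search );
}

// Hypothetical engine that completes against a fixed in-memory title list.
class ArrayBackedEngine extends SketchSearchEngine {
	private $titles;

	public function __construct( array $titles ) {
		$this->titles = $titles;
	}

	protected function completionSearchBackend( $search ) {
		$matches = [];
		foreach ( $this->titles as $title ) {
			// Case-insensitive prefix comparison.
			if ( stripos( $title, $search ) === 0 ) {
				$matches[] = $title;
			}
		}
		return $matches;
	}
}

$engine = new ArrayBackedEngine( [ 'Main Page', 'Maine', 'Mars', 'Venus' ] );
$results = $engine->completionSearch( 'mai' );
```

In this sketch all three public methods return the same list for the same input, which illustrates the point that an engine can serve every variant from one backend method if it has no reason to treat them differently.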

SearchEngine implementations can make use of services provided by the base SearchEngine class, including:

  • namespace resolution and normalization. The PrefixSearchExtractNamespace hook is still supported for engines wishing to implement namespace lookup not featured in the standard implementation.
  • fetching titles for result sets (the implementing engine does not have to fetch titles from DB for suggestions)
  • result reordering to ensure exact matches are on top
  • basic prefix search implementation using the database
  • Special: namespace search implementation
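As an illustration of one of these services, reordering results so that an exact match comes first might look roughly like the sketch below. The helper name promoteExactMatch is hypothetical and the case-insensitive comparison is an assumption; the real base class works on suggestion sets and titles rather than plain strings.

```php
<?php
/**
 * Move an exact (case-insensitive) match for $search to the front of
 * $results, keeping the relative order of everything else.
 * Hypothetical helper, not the actual SearchEngine code.
 */
function promoteExactMatch( array $results, $search ) {
	foreach ( $results as $index => $title ) {
		if ( strcasecmp( $title, $search ) === 0 ) {
			// Take the exact match out and put it back at the front.
			unset( $results[$index] );
			array_unshift( $results, $title );
			break;
		}
	}
	// Reindex so callers get a plain 0-based list.
	return array_values( $results );
}

$ordered = promoteExactMatch( [ 'Marsupial', 'Mars rover', 'Mars' ], 'mars' );
```

Because this step lives in the base class, a backend can return results in whatever order its index produces and still have the exact title shown first.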

Deprecations

We plan to deprecate the PrefixSearchBackend hook and the TitlePrefixSearch and StringPrefixSearch classes. We will keep those classes around as a basic search fallback implementation and for old extensions, but new code should not use them; it should use the SearchEngine APIs described above instead. MediaWiki code has already been updated to do that. Extensions implementing search engines should also extend SearchEngine and override the APIs above. CirrusSearch is an example of how to do it.

Show me the code

The patches implementing the refactoring are linked from https://phabricator.wikimedia.org/T121430