Flow/Architecture/Search

Making search work involves three major parts:
 * Index & search Flow data: index Flow data in Elasticsearch and make it searchable.
 * Manage ES config: get the Elasticsearch configuration right (e.g. how to interpret datatypes: word stemming, highlighter config, ...) and manage the ES indices (validate, reindex, ...).
 * Search front-end: how we'll present the search functionality to users.

The first part is mostly done already.

The second part is still actively being worked on: a lot of patches to CirrusSearch have been merged, but there is still a lot of work left on the Flow-specific maintenance script (which will use those refactored pieces).

The last part is mostly blocked on finalizing the mockups. Once we're happy with those, we can start building it. For the most part, the front-end is not blocked on the second part, since we can (mostly) use the default ES config on our development machines. The only thing I can think of where it will be blocked on the second part is highlighting.
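To make the highlighting dependency a bit more concrete, here is a minimal sketch of an Elasticsearch search body that asks for highlighted fragments. The `query`/`match` and `highlight`/`fields` keys are standard Elasticsearch request syntax; the field name `content` and the helper `build_highlight_query` are assumptions made up for illustration and are not taken from the Flow patches.

```python
import json


def build_highlight_query(terms, field="content"):
    """Build a search body that matches `terms` and requests highlights.

    `field` is a hypothetical field name; the real Flow mapping may differ.
    """
    return {
        # Full-text match against the (assumed) field.
        "query": {"match": {field: terms}},
        # An empty object means "use the default highlighter settings".
        "highlight": {"fields": {field: {}}},
    }


body = build_highlight_query("search term")
print(json.dumps(body, indent=2))
```

How useful the returned fragments are depends on how the field was analyzed at index time (stemming and so on), which is why this particular feature leans on the config work.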

Index & search Flow data
Patch: https://gerrit.wikimedia.org/r/#/c/126996/

Index Flow data in ES

 * Status: ✅
 * Phabricator: https://phabricator.wikimedia.org/T78788

How to use


 * 1) Install ElasticSearch
 * 2) Install Extension:Elastica (no need for Extension:CirrusSearch yet)
 * 3) Pull this patch:
 * 4) Configure connection to ES:
 * 5) Flow & ES should now be in touch
 * 6) In CLI, run:
 * 7) Flow data should be indexing, hopefully
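One way to check whether anything actually got indexed is Elasticsearch's `_count` endpoint, which reports the number of documents in an index. The sketch below only builds the URL and parses a response. It assumes ES listens on its default port 9200 on localhost, and the index name `flow` is a placeholder, not necessarily the name the patch uses. The sample response is an invented illustration of the response shape, not real output.

```python
import json


def count_url(base_url, index):
    """Return the URL of the _count endpoint for `index`."""
    return f"{base_url.rstrip('/')}/{index}/_count"


def indexed_documents(response_text):
    """Extract the document count from a _count JSON response."""
    return int(json.loads(response_text)["count"])


# Hypothetical index name; substitute whatever your setup created.
print(count_url("http://localhost:9200", "flow"))

# Made-up sample body, only to show the shape of a _count response.
sample = '{"count": 42, "_shards": {"total": 5, "successful": 5, "failed": 0}}'
print(indexed_documents(sample))
```

A count of zero after running the indexing script would suggest that the connection settings or the index name are off.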

Search indexed Flow data

 * Status: ✅
 * Phabricator: https://phabricator.wikimedia.org/T78789

How to use


 * 1) See below, API endpoint is in place already ;)

Search API endpoint

 * Status:
 * Phabricator: https://phabricator.wikimedia.org/T78791

How to use


 * 1) Do above steps to let Flow talk to ES & index data
 * 2) Do an API call, e.g.:
 * 3) See search results!

Note that the search endpoint does not currently use the Flow API's ?page= parameter, which was only introduced after the search code had been written. The parameter is ignored for now; use ?qtitle= or ?qpageid= instead to limit your search to specific pages. This will need to be addressed.
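As a rough sketch of how the page restriction fits into a request, the helper below appends `qtitle` or `qpageid` (the two parameters named above) to a set of base query parameters. Everything else is an assumption: the API URL, the `action`/`submodule`/`qterm` values and the helper name `build_search_url` are placeholders for illustration, so check the patch for the real module and parameter names.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_search_url(api_url, base_params, title=None, page_id=None):
    """Return a search URL, optionally limited to one page.

    Only `qtitle` and `qpageid` come from this page; `base_params`
    holds whatever the search module actually expects.
    """
    params = dict(base_params)
    if title is not None:
        params["qtitle"] = title
    if page_id is not None:
        params["qpageid"] = str(page_id)
    return f"{api_url}?{urlencode(params)}"


# Hypothetical base parameters, not confirmed by the source.
base = {"action": "flow", "submodule": "search", "qterm": "example", "format": "json"}
url = build_search_url("http://localhost/w/api.php", base, title="Talk:Sandbox")
print(url)
print(parse_qs(urlparse(url).query)["qtitle"])
```

Passing `page_id=123` instead of `title` limits the search by page ID rather than by title.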

Manage ES config
Patch: https://gerrit.wikimedia.org/r/#/c/161251/

Make CirrusSearch updateOneSearchIndexConfig.php reusable

 * Status:
 * Phabricator: https://phabricator.wikimedia.org/T78786

There's been a bunch of refactoring in CirrusSearch so that we can reuse most of its code in Flow. For a list of those patches, see the Phabricator task.

Make ES configuration management maintenance script

 * Status:
 * Phabricator: https://phabricator.wikimedia.org/T78787

How to use


 * 1) Complete all the steps under "Index Flow data in ES" above
 * 2) Install Extension:CirrusSearch (and make sure all the refactoring patches are in)
 * 3) In CLI, run:
 * 4) You could add any of the many options to that script, if you're looking to try out a particular piece

NOTE: At the time of writing, this does not work yet; it's still being worked on!

Search front-end

 * Status:
 * Phabricator: https://phabricator.wikimedia.org/T78790

For mockups, see Phabricator task.

There is a patch (unmaintained, but probably easy to rebase) with a very bare-bones GUI; it is linked from the Phabricator task.