Flow/Architecture/Search

Making search work involves three big parts:
 * Index & search Flow data: index Flow data in Elasticsearch and make it searchable.
 * Manage ES config: get the Elasticsearch configuration right (e.g. how to interpret datatypes: word stemming, highlighter config, ...) and manage the ES indices (validate, reindex, ...).
 * Search front-end: how we'll present the search functionality to users.

The first part is mostly done already.

The second is still actively being worked on: a lot of patches to CirrusSearch have been merged, but there is still a lot of work left on the Flow-specific maintenance script (which will use those refactored pieces).

The last is mostly blocked on nailing the mockups. Once we're happy with those, we can start building it. For the most part, this is not blocked on the second part, since we can (mostly) use the default ES config on our development machines. The only thing I can think of where we'll be blocked on the second part is highlighting.
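
To make the kind of configuration in the second part more concrete, here is a minimal sketch in Python of an Elasticsearch index settings body with a stemming analyzer, plus a search request body that asks for highlighting. The field name "content" and the choice of the built-in "english" analyzer are assumptions for illustration only; this is not the configuration Flow or CirrusSearch actually uses.

```python
import json

# Hypothetical index settings, NOT Flow's real config.
# An analyzer named "default" becomes the index-wide default analyzer;
# the built-in "english" analyzer applies stemming and stop-word removal.
INDEX_BODY = {
    "settings": {
        "analysis": {
            "analyzer": {
                "default": {"type": "english"},
            }
        }
    },
}


def build_highlight_query(term, field="content"):
    """Return a search body that matches `term` and requests highlighting.

    The field name is an assumption; use whatever field holds the text.
    """
    return {
        "query": {"match": {field: term}},
        "highlight": {"fields": {field: {}}},
    }


if __name__ == "__main__":
    print(json.dumps(INDEX_BODY, indent=2))
    print(json.dumps(build_highlight_query("keyword"), indent=2))
```

With a default ES config the search query still works; it is the analysis and highlighter settings that the configuration management work is meant to get right.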

Index & search Flow data
Patch: https://gerrit.wikimedia.org/r/#/c/126996/

Index Flow data in ES

 * Status: ✅
 * Phabricator: https://phabricator.wikimedia.org/T78788

How to use

You should look at the "Make ES configuration management maintenance script" section below, which has more detailed instructions that also properly configure the search index. However, should there be a problem with that patch, you might be able to get by with just indexing everything in a default ES:


 * 1) Install Elasticsearch
 * 2) Install Extension:Elastica (no need for Extension:CirrusSearch yet)
 * 3) Pull this patch:
 * 4) Configure connection to ES:
 * 5) Flow & ES should now be in touch
 * 6) In CLI, run:
 * 7) Flow data should be indexing, hopefully
 * 8) Should you, for some reason, need to quickly rebuild your index from scratch, kill it with   (adjust the url as needed) and re-run these steps
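
As a rough illustration of the last step, dropping an index in Elasticsearch is an HTTP DELETE request against the index URL. The sketch below builds such a request in Python. The host and the index name "my_flow_index" are placeholders, not the real values for a Flow setup, so adjust both as needed.

```python
import urllib.request


def build_delete_index_request(es_url, index_name):
    """Build (but do not send) an HTTP DELETE request that drops one ES index."""
    url = "{}/{}".format(es_url.rstrip("/"), index_name)
    return urllib.request.Request(url, method="DELETE")


if __name__ == "__main__":
    # Both values are assumptions: point them at your own ES host and index.
    req = build_delete_index_request("http://localhost:9200", "my_flow_index")
    print(req.get_method(), req.full_url)
    # To actually delete the index, uncomment the next line:
    # urllib.request.urlopen(req)
```

The request is only sent once `urlopen` is called, so the URL can be checked first; deleting an index is not reversible.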

Search indexed Flow data

 * Status: ✅
 * Phabricator: https://phabricator.wikimedia.org/T78789

How to use


 * 1) See below: the API endpoint is already in place ;)

Search API endpoint

 * Status:
 * Phabricator: https://phabricator.wikimedia.org/T78791

How to use


 * 1) Do above steps to let Flow talk to ES & index data
 * 2) Do an API call, e.g.:
 * 3) See search results!

Note that the ?page= parameter is not currently used by this search code in the Flow API: it was only introduced after I had already written the code. The parameter is currently ignored; instead, use ?qtitle= or ?qpageid= to limit your search to specific pages. This will need to be addressed.
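
For illustration, a search URL with the page filters described above could be assembled as in the following Python sketch. Only ?qtitle= and ?qpageid= come from this page; the api.php location, the action and submodule values, and the "qterm" parameter name are assumptions and may not match the real endpoint.

```python
from urllib.parse import urlencode


def build_search_url(api_url, term, title=None, page_id=None):
    """Build a Flow search API URL, optionally limited to one page."""
    params = {
        "action": "flow",       # assumption: Flow API action name
        "submodule": "search",  # assumption: hypothetical submodule name
        "qterm": term,          # assumption: hypothetical search-term parameter
        "format": "json",
    }
    # qtitle / qpageid limit the search to a specific page (see note above).
    if title is not None:
        params["qtitle"] = title
    if page_id is not None:
        params["qpageid"] = page_id
    return api_url + "?" + urlencode(params)


if __name__ == "__main__":
    # The wiki location is a placeholder for a local development install.
    print(build_search_url("http://localhost/w/api.php", "keyword",
                           title="Talk:Sandbox"))
```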

Manage ES config
Patch: https://gerrit.wikimedia.org/r/#/c/161251/

Make CirrusSearch updateOneSearchIndexConfig.php reusable

 * Status: ✅
 * Phabricator: https://phabricator.wikimedia.org/T78786

There's been a bunch of refactoring in CirrusSearch so that we can reuse most of its code in Flow. For a list of those patches, see the Phabricator task.

Make ES configuration management maintenance script

 * Status: ✅
 * Phabricator: https://phabricator.wikimedia.org/T78787

How to use


 * 1) Install Elasticsearch
 * 2) Install Extension:Elastica
 * 3) Install Extension:CirrusSearch (and make sure all the refactoring patches are in)
 * 4) Pull this patch:   (it's dependent on the patch to index/search the data, so that'll be pulled in as well)
 * 5) Configure connection to ES:
 * 6) Flow & ES should now be in touch
 * 7) In CLI, run:  : this will prepare the search index
 * 8) (You could add any of the many options to that script, if you're looking to try out a particular piece)
 * 9) In CLI, run:  : this will index Flow data
 * 10) Flow data should be indexing, hopefully
 * 11) Should you, for some reason, need to quickly rebuild your index from scratch, kill it with   (adjust the url as needed) and re-run these steps
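
One way to check that data is actually being indexed after these steps is to ask Elasticsearch how many documents an index holds, via its _count endpoint. The Python sketch below assumes a local ES and uses the placeholder index name "my_flow_index"; the real index name depends on your setup.

```python
import json
import urllib.request


def count_url(es_url, index_name):
    """Return the URL of the _count endpoint for one index."""
    return "{}/{}/_count".format(es_url.rstrip("/"), index_name)


def parse_count(response_body):
    """Extract the document count from an ES _count response (JSON text)."""
    return json.loads(response_body)["count"]


def fetch_count(es_url, index_name):
    """Query ES and return how many documents the index currently holds."""
    with urllib.request.urlopen(count_url(es_url, index_name)) as resp:
        return parse_count(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Host and index name are assumptions; adjust to your environment.
    print(count_url("http://localhost:9200", "my_flow_index"))
    # Requires a running ES instance:
    # print(fetch_count("http://localhost:9200", "my_flow_index"))
```

A count that grows while the indexing script runs suggests that Flow and ES are talking to each other.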

Figure out how to deploy Flow search

 * Status:
 * Phabricator: https://phabricator.wikimedia.org/T78796

Search front-end

 * Status:
 * Phabricator: https://phabricator.wikimedia.org/T78790

For mockups, see the Phabricator task.

There is an (unmaintained but probably easy to rebase) patch with a very barebones GUI; it's linked from the Phabricator task.