Flow/Architecture/Search

Making search work involves three main parts:
 * Manage ES config: getting the Elasticsearch configuration right (e.g. how to interpret datatypes: word stemming, highlighter config, ...) and managing the ES indices (validate, reindex, ...)
 * Index & search Flow data: index Flow data in Elasticsearch and make it searchable
 * Search front-end: how we'll present the search functionality to users

The last part is mostly blocked on finalizing the mockups. Once we're happy with those, we can start building it.

Manage ES config
Patch: https://gerrit.wikimedia.org/r/#/c/161251/

Make CirrusSearch updateOneSearchIndexConfig.php reusable

 * Status: ✅
 * Phabricator: https://phabricator.wikimedia.org/T78786

CirrusSearch has been refactored so that Flow can reuse most of its code. For the list of patches, see the Phabricator task.

Make ES configuration management maintenance script

 * Status: ✅
 * Phabricator: https://phabricator.wikimedia.org/T78787

How to use (steps 1-4 are handled by enabling the 'cirrussearch' role in MediaWiki-Vagrant). We should probably include all of this in MediaWiki-Vagrant, either by default as part of Flow or as an optional role (flow-search?).


 * 1) Install Elasticsearch
 * 2) Install Extension:Elastica
 * 3) Install Extension:CirrusSearch
 * 4) Configure connection to ES (if different from the default 'localhost'):
 * 5) Flow & ES should now be in touch
 * 6) In CLI, run:  : this will prepare the search index. If you are using MediaWiki-Vagrant, you need to use   go to the   folder and run the script within the shell.
 * 7) (You could add any of the many options to that script, if you're looking to try out a particular piece)
 * 8) Should you, for some reason, need to quickly rebuild your index from scratch, kill it with   (adjust the url as needed) and re-run these steps
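Step 8 amounts to sending an HTTP DELETE to the index URL on the Elasticsearch node. The following is a minimal sketch of that request, not the project's own tooling. It assumes Elasticsearch is listening on its default HTTP port 9200 on localhost, and the index name `hypothetical_flow_index` is a placeholder, since the real index name depends on your wiki's configuration.

```python
import urllib.request


def build_delete_index_request(index_name, base_url="http://localhost:9200"):
    """Build (but do not send) an HTTP DELETE request that drops one index.

    base_url is an assumption: the default Elasticsearch HTTP endpoint.
    index_name must be supplied by the caller; this sketch does not know
    the name Flow actually uses.
    """
    url = base_url.rstrip("/") + "/" + index_name
    return urllib.request.Request(url, method="DELETE")


if __name__ == "__main__":
    # "hypothetical_flow_index" is a placeholder, not the real index name.
    req = build_delete_index_request("hypothetical_flow_index")
    print(req.get_method(), req.full_url)
    # Sending the request is destructive, so it is left commented out:
    # urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to check the URL before anything is deleted.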

Figure out how to deploy Flow search

 * Status:
 * Phabricator: https://phabricator.wikimedia.org/T78796

Index & search Flow data
Patch: https://gerrit.wikimedia.org/r/#/c/126996/

Index Flow data in ES

 * Status: ✅
 * Phabricator: https://phabricator.wikimedia.org/T78788

How to use

You should look at, which has more detailed instructions to also properly configure the search index.


 * 1) Do steps from
 * 2) In CLI, run:   (to ensure workflow_last_update_timestamps are correct; may not be needed)
 * 3) In CLI, run:
 * 4) Flow data should be indexing, hopefully

Search indexed Flow data

 * Status: ✅
 * Phabricator: https://phabricator.wikimedia.org/T78789

How to use


 * 1) See below, API endpoint is in place already ;)

Search API endpoint

 * Status:
 * Phabricator: https://phabricator.wikimedia.org/T78791

How to use


 * 1) Do steps from
 * 2) Set
 * 3) Add   to your elasticsearch.yml (we're adding dynamic code to figure out the total amount of matching terms)
 * 4) Do an API call, e.g.:
 * 5) See search results!
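For step 4, a small helper can assemble the query string for an `api.php` call. This is only a sketch: the endpoint URL is an assumption about a local install, and the `action` and `term` values are placeholders rather than the real parameter names of the Flow search endpoint, which are not given here. Only `format=json` is a standard MediaWiki API parameter.

```python
from urllib.parse import urlencode


def build_api_url(api_base, params):
    """Return api_base plus a URL-encoded query string built from params.

    Parameters are sorted by name so the resulting URL is deterministic.
    """
    return api_base + "?" + urlencode(sorted(params.items()))


if __name__ == "__main__":
    url = build_api_url(
        "http://localhost:8080/w/api.php",  # assumed local wiki endpoint
        {
            "action": "PLACEHOLDER_ACTION",  # placeholder, not the real module name
            "term": "PLACEHOLDER_TERM",      # placeholder search parameter
            "format": "json",                # standard MediaWiki API output format
        },
    )
    print(url)
```

Substitute the real module and parameter names from the patch before using the resulting URL.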

Search front-end

 * Status:
 * Phabricator: https://phabricator.wikimedia.org/T78790

For mockups, see Phabricator task.

A patch with a very barebones GUI is linked from the Phabricator task.