Topic on Extension talk:CirrusSearch

Indexing WIKI after a database restore

Raoufgui (talkcontribs)

Hello

I have two MediaWiki servers that both work fine:

1 - production server

2 - backup server

I will restore the database backup from the production server onto the second server so that it can be brought into service.

The database on the production server is more recent and contains more data.


After each restore of the database and run of the update script:

- Should I rebuild the index from scratch on the second server?

- If not, will the new data (the difference between the two databases) be indexed automatically, or should I run a specific script to index it?

- How can I confirm that all data is indexed on the second server and that searches will return the same results as on the first server?

NB: I'm using the CirrusSearch extension and Elasticsearch.

Thanks

DCausse (WMF) (talkcontribs)

I'm assuming here that all the MediaWiki dependencies (PHP, your database and Elasticsearch) run on the same server. If not, please be careful, especially if your Elasticsearch cluster is shared between your production and backup installations.

If this is the case, when restoring a database backup you should also reindex everything from scratch. Just as restoring the backup overwrites your relational database, Elasticsearch also needs to be reset to match the content of the restored database. This is the easiest and safest solution.
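
A rough sketch of what "reindex everything from scratch" can look like, based on the bootstrap steps described in the CirrusSearch README (UpdateSearchIndexConfig.php followed by two ForceSearchIndex.php passes). The install path /var/www/mediawiki and the helper names are assumptions for illustration, and the flags should be checked against the README of the installed CirrusSearch version before running anything.

```python
import subprocess
from pathlib import Path

# Assumption: MediaWiki install directory on the backup server; adjust as needed.
MEDIAWIKI_ROOT = Path("/var/www/mediawiki")


def reindex_commands(root=MEDIAWIKI_ROOT):
    """Return the maintenance commands for a full reindex, in order."""
    maint = Path(root) / "extensions" / "CirrusSearch" / "maintenance"
    return [
        # Recreate the search indices from the current configuration,
        # discarding whatever was indexed before the restore.
        ["php", str(maint / "UpdateSearchIndexConfig.php"), "--startOver"],
        # First pass: index page content, skipping link information.
        ["php", str(maint / "ForceSearchIndex.php"), "--skipLinks", "--indexOnSkip"],
        # Second pass: add link information without re-parsing the pages.
        ["php", str(maint / "ForceSearchIndex.php"), "--skipParse"],
    ]


def run_reindex(root=MEDIAWIKI_ROOT):
    """Run each step in order; check=True stops at the first failing step."""
    for cmd in reindex_commands(root):
        subprocess.run(cmd, check=True)
```

Calling run_reindex() after the database restore and update script would run the three steps in sequence. On a large wiki the two indexing passes can take a long time.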

There is no way to ensure that the same query will return identical results on two different Elasticsearch servers, because ranking uses statistics that will almost certainly differ even if the documents are the same. What you can do is run some sanity checks, e.g. compare the number of indexed documents on both Elasticsearch servers to make sure they are close.
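
The document-count sanity check could be sketched as follows, using Elasticsearch's _count endpoint. The host URLs, the index name and the 1% tolerance are assumptions for illustration; the actual index names depend on the wiki's configuration.

```python
import json
from urllib.request import urlopen


def count_documents(base_url, index):
    """Ask Elasticsearch's _count API how many documents an index holds."""
    with urlopen(f"{base_url}/{index}/_count") as resp:
        return json.load(resp)["count"]


def counts_are_close(a, b, tolerance=0.01):
    """True if the two counts differ by at most `tolerance` (relative)."""
    if a == b:
        return True
    return abs(a - b) / max(a, b) <= tolerance


def compare_servers(prod_url, backup_url, index, tolerance=0.01):
    """Fetch the count from both servers and report whether they are close."""
    prod = count_documents(prod_url, index)
    backup = count_documents(backup_url, index)
    return prod, backup, counts_are_close(prod, backup, tolerance)


# Hypothetical usage (hosts and index name are placeholders):
# compare_servers("http://prod:9200", "http://backup:9200", "mywiki_content")
```

A small difference is expected when the production wiki keeps receiving edits after the backup was taken, which is why the check uses a tolerance rather than strict equality.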
