Topic on Help talk:CirrusSearch

What is the recipe for properly re-initializing Elastic/CirrusSearch?

WhitWye (talkcontribs)

Somehow I've ended up with CirrusSearch mostly working, but failing entirely to find some terms known to be in the imported wiki. Also, we keep a live backup wiki into which we import the whole of the main wiki nightly, so its search DB should be thoroughly refreshed each night. What is the proper formula to purge and rebuild the search DB? Apologies if this is documented someplace obvious that I've so far missed.

EBernhardson (WMF) (talkcontribs)

CirrusSearch contains a maintenance script called forceSearchIndex.php for this purpose. It can be invoked something like the following. This queues up to 10k indexing jobs, waits for the backlog to drop to ~1k jobs (so the reindex doesn't dominate the job queue and force other jobs to wait for the entire process to complete), then pushes the queue back up to 10k, repeating until the whole wiki has been queued.


php extensions/CirrusSearch/maintenance/forceSearchIndex.php --queue --maxJobs 10000 --pauseForJobs 1000
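
If the goal is to purge the index entirely and rebuild it from scratch (for example after the nightly import into the backup wiki), a minimal sketch is to recreate the index configuration first and then queue every page. This assumes your CirrusSearch version ships updateSearchIndexConfig.php and its --startOver option; check the extension's maintenance/ directory on your install:


# Drop and recreate the search index configuration, then queue every page for indexing.
php extensions/CirrusSearch/maintenance/updateSearchIndexConfig.php --startOver
php extensions/CirrusSearch/maintenance/forceSearchIndex.php --queue --maxJobs 10000 --pauseForJobs 1000


Run from cron after the nightly import, that should leave the backup wiki's index freshly rebuilt; the queued jobs still need a working job runner to actually be processed.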

WhitWye (talkcontribs)

Running that script on an otherwise idle system, after a string of "Queued 100 pages" messages there's a seemingly endless repeat of "[              wikidb] 179 jobs left on the queue." After many minutes of that, htop shows a load between 0.00 and 0.01, so nothing appears to be working the queue (see the job-queue check further down). Is there a prerequisite to running this maintenance script successfully? Running it without the flags, it runs into a parsing error:


MWException from line 348 of /var/www/mediawiki-1.34.0/includes/parser/ParserOutput.php: Bad parser output text. ....


Obviously I should report a bug: https://phabricator.wikimedia.org/T244603
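
For reference, the symptom above (a constant "jobs left on the queue" count with near-zero load) usually means nothing is consuming the job queue, since --queue only enqueues the indexing work. A rough way to check and drain it by hand, using the standard MediaWiki core maintenance scripts (the cirrusSearch* job types should show up in the listing):


# Show pending jobs grouped by type (look for cirrusSearch* entries).
php maintenance/showJobs.php --group

# Run queued jobs in the foreground; normally a job runner or $wgJobRunRate handles this.
php maintenance/runJobs.php --maxjobs 1000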

EBernhardson (WMF) (talkcontribs)

I see in the phab ticket you came up with a temporary solution to the parser failure. With that somewhat resolved, does the reindexing complete?

Legaulph (talkcontribs)

wrong
