Topic on Extension talk:CirrusSearch

An error has occurred while searching: We could not complete your search due to a temporary problem.

Clarasiir (talkcontribs)

For the past month or so, CirrusSearch has been randomly failing with the message "An error has occurred while searching: We could not complete your search due to a temporary problem. Please try again later."

Our wiki has used CirrusSearch for a good while now with no issues, but recently our traffic has slowly been increasing, and that is when the trouble with search began. Restarting our VPS would solve the issue, but search would eventually go down again. As wiki traffic gradually increased, so did the frequency of the error, to the point where search would go down daily.

Thinking it might be a memory use issue, I created a custom.options file in elasticsearch/jvm.options.d with the settings:

-Xms3g
-Xmx3g

Nothing changed at first, as I didn't restart Elasticsearch, but the next morning search was down as usual, so I rebooted the VPS to get it working again. This time, that didn't solve the problem. The message "An error has occurred while searching" was still appearing.

I deleted the custom.options file I had created, and rebooted the VPS again. Still this didn't solve the problem.

To avoid not having any search function at all, we're now using the default mediawiki search. But I would much rather have CirrusSearch back again, so does anyone know what I should do to solve this issue and stop search giving nothing but error messages?

DCausse (WMF) (talkcontribs)

Hi,

I would suggest analyzing the Elasticsearch logs to understand whether it is having issues and why. The error you describe could have a wide variety of causes:

  • network issue between mediawiki and elastic
  • health status of your search indexes
  • elasticsearch crashing

https://www.elastic.co/guide/en/elasticsearch/reference/8.13/fix-common-cluster-issues.html might be interesting. Note that this doc is for 8.13 and you might be running an older version, but I suspect that most of the information you will find there still applies to 7.x.
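As a starting point for telling those three causes apart, here is a minimal Python sketch that asks Elasticsearch for its cluster health. It assumes Elasticsearch listens on http://localhost:9200 without authentication (adjust `ES_URL` if your setup differs); the function names `fetch_cluster_health` and `diagnose` are made up for this illustration. `GET /_cluster/health` is a standard Elasticsearch endpoint that reports a `status` of green, yellow or red.

```python
import json
import urllib.error
import urllib.request

# Assumption: Elasticsearch is reachable here without authentication.
ES_URL = "http://localhost:9200"


def fetch_cluster_health(base_url=ES_URL, timeout=5.0):
    """Return the parsed JSON of GET /_cluster/health, or None if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/_cluster/health", timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON answer.
        return None


def diagnose(health):
    """Map a cluster health response onto the broad causes listed above."""
    if health is None:
        return "unreachable: Elasticsearch is down or there is a network issue"
    status = health.get("status")
    unassigned = health.get("unassigned_shards", 0)
    if status == "red":
        return f"red: at least one primary shard is unassigned ({unassigned} unassigned shards)"
    if status == "yellow":
        return f"yellow: primaries are assigned but some replicas are not ({unassigned} unassigned shards)"
    if status == "green":
        return "green: all shards are assigned, look elsewhere for the cause"
    return f"unknown status: {status!r}"
```

Running `print(diagnose(fetch_cluster_health()))` on the server while search is failing should show whether Elasticsearch is answering at all. A yellow status is often normal on a single-node setup, since replicas have nowhere to be placed.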

Please let us know if you have more precise information about the issue you are facing.

Good luck!

Clarasiir (talkcontribs)

Okay, well I checked the elasticsearch.log but it didn't seem to have anything useful. It did have the message "Native controller process has stopped - no new native processes can be started," but there were no other error messages or an explanation as to why search stopped. I'm not even sure if that's an error or that's just when I disabled CirrusSearch because it had already stopped working anyway and was showing the "error has occurred while searching" message.

I have noticed that with CirrusSearch disabled, our server's total memory use is very low. Just enabling CirrusSearch makes it jump to over 65%, and as time passes that number will slowly creep higher to around 80-83% before search then goes down.

To me that sounds similar to the "Circuit breaker errors" in the guide you linked, but there's no error like that in the logs. Our wiki does use an older 7.x version of elasticsearch, so I don't know if error logs work differently for this older version?

DCausse (WMF) (talkcontribs)

Unfortunately, without more details I can only give you very broad guidance. First, I would try to establish whether Elasticsearch dies or not. It could be killed by the JVM itself because of high GC overhead, in which case you would see an error in the logs, or by the system OOM killer, which can be checked in the system logs or dmesg. Circuit breaker errors should also be logged in the Elasticsearch logs, so you would have seen those, but if you believe this is affecting your setup please see: https://www.elastic.co/guide/en/elasticsearch/reference/7.17/fix-common-cluster-issues.html#circuit-breaker-errors
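To check for the OOM killer, something like the following Python sketch could scan kernel log lines for kill messages. The pattern is an assumption based on the usual Linux kernel wording ("Out of memory: Killed process <pid> (<name>)", or "Kill process" on older kernels); the exact text can vary between kernel versions, and `find_oom_kills` is a name made up for this example.

```python
import re

# Assumed kernel wording, e.g.:
#   "Out of memory: Killed process 1234 (java) total-vm:..."
#   "Memory cgroup out of memory: Kill process 1234 (java) score ..."
OOM_RE = re.compile(r"out of memory: kill(?:ed)? process (\d+) \(([^)]+)\)", re.IGNORECASE)


def find_oom_kills(log_lines):
    """Return a list of (pid, process_name) for every OOM kill found in the lines."""
    kills = []
    for line in log_lines:
        match = OOM_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills
```

The lines could come from saving the output of `dmesg` to a file and passing `open(path)` to the function. Reading the kernel log may need root privileges and may not be possible from inside a container. An entry naming `java` around the time search went down would suggest that Elasticsearch was killed by the system rather than stopping on its own.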

Have you followed https://www.elastic.co/guide/en/elasticsearch/reference/7.10/system-config.html when setting up Elasticsearch? If not, I would encourage you to follow this documentation and make sure that your system is properly set up for Elasticsearch to run smoothly.

Hope it helps, good luck!

Clarasiir (talkcontribs)

I'm not sure what further details you want me to provide, but our Elasticsearch certainly does not have a server to itself, and I wouldn't want to change the settings as if it did, only to have that crash our server or something.

We use a shared VPS with 4 cores, and our container has 8 GB of guaranteed RAM. That has been enough for Elasticsearch to function with the default settings up until recently.

DCausse (WMF) (talkcontribs)

As I said earlier, the CirrusSearch error message alone is not precise enough to identify the cause of the issue, and without knowing the cause I can't guide you to a solution. I can only advise you to continue troubleshooting the problem until you understand its cause. Have you tried seeking help on other forums more dedicated to Elasticsearch? You would likely get more precise guidance there on how to troubleshoot an Elasticsearch instance.

Clarasiir (talkcontribs)

I see. I will try an Elasticsearch forum and see if I can get more help figuring things out there. Thank you.
