Topic on Extension talk:CirrusSearch

number_format_exception: For input string: "0,7" (solved)

Summary by DCausse (WMF)

Bug in CirrusSearch, tracked as task T189877. The fix will be backported to 1.30 soon; see the thread for a workaround.

StRiANON (talkcontribs)

Hi! I ran into a problem after upgrading my MediaWiki to 1.30.

Search backend error during prefix search for 'search query here' after 3: number_format_exception: For input string: "0,7"

I deleted the indices in Elasticsearch and then recreated them following the instructions, but I still get this error :(

Need help :(

Product        Version
MediaWiki      1.30.0
PHP            7.1.14 (fpm-fcgi)
MariaDB        10.1.31-MariaDB-1~xenial
Elasticsearch  5.3.3

StRiANON (talkcontribs)

Or we sometimes get:

Search backend error during full_text search for 'gdfg' after 2: number_format_exception: For input string: "0,5"

Why does this happen?

DCausse (WMF) (talkcontribs)

Have you made any changes to the CirrusSearch configuration? I wonder if some weights passed to Elasticsearch use a comma instead of a period as the decimal separator. One way to help us determine whether the problem is related to number formatting would be to paste your config (it can be dumped using api.php?action=cirrus-config-dump).

If the error happens for fulltext search, could you also paste the output of the search results page after adding &cirrusDumpQuery to the search URL?

Thanks!

StRiANON (talkcontribs)

You can see it here

I didn't change any CirrusSearch parameters, and for Elasticsearch I only added a few general settings:

script.inline: true
script.stored: true
action.auto_create_index: false

And here is the output for the fulltext error: https://pastebin.com/sT3Vydx7

DCausse (WMF) (talkcontribs)

While your config seems sane, I see a weight with a comma in the fulltext query: "weight": "0,2". This might cause issues on the Elasticsearch side. I suspect a bug in Cirrus or some underlying library that transforms this weight into a string using the system locale. Out of curiosity: is your system's locale set to something that uses a comma as the decimal separator?
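To illustrate the suspicion, here is a minimal sketch of how a float can pick up a decimal comma on PHP 7, where casting a float to a string follows LC_NUMERIC (PHP 8 made the cast locale-independent). The helper name formatWeightForJson is hypothetical and not part of CirrusSearch, and this is not the actual fix from T189877. The locale names are assumptions; if none of them is installed, setlocale() returns false and the cast keeps the period.

```php
<?php
// Hypothetical helper (not CirrusSearch code): format a weight with a period
// as the decimal separator regardless of the active locale.
function formatWeightForJson(float $weight): string {
    // %F is the non-locale-aware float conversion; %f would follow LC_NUMERIC.
    return sprintf('%.2F', $weight);
}

$weight = 0.7;

// Try to switch to a locale that uses a decimal comma. The candidate names are
// assumptions; setlocale() returns false if none of them is installed.
$active = setlocale(LC_ALL, 'de_DE.UTF-8', 'de_DE.utf8', 'ru_RU.UTF-8');

echo 'Active locale: ', ($active === false ? 'unchanged' : $active), "\n";
// On PHP 7 this cast follows LC_NUMERIC, so it may contain a comma.
echo 'String cast:   ', (string)$weight, "\n";
// This line keeps the period in any locale.
echo 'sprintf %F:    ', formatWeightForJson($weight), "\n";
```

If the string cast prints a comma on your server, any weight that gets converted to a string before being sent to Elasticsearch would trigger the same number_format_exception.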

A quick workaround would be to set:

$wgCirrusSearchDefaultNamespaceWeight = 1;
$wgCirrusSearchTalkNamespaceWeight = 1;

So that we stick to non-decimal numbers.

I may have found the culprit in the Cirrus code; I'll follow up there with a fix.

Thanks for your report.

StRiANON (talkcontribs)

I finally tracked down the problem. Thanks for the idea about the locale. The cause was $wgShellLocale, whose behavior changed in 1.30: it now affects LC_ALL instead of only LC_CTYPE as before, so the locale's decimal separator now applies to number formatting in scripts. I removed this parameter and now everything works.
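For anyone hitting the same error, the change in LocalSettings.php might look like the fragment below. This is only a sketch: the commented-out value is a made-up example of a comma-decimal locale, not my real setting, and the explicit C.UTF-8 alternative assumes that locale exists on the server.

```php
// LocalSettings.php (fragment)

// Before: a custom locale with a decimal comma. Since 1.30 this is applied
// to LC_ALL, so it also changes how PHP 7 casts floats to strings.
// $wgShellLocale = 'ru_RU.UTF-8'; // hypothetical example value

// After: either remove the line entirely so the MediaWiki default is used,
// or set a locale with a period as decimal separator (assumes it is installed).
$wgShellLocale = 'C.UTF-8';
```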

StRiANON (talkcontribs)

No, my locale is en_US.UTF-8; I checked with printf and it uses a dot. Unfortunately, the workaround didn't help :( I added these two parameters, then deleted and recreated the indices again, and still got the same error. Then I added more settings for the search weights:

$wgCirrusSearchDefaultNamespaceWeight = 1;
$wgCirrusSearchTalkNamespaceWeight = 1;
$wgCirrusSearchWeights = [
        'title' => 20,
        'redirect' => 15,
        'category' => 8,
        'heading' => 5,
        'opening_text' => 3,
        'text' => 1,
        'auxiliary_text' => 1,
        'file_text' => 1,
];

$wgCirrusSearchPrefixWeights = [
        'title' => 10,
        'redirect' => 1,
        'title_asciifolding' => 7,
        'redirect_asciifolding' => 1,
];

Then I deleted and recreated the indices again, and I still have this problem. Look here o.O