Extension talk:CirrusSearch/2013

From mediawiki.org

Install instructions for Linux -- now fixed in new README

I am using Sphinx search for all my wikis at the moment, but would like to try out CirrusSearch. Are there any install instructions for Linux (maybe even Ubuntu) yet? SmartK (talk) 07:07, 9 September 2013 (UTC)

Found the README file on git: https://git.wikimedia.org/tree/mediawiki%2Fextensions%2FCirrusSearch.git
But there are some mistakes in commands such as "php maintenance/forceSearchIndex.php": the scripts either do not exist, are named differently, or are in different directories than the manual says! SmartK (talk) 14:00, 9 September 2013 (UTC)
So now CirrusSearch is working for me! Great work Nik and Chad!!! I would still recommend updating the README in Git:
1. There are 2 mistakes here, one in the directory name and one in the file name:
php maintenance/updateSearchConfig.php --> php extensions/CirrusSearch/updateSearchIndexConfig.php
2. Here only the path in the README is wrong:
php maintenance/forceSearchIndex.php --> php extensions/CirrusSearch/forceSearchIndex.php
3. And one big wish from me: could you enable suggestions while typing not only for the first word but also for the following words? In Sphinx this was a great feature! SmartK (talk) 15:07, 9 September 2013 (UTC)
SmartK, thank you for your tips!
Could you explain how to transform the 'http://localhost:9200', 'http://localhost:9201', etc. into 'elasticsearch0', 'elasticsearch1'? I did not understand this part:
$wgCirrusSearchServers = array( 'elasticsearch0', 'elasticsearch1', 'elasticsearch2', 'elasticsearch3' );
When I put $wgCirrusSearchServers = array( 'http://localhost:9200', 'http://localhost:9201' ); I get:
Unexpected non-MediaWiki exception encountered, of type "Elastica\Exception\ClientException"
exception 'Elastica\Exception\ClientException' with message 'No enabled connection' in /.../w/extensions/CirrusSearch/Elastica/lib/Elastica/Client.php:443 Jaider msg 01:00, 11 September 2013 (UTC)
I use
$wgCirrusSearchServers = array( 'localhost' );
without the http and without the port. I also run just one instance of Elasticsearch, as I only have one server running Elasticsearch for now. I would recommend this to start with! SmartK (talk) 14:27, 11 September 2013 (UTC)
It works! Thanks for your advice, SmartK! And great work Nik and Chad! Jaider msg 15:49, 11 September 2013 (UTC)
The new README now shows the correct .php files. Thank you "Chad". SmartK (talk) 15:53, 4 October 2013 (UTC)
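For anyone hitting the same 'No enabled connection' error: the setting that worked in this thread was a bare host name, with no scheme and no port. Below is a minimal Python sketch of that conversion. The helper name to_cirrus_host is made up for illustration and is not part of CirrusSearch or Elastica; the script only prints a candidate LocalSettings.php line.

```python
from urllib.parse import urlparse


def to_cirrus_host(server):
    """Reduce 'http://localhost:9200' or 'localhost:9200' to 'localhost'.

    Hypothetical helper, not part of CirrusSearch.
    """
    # urlparse only recognises the host when a scheme separator is present,
    # so prepend '//' for bare "host" or "host:port" input.
    if "://" not in server:
        server = "//" + server
    return urlparse(server).hostname


# The two values from the question above; both point at the same host.
servers = ["http://localhost:9200", "http://localhost:9201"]
hosts = sorted(set(to_cirrus_host(s) for s in servers))

# Print a candidate line for LocalSettings.php.
quoted = ", ".join("'%s'" % h for h in hosts)
print("$wgCirrusSearchServers = array( " + quoted + " );")
```

In this sketch both entries collapse to a single 'localhost', which matches the one-instance setup recommended above. How to address a second instance on a different port is not covered in this thread.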

updateSearchIndexConfig.php

  • When running the git version from today this happens (worked 2 weeks ago):
php extensions/CirrusSearch/updateSearchIndexConfig.php
  • This is the output:
content index...
        Infering index identifier...first
        Creating index...ok
        Validating analyzers...ok
        Validating mappings...
                Validating mapping for page type...different...corrected
        Validating aliases...
                Validating content alias...alias is free...corrected
                Validating all alias...Unexpected non-MediaWiki exception encountered, of type "Elastica\Exception\ResponseException" SmartK (talk) 08:10, 4 October 2013 (UTC)
Fixed in git master since last week. Thank you guys... SmartK (talk) 12:29, 16 October 2013 (UTC)

Missing buttons on Search page

Could this extension be causing this problem? I have my doubts… dcljr (talk) 03:29, 13 October 2013 (UTC)

It was something else that has now been fixed. So never mind. dcljr (talk) 06:23, 29 October 2013 (UTC)

Multi namespace suggest

I recently installed CirrusSearch and am trying to figure out how to get search suggestions from multiple namespaces. Here's an example using Main and Category: type 'air' and get suggestions for 'airman' and 'air density', but also 'Category:Airplanes'.

I've configured $wgNamespacesToBeSearchedDefault accordingly, but this doesn't seem to affect suggestions the same way it affects search results. Is there anything else that needs to be configured? Chucka (talk) 12:27, 18 November 2013 (UTC)

I've figured this out. The issue lies in the JavaScript call sent to the OpenSearch API. Below are the tests I ran.
Tested hardcoding the namespaces array (e.g. $this->namespaces = array(0,14);) in the __construct function of CirrusSearchSearcher.php. With this in place, I get valid suggestions from both namespaces.
Tested a query via the OpenSearch API: api.php?action=opensearch&search=air&limit=10&namespace=0|14&format=json
  • With the default search enabled, only Main namespace suggestions were returned.
  • With CirrusSearch enabled, both Main and Category suggestions were returned.
Noticed that the JavaScript call to OpenSearch from within the wiki always sends namespace=0. Found the file resources/mediawiki/mediawiki.searchSuggest.js and changed the hardcoded '0' to '0|14' on line 149; now I receive valid suggestions from both namespaces.
Ancient Bugzilla report on this issue: https://bugzilla.wikimedia.org/show_bug.cgi?id=24214 Chucka (talk) 19:49, 18 November 2013 (UTC)
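To repeat the API test above for other namespace combinations, the query string can be built with a few lines of Python. This is only a sketch: the function name opensearch_url is made up for illustration, and "api.php" stands in for the full path to your wiki's API. The parameters are the ones used in the tested query.

```python
from urllib.parse import urlencode


def opensearch_url(api_base, term, namespaces, limit=10):
    """Build an action=opensearch URL for one or more namespace IDs.

    Hypothetical helper for testing; not part of MediaWiki.
    """
    params = {
        "action": "opensearch",
        "search": term,
        "limit": limit,
        # MediaWiki separates multiple values with a pipe, e.g. "0|14".
        "namespace": "|".join(str(n) for n in namespaces),
        "format": "json",
    }
    # safe="|" keeps the pipe readable instead of encoding it as %7C.
    return api_base + "?" + urlencode(params, safe="|")


# Main (0) and Category (14), as in the test above.
print(opensearch_url("api.php", "air", [0, 14]))
```

With Main and Category this prints the same query string as the one tested above, so it can be pasted behind the wiki's base URL to compare the default search against CirrusSearch.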

DOMDocument::loadHTML(): Empty string supplied

When I run php forceSearchIndex.php, this happens:

...
Indexed 50 pages ending at 1523 at 11/second
Warning: DOMDocument::loadHTML(): Empty string supplied as input in /.../w/includes/HtmlFormatter.php on line 79
Indexed 50 pages ending at 1608 at 11/second
...

Is this a bug? Should I worry about it? Jaider msg 13:46, 23 November 2013 (UTC)

Most likely it'll break the index for that page. If you can still reproduce it, please file a bug and I'll have a look. Sorry for taking so long on this. NEverett (WMF) (talk) 17:29, 26 February 2014 (UTC)