Topic on Extension talk:CirrusSearch

Nrm

I get an error when I run a search with CirrusSearch activated. I've installed the Elastica extension, Elasticsearch 1 (and, before that, 0.90.11 as suggested by Lsilverman), and CirrusSearch from git (and, before that, the distributed version, to try it). Populating Elasticsearch with data works: I can see the data through the _head plugin of Elasticsearch.

The following error appears in elasticsearch.log:

   org.elasticsearch.search.SearchParseException: [pre_mediawiki_content][3]: query[filtered(((title.plain:test^20.0 | heading.plain:test^5.0 | text.plain:test | file_text.plain:test^0.8 | title:test^10.0 | heading:test^2.5 | text:test^0.5 | file_text:test^0.4) title.near_match:test)~1)->cache(namespace: )],from[-1],size[-1]: Parse Failure [Failed to parse source [{"fields":["id","title","namespace","timestamp","text_bytes","text.word_count"],"script_fields":{"redirect":{"script":"_source[ \"redirect\" ]"}},"query":{"filtered":{"query":{"bool":{"minimum_number_should_match":1,"should":[{"query_string":{"query":"test","fields":["title.plain^20","heading.plain^5","text.plain^1","file_text.plain^0.8","title^10","heading^2.5","text^0.5","file_text^0.4"],"auto_generate_phrase_queries":true,"phrase_slop":1,"default_operator":"AND","allow_leading_wildcard":false,"fuzzy_prefix_length":2}},{"query_string":{"query":"test","fields":["title.near_match^40"],"auto_generate_phrase_queries":true,"phrase_slop":1,"default_operator":"AND","allow_leading_wildcard":false,"fuzzy_prefix_length":2}}]}},"filter":{"terms":{"namespace":[0]}}}},"highlight":{"order":"score","pre_tags":["<span 
class=\"searchmatch\">"],"post_tags":[""],"fields":{"title":{"number_of_fragments":0,"type":"fvh","matched_fields":["title","title.plain"]},"text":{"number_of_fragments":1,"fragment_size":100,"type":"fvh","no_match_size":100,"matched_fields":["text","text.plain"]},"file_text":{"number_of_fragments":1,"fragment_size":100,"type":"fvh","matched_fields":["file_text","file_text.plain"]},"redirect.title":{"number_of_fragments":1,"fragment_size":10000,"type":"plain"},"heading":{"number_of_fragments":1,"fragment_size":10000,"type":"plain"},"redirect.title.plain":{"number_of_fragments":1,"fragment_size":10000,"type":"plain"},"heading.plain":{"number_of_fragments":1,"fragment_size":10000,"type":"plain"}}},"suggest":{"text":"test","title":{"phrase":{"field":"title.suggest","size":1,"max_errors":2,"confidence":2,"direct_generator":[{"field":"title.suggest","suggest_mode":"always","max_term_freq":0.5}],"highlight":{"pre_tag":"","post_tag":""}}}},"stats":["suggest","full_text"],"size":20,"rescore":{"window_size":8192,"query":{"rescore_query":{"function_score":{"functions":[{"script_score":{"script":"log10((doc['incoming_links'].isEmpty() ? 0 : doc['incoming_links'].value) + 2)"}}]}},"query_weight":1,"rescore_query_weight":1,"score_mode":"multiply"}}}]]
       at org.elasticsearch.search.SearchService.parseSource(SearchService.java:581)
       at org.elasticsearch.search.SearchService.createContext(SearchService.java:484)
       at org.elasticsearch.search.SearchService.createContext(SearchService.java:469)
       at org.elasticsearch.search.SearchService.createAndPutContext(SearchService.java:462)
       at org.elasticsearch.search.SearchService.executeDfsPhase(SearchService.java:168)
       at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteDfs(SearchServiceTransportAction.java:168)
       at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.sendExecuteFirstPhase(TransportSearchDfsQueryThenFetchAction.java:85)
       at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:216)
       at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction.performFirstPhase(TransportSearchTypeAction.java:203)
       at org.elasticsearch.action.search.type.TransportSearchTypeAction$BaseAsyncAction$2.run(TransportSearchTypeAction.java:186)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
       at java.lang.Thread.run(Thread.java:636)
   Caused by: org.elasticsearch.ElasticSearchIllegalArgumentException: No mapping found for field [title.suggest]
       at org.elasticsearch.search.suggest.phrase.PhraseSuggestParser.parseCandidateGenerator(PhraseSuggestParser.java:286)
       at org.elasticsearch.search.suggest.phrase.PhraseSuggestParser.parse(PhraseSuggestParser.java:96)
       at org.elasticsearch.search.suggest.SuggestParseElement.parseInternal(SuggestParseElement.java:90)
       at org.elasticsearch.search.suggest.SuggestParseElement.parse(SuggestParseElement.java:48)
       at org.elasticsearch.search.SearchService.parseSource(SearchService.java:569)
       ... 12 more

When I remove the ".suggest" part of the request, Elasticsearch accepts it as a valid request:

 "suggest": {
   "text": "test",
   "title": {
     "phrase": {
       "field": "title", // instead of "title.suggest"
       "size": 1,
       "max_errors": 2,
       "confidence": 2,
       "direct_generator": [
         {
           "field": "title", // instead of "title.suggest"
           "suggest_mode": "always",
           "max_term_freq": 0.5
         }
       ],
       "highlight": {
         "pre_tag": "",
         "post_tag": ""
       }
     }
   }
 },

When I remove the ".suggest" part in include/Search.php, line 488:

 self::SUGGESTION_NAME_TITLE => $this->buildSuggestConfig( 'title' )

I get this error:

   [2014-03-11 09:41:03,176][DEBUG][action.search.type       ] [Shen Kuei] [993] Failed to execute fetch phase
   org.elasticsearch.ElasticSearchIllegalArgumentException: the field [title] should be indexed with term vector with position offsets to be used with fast vector highlighter
       at org.elasticsearch.search.highlight.FastVectorHighlighter.highlight(FastVectorHighlighter.java:68)
       at org.elasticsearch.search.highlight.HighlightPhase.hitExecute(HighlightPhase.java:117)
       at org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:197)
       at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:434)
       at org.elasticsearch.search.action.SearchServiceTransportAction.sendExecuteFetch(SearchServiceTransportAction.java:406)
       at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction.executeFetch(TransportSearchDfsQueryThenFetchAction.java:249)
       at org.elasticsearch.action.search.type.TransportSearchDfsQueryThenFetchAction$AsyncAction$5.run(TransportSearchDfsQueryThenFetchAction.java:233)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
       at java.lang.Thread.run(Thread.java:636)
NEverett (WMF)

I believe we solved this on IRC, but we didn't track down the exact cause. Starting over with the instructions in the README seemed to do the trick. I *think* this was caused by skipping the php $MW_INSTALL_PATH/extensions/CirrusSearch/maintenance/updateSearchIndexConfig.php step, but I can't be sure now that the problem is gone.

Nrm

To follow up: I did follow the README instructions, without success. With Elasticsearch 1.0 I first hit the error described in "Can't create index" on this page. I then tried Elasticsearch 0.90.11, also without success, and stayed on that version. After many reinstallations I did not rerun updateSearchIndexConfig.php, because the forceSearchIndex script seemed to recreate the index automatically. That was a huge mistake on my part.

So I believe the way to get everything working is to install Elasticsearch 1.0, make sure Elastica and CirrusSearch are on REL1_22 if core is on REL1_22, and run the scripts described in the README.

NEverett (WMF)

Cool. I'm glad you figured it out. The autocreate indexes feature in Elasticsearch is useful, but not so much here. They recommend turning it off if you don't like it, but since that isn't the default, you end up with things like this.
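
For anyone hitting the same "No mapping found for field [title.suggest]" error, one way to check the diagnosis above is to look at the index mapping and see whether the `title.suggest` sub-field exists at all. Below is a minimal Python sketch, assuming you have already fetched the "properties" part of the mapping as a dict. The helper name `has_field` and both sample mappings are made up for illustration; the mapping CirrusSearch really creates has more fields and analyzer settings than shown here.

```python
def has_field(properties, dotted_path):
    """Return True if dotted_path (e.g. 'title.suggest') exists in a
    mapping's 'properties' dict, False otherwise."""
    name, _, rest = dotted_path.partition(".")
    node = properties.get(name)
    if node is None:
        return False
    if not rest:
        return True
    # Sub-fields can sit under "fields" (multi-fields) or under
    # "properties" (object fields), so look in both places.
    children = {}
    children.update(node.get("properties", {}))
    children.update(node.get("fields", {}))
    return has_field(children, rest)


# Hypothetical excerpt of a mapping where the sub-fields were configured.
configured = {
    "title": {
        "type": "string",
        "fields": {
            "plain": {"type": "string"},
            "suggest": {"type": "string"},
        },
    }
}

# Hypothetical excerpt of a mapping that was guessed from indexed documents,
# so 'title' is a plain string with no sub-fields.
auto_created = {"title": {"type": "string"}}

print(has_field(configured, "title.suggest"))    # True
print(has_field(auto_created, "title.suggest"))  # False
```

If the check comes back False on your own mapping, that would be consistent with the index having been auto-created during indexing rather than set up by updateSearchIndexConfig.php, which matches what was described in this thread.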
