Topic on Extension talk:CirrusSearch

How can I tell whether Elasticsearch and MediaWiki are communicating?

Nicolas senechal (talkcontribs)

Hello,

I am trying to install CirrusSearch, and I have Elasticsearch running as a service on a Windows server. I think it is running; here is its health.json:

{
  "cluster_name" : "elasticsearch",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 13,
  "active_shards" : 13,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
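A health response like the one above can also be checked programmatically. The following is only a minimal sketch: the function name `health_problems` is hypothetical, and it assumes the JSON has already been loaded into a Python dict (for example from a saved health.json). It flags anything other than a green status, a timed-out request, or unassigned shards.

```python
def health_problems(health: dict) -> list:
    """Return human-readable problems found in a cluster health response.

    Hypothetical helper for illustration; it only inspects fields that
    appear in the health output pasted above.
    """
    problems = []
    status = health.get("status")
    if status != "green":
        problems.append(f"cluster status is {status!r}")
    if health.get("timed_out"):
        problems.append("health request timed out")
    unassigned = health.get("unassigned_shards", 0)
    if unassigned > 0:
        problems.append(f"{unassigned} unassigned shards")
    return problems


# Values taken from the health output above.
sample = {"status": "green", "timed_out": False, "unassigned_shards": 0}
print(health_problems(sample))
```

An empty list only says that the cluster itself looks healthy. It does not show that the wiki is querying an index that actually exists on it.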

But when I search on my wiki, I get a search error saying that it is a technical error. So I checked whether CirrusSearch is in use by adding &cirrusDumpQuery to the search URL, and I got this JSON:

{
    "__main__": {
        "description": "full_text search for 'sql'",
        "path": "wikig4_content\/page\/_search",
        "params": {
            "timeout": "20s",
            "search_type": "dfs_query_then_fetch"
        },
        "query": {
            "_source": [
                "namespace",
                "title",
                "namespace_text",
                "wiki",
                "redirect.*",
                "timestamp",
                "text_bytes"
            ],
            "stored_fields": [
                "text.word_count"
            ],
            "query": {
                "bool": {
                    "minimum_should_match": 1,
                    "should": [
                        {
                            "query_string": {
                                "query": "sql",
                                "fields": [
                                    "all.plain^1",
                                    "all^0.5"
                                ],
                                "phrase_slop": 0,
                                "default_operator": "AND",
                                "allow_leading_wildcard": true,
                                "fuzzy_prefix_length": 2,
                                "rewrite": "top_terms_boost_1024"
                            }
                        },
                        {
                            "multi_match": {
                                "fields": [
                                    "all_near_match^2",
                                    "all_near_match.asciifolding^1.5"
                                ],
                                "query": "sql"
                            }
                        }
                    ],
                    "filter": [
                        {
                            "terms": {
                                "namespace": [
                                    0
                                ]
                            }
                        }
                    ]
                }
            },
            "highlight": {
                "pre_tags": [
                    "\ue000"
                ],
                "post_tags": [
                    "\ue001"
                ],
                "fields": {
                    "title": {
                        "type": "fvh",
                        "number_of_fragments": 0,
                        "order": "score",
                        "matched_fields": [
                            "title",
                            "title.plain"
                        ]
                    },
                    "redirect.title": {
                        "type": "fvh",
                        "number_of_fragments": 1,
                        "order": "score",
                        "fragment_size": 10000,
                        "matched_fields": [
                            "redirect.title",
                            "redirect.title.plain"
                        ]
                    },
                    "category": {
                        "type": "fvh",
                        "number_of_fragments": 1,
                        "order": "score",
                        "fragment_size": 10000,
                        "matched_fields": [
                            "category",
                            "category.plain"
                        ]
                    },
                    "heading": {
                        "type": "fvh",
                        "number_of_fragments": 1,
                        "order": "score",
                        "fragment_size": 10000,
                        "matched_fields": [
                            "heading",
                            "heading.plain"
                        ]
                    },
                    "text": {
                        "type": "fvh",
                        "number_of_fragments": 1,
                        "order": "score",
                        "fragment_size": 150,
                        "no_match_size": 150,
                        "matched_fields": [
                            "text",
                            "text.plain"
                        ]
                    },
                    "auxiliary_text": {
                        "type": "fvh",
                        "number_of_fragments": 1,
                        "order": "score",
                        "fragment_size": 150,
                        "matched_fields": [
                            "auxiliary_text",
                            "auxiliary_text.plain"
                        ]
                    },
                    "file_text": {
                        "type": "fvh",
                        "number_of_fragments": 1,
                        "order": "score",
                        "fragment_size": 150,
                        "matched_fields": [
                            "file_text",
                            "file_text.plain"
                        ]
                    }
                },
                "highlight_query": {
                    "query_string": {
                        "query": "sql",
                        "fields": [
                            "title.plain^20",
                            "redirect.title.plain^15",
                            "category.plain^8",
                            "heading.plain^5",
                            "opening_text.plain^3",
                            "text.plain^1",
                            "auxiliary_text.plain^0.5",
                            "title^10",
                            "redirect.title^7.5",
                            "category^4",
                            "heading^2.5",
                            "opening_text^1.5",
                            "text^0.5",
                            "auxiliary_text^0.25"
                        ],
                        "phrase_slop": 1,
                        "default_operator": "AND",
                        "allow_leading_wildcard": true,
                        "fuzzy_prefix_length": 2,
                        "rewrite": "top_terms_boost_1024"
                    }
                }
            },
            "suggest": {
                "text": "sql",
                "suggest": {
                    "phrase": {
                        "field": "suggest",
                        "size": 1,
                        "max_errors": 2,
                        "confidence": 2,
                        "real_word_error_likelihood": 0.95,
                        "direct_generator": [
                            {
                                "field": "suggest",
                                "suggest_mode": "always",
                                "max_term_freq": 0.5,
                                "min_doc_freq": 0,
                                "prefix_length": 2
                            }
                        ],
                        "highlight": {
                            "pre_tag": "\ue000",
                            "post_tag": "\ue001"
                        },
                        "smoothing": {
                            "stupid_backoff": {
                                "discount": 0.4
                            }
                        }
                    }
                }
            },
            "stats": [
                "suggest",
                "full_text",
                "full_text_querystring",
                "simple_bag_of_words"
            ],
            "rescore": [
                {
                    "window_size": 8192,
                    "query": {
                        "query_weight": 1,
                        "rescore_query_weight": 1,
                        "score_mode": "multiply",
                        "rescore_query": {
                            "function_score": {
                                "functions": [
                                    {
                                        "field_value_factor": {
                                            "field": "incoming_links",
                                            "modifier": "log2p",
                                            "missing": 0
                                        }
                                    }
                                ]
                            }
                        }
                    }
                }
            ],
            "size": 21
        },
        "options": {
            "timeout": "20s",
            "search_type": "dfs_query_then_fetch"
        }
    }
}
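The dump above names the index (or alias) the query is addressed to in its "path" entry, and that name is worth comparing with what exists on the Elasticsearch side. Below is a small sketch that pulls the name out of such a dump; `target_index` is a hypothetical helper, and it assumes the dump has the same `__main__` / `path` layout as the one pasted above.

```python
def target_index(dump: dict) -> str:
    """Return the index or alias named in the 'path' of a cirrusDumpQuery dump.

    Hypothetical helper; assumes the "__main__" -> "path" layout shown above,
    where the path looks like "<index>/<type>/_search".
    """
    path = dump["__main__"]["path"]
    # The first path segment is the index or alias name.
    return path.split("/", 1)[0]


# Trimmed copy of the dump above; json.loads would turn "\/" into "/".
dump = {"__main__": {"path": "wikig4_content/page/_search"}}
print(target_index(dump))  # wikig4_content
```

As far as I understand, this dump only shows the query CirrusSearch builds, so by itself it does not demonstrate that Elasticsearch received or answered it.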

Here is a copy of Special:Version:

Product Version
MediaWiki 1.37.1
PHP 8.1.2 (apache2handler)
MariaDB 10.4.22-MariaDB
ICU 70.1
Elasticsearch 6.8.23

It is strange that the wiki shows the Elasticsearch version...

So how can I tell whether Elasticsearch and MediaWiki communicate properly?

Any other ideas are appreciated.

Thank you,

Spas.Z.Spasov (talkcontribs)

Hello. When the search engine is set to CirrusSearch, you will see a red box with a warning message on the search results page if there is a problem with Elasticsearch.

Nicolas senechal (talkcontribs)

Thank you for your quick response. That is what I get, so what do I have to do with Elasticsearch, and how can I check that it works properly? I have no errors in my logs. I compared them with those of another wiki that I use (which works), and the Elasticsearch logs are the same except for this line: [2022-05-02T14:23:04,386][INFO ][o.e.c.r.a.AllocationService] [Y4F2XBY] Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[test_content_first][2], [test_content_first][0], [mw_cirrus_metastore_first][0]] ...]).

Nicolas senechal (talkcontribs)

So, I went to http://localhost:9200/_cat/indices?format=json&pretty and my server is OK. My JSON has 4 entries, and my test wiki has the same ones (and it works), so I don't know what else to do or where to look to find the cause of this...

Here is the result of http://localhost:9200/_cat/indices?format=json&pretty:

[
  {
    "health" : "green",
    "status" : "open",
    "index" : "test_archive_first",
    "uuid" : "jQZYnyGUStWWqDVjfLxpHg",
    "pri" : "4",
    "rep" : "0",
    "docs.count" : "0",
    "docs.deleted" : "0",
    "store.size" : "1kb",
    "pri.store.size" : "1kb"
  },
  {
    "health" : "green",
    "status" : "open",
    "index" : "test_content_first",
    "uuid" : "x9Y9ACxWSg-oBLxvKbzpjw",
    "pri" : "4",
    "rep" : "0",
    "docs.count" : "5",
    "docs.deleted" : "1",
    "store.size" : "44.2kb",
    "pri.store.size" : "44.2kb"
  },
  {
    "health" : "green",
    "status" : "open",
    "index" : "mw_cirrus_metastore_first",
    "uuid" : "rIRWtNZ_T6GxuLrKH6lstw",
    "pri" : "1",
    "rep" : "0",
    "docs.count" : "25",
    "docs.deleted" : "6",
    "store.size" : "15.4kb",
    "pri.store.size" : "15.4kb"
  },
  {
    "health" : "green",
    "status" : "open",
    "index" : "test_general_first",
    "uuid" : "39kWAi7cSnyME6R0BWlhyQ",
    "pri" : "4",
    "rep" : "0",
    "docs.count" : "21",
    "docs.deleted" : "4",
    "store.size" : "192kb",
    "pri.store.size" : "192kb"
  }
]
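One thing that may be worth checking against a listing like this is whether it contains an index for the wiki that is failing, and whether that index holds any documents. The sketch below is only an illustration: the helper names are hypothetical, and it assumes concrete indices are named "<alias>_<suffix>" (as in "test_content_first") and that the listing comes from the same Elasticsearch instance the failing wiki queries.

```python
import re


def indices_for_alias(alias: str, cat_indices: list) -> list:
    """Return the _cat/indices rows named `alias` or `alias` plus one suffix.

    Assumption: a concrete index is called "<alias>_<suffix>",
    e.g. "test_content_first" for the name "test_content".
    """
    pattern = re.compile(re.escape(alias) + r"(_[A-Za-z0-9]+)?$")
    return [row for row in cat_indices if pattern.match(row["index"])]


def document_count(rows: list) -> int:
    """Sum the docs.count values, which _cat/indices reports as strings."""
    return sum(int(row["docs.count"]) for row in rows)


# Index names and document counts copied from the listing above.
listing = [
    {"index": "test_archive_first", "docs.count": "0"},
    {"index": "test_content_first", "docs.count": "5"},
    {"index": "mw_cirrus_metastore_first", "docs.count": "25"},
    {"index": "test_general_first", "docs.count": "21"},
]

for name in ("test_content", "wikig4_content"):
    rows = indices_for_alias(name, listing)
    print(name, [row["index"] for row in rows], document_count(rows))
```

With this listing, "test_content" matches one index holding 5 documents, while "wikig4_content" (the name in the dumped query path earlier in this thread) matches nothing. If the listing really is from the instance the failing wiki uses, that would suggest its indices have not been created yet.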
Nicolas senechal (talkcontribs)

I tested my production MediaWiki settings on my test wiki, and everything is OK. So if it is not the server, not the wiki, and not the communication between the server and the wiki, the only causes I can think of are a server response problem, or the server not indexing the pages from the database. How can I test that? How can I inspect the connection between the database and Elasticsearch? After searching on Google, I did not find any such test for MW.


So I followed UPGRADE, and now I no longer get any error (yeah), but I get no results. I think I need to index, but doesn't the first part of the upgrade already do that?

I got a warning in the second part. I don't know whether it is important, so I moved on.

# php metastore.php --upgrade
PHP Warning:  Undefined array key "REMOTE_ADDR" in D:\WikiG4\xampp\htdocs\WikiG4\LocalSettings.php on line 138
Warning: Undefined array key "REMOTE_ADDR" in D:\WikiG4\xampp\htdocs\WikiG4\LocalSettings.php on line 138
mw_cirrus_metastore is up and running with version 2.0

Here is the part of my LocalSettings.php that triggers the warning:

$wgGroupPermissions['*']['edit'] = false;
$wgGroupPermissions['interface-admin']['gadgets-edit'] = true;//config gadget
$wgGroupPermissions['interface-admin']['gadgets-definition-edit'] = true;//config gadget
if ( $_SERVER['REMOTE_ADDR'] == $serverAdress ) {
  $wgGroupPermissions['*']['read'] = true;
  $wgGroupPermissions['*']['edit'] = true;
  $wgGroupPermissions['*']['writeapi'] = true;
}

Sorry, I forgot why I added that, but I think it was a fix for an error: my wiki is private, so some extensions could run into a bug, and this was the solution...
