Topic on Help talk:CirrusSearch

No results by any means

RodolfoEBDR (talkcontribs)

CirrusSearch 0.2 (0be5deb), Elastica 1.3.0.0 (75e2f58)

MediaWiki 1.31.0

PHP 7.0.30-0+deb9u1 (apache2handler)

MariaDB 10.1.26-MariaDB-0+deb9u1

ICU 57.1

Elasticsearch 5.6.4

Debian Stretch

With that setup, I'm having major trouble with my wiki: searching returns no results. Files are indexed. Adding ?action=cirrusDump to any URL returns the right data in JSON (I think). A query via curl on the command line retrieves the right results (for example curl -X GET "127.0.0.1:9200/_search?q=SOCIAL&pretty"). The relevant section of LocalSettings.php is:

$wgServer = "https://192.168.0.154";

[...]

wfLoadExtension( 'PdfHandler' );

wfLoadExtension('PDFEmbed');

wfLoadExtension( 'Elastica' );

require_once "$IP/extensions/CirrusSearch/CirrusSearch.php";

#$wgDisableSearchUpdate = true;

$wgCirrusSearchServers = ['127.0.0.1'];

$wgSearchType = 'CirrusSearch';

The var/log/daemon log shows nothing unusual. No errors. No strange messages. The search engine ALWAYS returns no results. Could it be something related to stunneling? If so, how can I solve the issue? I'm accessing my wiki from "outside" at https://192.168.0.154 (I'm on a local network and the wiki server is a VM on the same network). However, if I try curl against httpS://127.0.0.1:9200 I get the message curl: (35) error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol . Could that be related? The URLs generated by the search look like this ("Squad" is a word that appears in a wiki page, and my wiki language is set to Spanish):

https://192.168.0.154/index.php?search=Squad&title=Especial:Buscar&go=Ir&searchToken=2m258n64r6folcjwvc8ad54un

Please help! I've tried everything I could find on Google and by searching Help talk, but I'm really frustrated. The main goal of using CirrusSearch + Elastica + Elasticsearch is to search inside the content of PDF files (I'm building a knowledge wiki). Thank you for reading!

EBernhardson (WMF) (talkcontribs)

One suspicious part of your post is:

curl: (35) error:140770FC:SSL routines:SSL23_GET_SERVER_HELLO:unknown protocol

A common way to get this error is to contact an http service via https. If you connect via plain http does this work?

RodolfoEBDR (talkcontribs)

Hi, @EBernhardson (WMF)! That's correct. If I try via plain http it works. I've tried debugging CirrusSearch via the log and it doesn't record any "errors". There are some entries, but I don't think they are generated by the wiki query (based on the timestamps).

2018-07-25 14:23:49 mediawiki mediawiki: Response does not has any data. <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">

<html><head>

<title>404 Not Found</title>

</head><body>

<h1>Not Found</h1>

<p>The requested URL /_msearch was not found on this server.</p>

</body></html>

Is there any way to debug step by step what the wiki, CirrusSearch, Elastica, and Elasticsearch are trying to do after I push the "Search" button?

RodolfoEBDR (talkcontribs)

I've disabled SSL in the apache2 sites-available configuration and also changed $wgServer to start with "http:" (removed the S). The wiki opens fine, but still zero results. Anyone?

RodolfoEBDR (talkcontribs)

OK, I've detected something really weird:

I've installed httpry (which captures HTTP traffic) and ran httpry -i eth0 (eth0 is my network interface). Browsing the wiki produces the expected log entries. When I push the Search button, it captures this:

2018-07-26 14:21:09     192.168.0.22    192.168.0.154   >       GET     192.168.0.154   /load.php?debug=false&lang=es&modules=mediawiki.helplink%2CsectionAnchor%2Cspecial%2Cui%7Cmediawiki.legacy.commonPrint%2Cshared%7Cmediawiki.skinning.interface%7Cmediawiki.special.search.styles%7Cmediawiki.ui.button%2Cinput%7Cmediawiki.widgets.SearchInputWidget.styles%7Cmediawiki.widgets.styles%7Coojs-ui-core.styles%7Coojs-ui.styles.icons-alerts%2Cicons-content%2Cicons-interactions%2Cindicators%2Ctextures%7Cskins.vector.styles&only=styles&skin=vector        HTTP/1.1        -       -

2018-07-26 14:21:09     192.168.0.154   192.168.0.22    <       -       -       -       HTTP/1.1        304     Not Modified

That made me think outside the box a little and see what happens if I run an Elasticsearch query from outside the wiki (Firefox on an external PC that can access the wiki): http://192.168.0.154:9200/_search?q=Vasco&pretty . Funny thing: it returns the right hits (in plain text / JSON), but httpry did not capture ANYTHING. No logs, no record. Then I ran tcpdump port 9200 and confirmed that the query reaches Elasticsearch correctly, and ES returns the right data...

So... I now think the issue is somewhere between the way Elastica (or CirrusSearch?) sends the requests and how Apache (?) receives them.

EBernhardson (WMF) (talkcontribs)

The error message The requested URL /_msearch was not found on this server. is very suspicious. My best guess is that the server running Elasticsearch has not only Elasticsearch on port 9200 but perhaps also an HTTP server on port 80. Not finding _msearch tells me that the HTTP server being contacted is not an Elasticsearch instance.

Port 9200 should be the default, but what if you define your elasticsearch connection more explicitly?

['host' => 'my.host.wherever', 'port' => 9200]
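One way to test the suspicion above is to look at what the contacted host and port return for a plain GET / request. The following is a minimal Python sketch, not part of CirrusSearch or Elastica: the function names looks_like_elasticsearch and probe are made up for illustration, and the check assumes that an Elasticsearch node answers GET / with a JSON object containing a "version" object with a "number" field (which Elasticsearch 5.x does). An Apache 404 page like the one in the log fails this check.

```python
import json
import urllib.error
import urllib.request


def looks_like_elasticsearch(status: int, body: str) -> bool:
    """Return True if a response to GET / resembles an Elasticsearch root document.

    Assumption: Elasticsearch answers GET / with HTTP 200 and a JSON object
    that contains a "version" object with a "number" field.
    """
    if status != 200:
        return False
    try:
        doc = json.loads(body)
    except ValueError:
        # HTML error pages (such as an Apache 404) are not valid JSON.
        return False
    if not isinstance(doc, dict):
        return False
    version = doc.get("version")
    return isinstance(version, dict) and "number" in version


def probe(host: str, port: int = 9200, timeout: float = 5.0) -> bool:
    """Fetch http://host:port/ and report whether the answer looks like Elasticsearch."""
    url = f"http://{host}:{port}/"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", "replace")
            return looks_like_elasticsearch(resp.status, body)
    except urllib.error.HTTPError as err:
        # Non-2xx answers still carry a body worth inspecting.
        body = err.read().decode("utf-8", "replace")
        return looks_like_elasticsearch(err.code, body)
    except OSError:
        # Connection refused, timeout, DNS failure, and similar.
        return False
```

Running probe("127.0.0.1", 9200) and probe("127.0.0.1", 80) on the wiki server would show which of the two ports is actually served by Elasticsearch.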

RodolfoEBDR (talkcontribs)

Thank you, @EBernhardson (WMF). I've changed the $wgCirrusSearchServers param to that option, but nothing happened. I'm starting to think that I'm jinxed. Hahaha

RodolfoEBDR (talkcontribs)

:'(

EBernhardson (WMF) (talkcontribs)

Did you use the following?

$wgCirrusSearchServers = [

['host' => 'my.host.whatever', 'port' => 9200]

];
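To make the difference between the two configuration styles concrete, here is a rough Python sketch of how a server list might be turned into base URLs. It is an illustration only, not the actual CirrusSearch or Elastica code: the function base_urls is hypothetical, and the defaults (port 9200, plain HTTP unless a 'transport' of 'Https' is given) are assumptions about Elastica's behaviour that should be checked against the installed version.

```python
from typing import Dict, List, Union

ServerEntry = Union[str, Dict[str, object]]

DEFAULT_PORT = 9200      # assumption: default port used when none is given
DEFAULT_SCHEME = "http"  # assumption: plain HTTP unless configured otherwise


def base_urls(servers: List[ServerEntry]) -> List[str]:
    """Map each server entry (bare host string or dict) to the base URL it implies."""
    urls = []
    for entry in servers:
        if isinstance(entry, str):
            # A bare string such as '127.0.0.1' is treated as just a host name.
            entry = {"host": entry}
        host = entry["host"]
        port = entry.get("port", DEFAULT_PORT)
        # Assumption: HTTPS is only used when the transport is set explicitly.
        scheme = "https" if entry.get("transport") == "Https" else DEFAULT_SCHEME
        urls.append(f"{scheme}://{host}:{port}")
    return urls


# The string form from the original LocalSettings.php and the explicit form:
print(base_urls(["127.0.0.1"]))
print(base_urls([{"host": "my.host.whatever", "port": 9200}]))
```

Under these assumptions both forms should end up at port 9200 over plain HTTP, so if requests still land on a server that answers 404 for /_msearch, the configuration actually being loaded is worth double-checking.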

You asked earlier if you could step through the code. This is possible with https://xdebug.org/. There are a variety of interfaces and instructions for setting that up. A reasonable spot to set a breakpoint and start stepping from would be ElasticaConnection::getClient()

RodolfoEBDR (talkcontribs)

Hi, @EBernhardson (WMF). Yes, I tried that. Nothing changed. I've put the project on hold for now. I was getting really frustrated and stuck, so I decided to suspend the wiki for a while. Thanks for your help.
