Topic on Extension talk:Lucene-search

Help - port 8123 refusing all connections after Linux update & reboot

Maiden taiwan

After a "yum update" on our Linux server, and a reboot, Lucene is no longer listening on port 8123. We have not changed any Lucene config files.

$ telnet localhost 8123
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
telnet: Unable to connect to remote host: Connection refused

lsearchd is running and is listening on port 8321 for incremental reindexes. Java is running as well. When I start lsearchd manually, it says:

sudo  /usr/local/bin/lucene-run
RMI registry started.
Trying config file at path /root/.lsearch.conf
Trying config file at path /usr/local/lucene-search-2.1.3/lsearch.conf
0    [main] INFO  org.wikimedia.lsearch.util.Localization  - Reading localization for En
727  [main] INFO  org.wikimedia.lsearch.interoperability.RMIServer  - RMIMessenger bound
730  [Thread-1] INFO  org.wikimedia.lsearch.frontend.HTTPIndexServer  - Indexer started on port 8321

It definitely does NOT print the usual message about port 8123:

771  [Thread-2] INFO  org.wikimedia.lsearch.frontend.SearchServer  - Searcher started on port 8123

Any tips? Where do I start looking? This is a critical site for our business with thousands of users daily. Thanks.

--Maiden taiwan 03:54, 12 December 2011 (UTC)
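A rough way to double-check which ports actually have a listener on the Lucene host, independent of telnet, is to filter the socket listing. This is only a sketch: it assumes `ss -ltn` (or `netstat -ltn`) is available on the server, and `port_in_listing` is a hypothetical helper name, not part of lucene-search.

```shell
# Succeeds if the socket listing on stdin contains a local address
# ending in ":<port>". Hypothetical helper, for illustration only.
port_in_listing() {
    # $1 = port number to look for
    awk -v port="$1" '
        { for (i = 1; i <= NF; i++) if ($i ~ (":" port "$")) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example use on the Lucene host (not run here):
#   ss -ltn | port_in_listing 8321 && echo "indexer port has a listener"
#   ss -ltn | port_in_listing 8123 || echo "nothing listening on 8123"
```

If 8321 shows up and 8123 does not, the daemon is up but never started its searcher, which matches the log above.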

Maiden taiwan

I should mention that the "yum update" was NOT for lucene-search, nor for Java. Just for core CentOS Linux packages. Maiden taiwan 04:00, 12 December 2011 (UTC)

Maiden taiwan

Changing the port number in lsearch.conf does not affect the problem. Maiden taiwan 04:07, 12 December 2011 (UTC)

Rainman

Did your hostname change somehow? If there were a conflict, it would print an error message. It looks as though it isn't even trying to start a searcher, possibly because it thinks this is not the right host to start one on.

Maiden taiwan

The hostname is still the same. Maiden taiwan 12:43, 12 December 2011 (UTC)

Rainman

Well, I don't know then. My hunch is that something is wrong with how the hostname is being interpreted. Have you tried calling java with:

-Djava.rmi.server.hostname=<your hostname, not localhost!>

And then use the same hostname in your configuration files?

Maiden taiwan

Thanks for the tip. lsearchd currently runs this line:

java -Djava.rmi.server.codebase=file://$jardir/LuceneSearch.jar \
-Djava.rmi.server.hostname=$HOSTNAME -jar $jardir/LuceneSearch.jar $*

and $HOSTNAME expands to the correct value: I confirmed it in the output of "ps uax". Maiden taiwan 16:01, 12 December 2011 (UTC)

Maiden taiwan

I ran "strace -v" on lsearchd, and its calls to uname({sysname="Linux", nodename="mysystem", ...}) return success (zero), so I believe this shows it is looking up the right hostname.

Rainman

Well, don't know then, sorry.

Maiden taiwan

OK, I downloaded the lucene-search Java source, added some debug output, and recompiled it. Here is more data.

  • On our system, MediaWiki runs on one server (wikihost) and Lucene on another (km105).
  • The problem is that GlobalConfiguration.isSearcher() is returning false.
  • In GlobalConfiguration, the hostAddr and hostName variables are set correctly to 127.0.0.1 and the true hostname (km105). However, in the isSearcher function body, both search.get(hostAddr) and search.get(hostName) are null.
  • On a different working Lucene system in my company, search.get(hostName) is non-null.
  • I see only one line where the "search" hashtable gets set, in function processSearchRoles (search.put(host,hostroles)). I added some debugging, and this is getting called only for the MediaWiki host (wikihost) and not for the search host (km105).

Here is the debug output of lsearchd:

Trying config file at path /home/danb/.lsearch.conf
Trying config file at path /home/danb/src/lsearch.conf
setHost hostAddr = 127.0.0.1
setHost hostName = km105
0    [main] INFO  org.wikimedia.lsearch.util.Localization  - Reading localization for En
755  [main] INFO  org.wikimedia.lsearch.interoperability.RMIServer  - RMIMessenger bound
isIndexer hostAddr = 127.0.0.1
isIndexer hostName = km105
isIndexer index.get hostAddr = null
isIndexer index.get hostName = [*]
isSearcher hostAddr = 127.0.0.1
isSearcher hostName = km105
isSearcher search.get hostAddr = null
isSearcher search.get hostName = null    (NOTE: This seems to be the problem.)
758  [Thread-1] INFO  org.wikimedia.lsearch.frontend.HTTPIndexServer  - Indexer started on port 8321

Here is lsearch-global.conf, which is unchanged since before the problem started:

################################################
# Global search cluster layout configuration
################################################

[Database]
wikidb : (single) (spell,4,2) (language,en)

[Search-Group]
wikihost : *

[Index]
km105 : *

[Index-Path]
<default> : /search

[OAI]
<default> : http://wikihost/w/index.php

[Namespace-Boost]
<default> : (0,2) (1,0.5)

[Namespace-Prefix]
all : <all>
[0] : 0
[1] : 1
[2] : 2
[3] : 3
[4] : 4
[5] : 5
[6] : 6
[7] : 7
[8] : 8
[9] : 9
[10] : 10
[11] : 11
[12] : 12
[13] : 13
[14] : 14
[15] : 15

And here is lsearch.conf:

MWConfig.global=file:///usr/local/lucene-search.2.1.3/lsearch-global.conf
Indexes.path=/usr/local/lucene-search-2.1.3/indexes
Rsync.path=/usr/bin/rsync
...
(the rest of the file is unchanged from the default)

Can you suggest any other debug output I can add to Lucene so it helps find the problem?

Maiden taiwan

I got everything working again. Part of it was my error -- km105 isn't supposed to be a search query service, just an index generator. So the above debug behavior is correct.
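The debug output is consistent with the posted lsearch-global.conf: km105 appears only under [Index], and wikihost is the only host under [Search-Group]. As a rough illustration (this is not the lucene-search Java code, and `hosts_in_section` is a hypothetical helper), the hosts assigned to each role can be listed straight from the file. The sketch assumes section headers match exactly, with no trailing whitespace.

```shell
# Print the host names listed in one section of lsearch-global.conf.
# Hypothetical helper; a sketch of the lookup, not lucene-search's own parser.
hosts_in_section() {
    # $1 = section name without brackets (e.g. Search-Group)
    # $2 = path to lsearch-global.conf
    awk -v section="[$1]" '
        # Any line starting with "[" is treated as a header; track whether it is ours.
        /^\[/ { in_section = ($0 == section); next }
        # Inside our section, take the text before the first ":" as the host.
        in_section && /:/ && !/^[ \t]*#/ {
            host = $0
            sub(/[ \t]*:.*$/, "", host)
            sub(/^[ \t]+/, "", host)
            print host
        }
    ' "$2"
}

# Example use (the path is an assumption, adjust to your install):
#   hosts_in_section Search-Group /usr/local/lucene-search-2.1.3/lsearch-global.conf
#   hosts_in_section Index        /usr/local/lucene-search-2.1.3/lsearch-global.conf
```

For the file posted above, the first call lists wikihost and the second lists km105, so a daemon running on km105 would be expected to start only the indexer on port 8321.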

Cboltz

I just had the same problem after moving some wikis and lucene to a new server.

The solution: lsearch-global.conf contains the hostname. After changing it to the content of $HOSTNAME, it works again. Interestingly, on the old server I had to use the full hostname (hostname -f output) while the new server needs the short hostname (hostname -s output).

This is "just for the records" in case someone hits the same problem ;-)
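For anyone checking the same thing, a minimal sketch for seeing which form of the hostname the config file uses. `conf_has_host` is a hypothetical helper, the config path in the usage comment is an assumption, and the match is a plain string comparison against the text before the first colon on each line.

```shell
# Succeeds if the given name appears as a host key ("name : ...") in the file.
# Hypothetical helper, for illustration only.
conf_has_host() {
    # $1 = hostname to look for, $2 = path to lsearch-global.conf
    awk -v want="$1" -F: '
        /^[ \t]*#/ { next }              # skip comment lines
        NF >= 2 {
            host = $1
            gsub(/[ \t]/, "", host)      # drop whitespace around the key
            if (host == want) found = 1
        }
        END { exit (found ? 0 : 1) }
    ' "$2"
}

# Example use on the Lucene host (not run here; path is an assumption):
#   conf=/usr/local/lucene-search-2.1.3/lsearch-global.conf
#   for name in "$HOSTNAME" "$(hostname -s)" "$(hostname -f)"; do
#       if conf_has_host "$name" "$conf"; then
#           echo "$name: listed"
#       else
#           echo "$name: not listed"
#       fi
#   done
```

If only the short or only the full name is reported as listed, compare that with the content of $HOSTNAME, which is what fixed it in the case described above.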
