Help - port 8123 refusing all connections after Linux update & reboot
After a "yum update" on our Linux server, and a reboot, Lucene is no longer listening on port 8123. We have not changed any Lucene config files.
$ telnet localhost 8123
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
telnet: Unable to connect to remote host: Connection refused
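The same check can be scripted so that both ports are probed in one go. This is only a minimal sketch: the class name PortProbe and the 2-second timeout are made up for illustration, and 8123 and 8321 are simply the ports mentioned in this thread.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // "Connection refused" and connect timeouts both end up here.
            return false;
        }
    }

    public static void main(String[] args) {
        // Searcher port (8123) and indexer port (8321) as reported in this thread.
        for (int port : new int[] {8123, 8321}) {
            System.out.println("localhost:" + port + " listening = "
                    + isListening("localhost", port, 2000));
        }
    }
}
```

A false result only says that nothing accepted the connection; it does not distinguish a firewall from a service that never started.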
lsearchd is running and is listening on port 8321 for incremental reindexes. Java is running as well. When I start lsearchd manually, it says:
sudo /usr/local/bin/lucene-run
RMI registry started.
Trying config file at path /root/.lsearch.conf
Trying config file at path /usr/local/lucene-search-2.1.3/lsearch.conf
0 [main] INFO org.wikimedia.lsearch.util.Localization - Reading localization for En
727 [main] INFO org.wikimedia.lsearch.interoperability.RMIServer - RMIMessenger bound
730 [Thread-1] INFO org.wikimedia.lsearch.frontend.HTTPIndexServer - Indexer started on port 8321
It definitely does NOT print the usual message about port 8123:
771 [Thread-2] INFO org.wikimedia.lsearch.frontend.SearchServer - Searcher started on port 8123
Any tips? Where do I start looking? This is a critical site for our business with thousands of users daily. Thanks.
--Maiden taiwan 03:54, 12 December 2011 (UTC)
I should mention that the "yum update" was NOT for lucene-search, nor for Java. Just for core CentOS Linux packages. Maiden taiwan 04:00, 12 December 2011 (UTC)
Changing the port number in lsearch.conf does not affect the problem. Maiden taiwan 04:07, 12 December 2011 (UTC)
Did your hostname change somehow? If there was a conflict, it would print out an error message. It seems like it doesn't even want to start a searcher because it might think this is not the right host to start it up?
The hostname is still the same. Maiden taiwan 12:43, 12 December 2011 (UTC)
Well don't know then. My hunch is that there is something wrong with how the hostname is understood. Have you tried calling java with:
-Djava.rmi.server.hostname=<your hostname, not localhost!>
And then use the same hostname in your configuration files?
Thanks for the tip. lsearchd currently runs this line:
java -Djava.rmi.server.codebase=file://$jardir/LuceneSearch.jar \
  -Djava.rmi.server.hostname=$HOSTNAME -jar $jardir/LuceneSearch.jar $*
and $HOSTNAME = the correct value: I ran "ps uax" and saw it. Maiden taiwan 16:01, 12 December 2011 (UTC)
I ran an "strace -v" on lsearchd, and its calls to uname({sysname="Linux", nodename="mysystem", ...}) return success (zero), so I believe this shows it's looking up the right hostname.
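Note that uname only shows what the kernel reports; the JVM does its own name resolution, which also depends on /etc/hosts and DNS. As a rough cross-check, and assuming (not confirmed here) that lucene-search gets its host name and address through java.net.InetAddress, a small standalone program can print what Java itself sees. The class name HostIdentity and the describe helper are made up for this sketch.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class HostIdentity {
    /** Formats the name/address pair that the JVM holds for an address. */
    static String describe(InetAddress addr) {
        return "hostName = " + addr.getHostName()
                + ", hostAddr = " + addr.getHostAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        // What the JVM resolves for this machine; this can differ from
        // uname or $HOSTNAME if /etc/hosts or DNS changed during an update.
        System.out.println(describe(InetAddress.getLocalHost()));
    }
}
```

If the printed name or address differs from the host names used in the lucene-search configuration files, that mismatch would be worth following up.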
Well, don't know then, sorry.
OK, I downloaded the lucene-search Java source, added some debug output, and recompiled it. Here is more data.
- On our system, MediaWiki runs on one server (wikihost) and Lucene on another (km105).
- The problem is that GlobalConfiguration.isSearcher() is returning false.
- In GlobalConfiguration, the hostAddr and hostName variables are set correctly to 127.0.0.1 and the true hostname (km105). However, in the isSearcher function body, both search.get(hostAddr) and search.get(hostName) are null.
- On a different working Lucene system in my company, search.get(hostName) is non-null.
- I see only one line where the "search" hashtable gets set, in function processSearchRoles (search.put(host,hostroles)). I added some debugging, and this is getting called only for the MediaWiki host (wikihost) and not for the search host (km105).
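The behaviour described in the points above can be modelled in a few lines. This is a simplified stand-in, not the actual GlobalConfiguration code: the class SearchRoleLookup and the method addSearchHost are hypothetical, and the real isSearcher may combine the two lookups differently.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class SearchRoleLookup {
    // Simplified stand-in for the "search" hashtable: host -> roles.
    private final Map<String, Set<String>> search = new HashMap<>();

    /** Plays the part of the search.put(host, hostroles) call. */
    void addSearchHost(String host, Set<String> roles) {
        search.put(host, roles);
    }

    /** Assumed rule: a host is a searcher if its address or name is a key. */
    boolean isSearcher(String hostAddr, String hostName) {
        return search.get(hostAddr) != null || search.get(hostName) != null;
    }

    public static void main(String[] args) {
        SearchRoleLookup config = new SearchRoleLookup();
        // Only wikihost is ever put into the table, as seen in the debugging.
        config.addSearchHost("wikihost", Collections.singleton("*"));
        System.out.println(config.isSearcher("127.0.0.1", "km105"));    // false
        System.out.println(config.isSearcher("127.0.0.1", "wikihost")); // true
    }
}
```

Under this model, km105 can only become a searcher if some configuration entry causes a put() with either "km105" or "127.0.0.1" as the key.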
Here is the debug output of lsearchd:
Trying config file at path /home/danb/.lsearch.conf
Trying config file at path /home/danb/src/lsearch.conf
setHost hostAddr = 127.0.0.1
setHost hostName = km105
0 [main] INFO org.wikimedia.lsearch.util.Localization - Reading localization for En
755 [main] INFO org.wikimedia.lsearch.interoperability.RMIServer - RMIMessenger bound
isIndexer hostAddr = 127.0.0.1
isIndexer hostName = km105
isIndexer index.get hostAddr = null
isIndexer index.get hostName = [*]
isSearcher hostAddr = 127.0.0.1
isSearcher hostName = km105
isSearcher search.get hostAddr = null
isSearcher search.get hostName = null (NOTE: This seems to be the problem.)
758 [Thread-1] INFO org.wikimedia.lsearch.frontend.HTTPIndexServer - Indexer started on port 8321
Here is lsearch-global.conf, which is unchanged since before the problem started:
################################################
# Global search cluster layout configuration
################################################

[Database]
wikidb : (single) (spell,4,2) (language,en)

[Search-Group]
wikihost : *

[Index]
km105 : *

[Index-Path]
<default> : /search

[OAI]
<default> : http://wikihost/w/index.php

[Namespace-Boost]
<default> : (0,2) (1,0.5)

[Namespace-Prefix]
all : <all>
[0] : 0
[1] : 1
[2] : 2
[3] : 3
[4] : 4
[5] : 5
[6] : 6
[7] : 7
[8] : 8
[9] : 9
[10] : 10
[11] : 11
[12] : 12
[13] : 13
[14] : 14
[15] : 15
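To see at a glance which hosts each section of such a file names, a rough sketch like the following could be used. The parser is hand-rolled for illustration and is not lucene-search's own; the class name SectionHosts and the assumption that entries always use " : " as the separator are mine.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SectionHosts {
    /** Maps each [Section] header to the keys (left of " : ") listed under it. */
    static Map<String, List<String>> parse(String text) {
        Map<String, List<String>> sections = new LinkedHashMap<>();
        List<String> current = null;
        for (String raw : text.split("\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            int sep = line.indexOf(" : ");
            if (sep < 0 && line.startsWith("[") && line.endsWith("]")) {
                // A bracketed line without " : " starts a new section.
                current = new ArrayList<>();
                sections.put(line.substring(1, line.length() - 1), current);
            } else if (sep > 0 && current != null) {
                // Entries such as "[0] : 0" are keys, not section headers.
                current.add(line.substring(0, sep));
            }
        }
        return sections;
    }

    public static void main(String[] args) {
        String conf = String.join("\n",
                "[Search-Group]", "wikihost : *",
                "[Index]", "km105 : *");
        Map<String, List<String>> sections = parse(conf);
        System.out.println("Search-Group: " + sections.get("Search-Group")); // [wikihost]
        System.out.println("Index: " + sections.get("Index"));               // [km105]
    }
}
```

For the file above, this lists wikihost under Search-Group and km105 under Index, which is consistent with the debug output in which index.get(hostName) is non-null and search.get(hostName) is null.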
And here is lsearch.conf:
MWConfig.global=file:///usr/local/lucene-search.2.1.3/lsearch-global.conf
Indexes.path=/usr/local/lucene-search-2.1.3/indexes
Rsync.path=/usr/bin/rsync
... (the rest of the file is unchanged from the default)
Can you suggest any other debug output I can add to Lucene so it helps find the problem?