Extension talk:Lucene-search/archive/2008

From mediawiki.org

2008

Compiling to create lucenesearch.jar failed

I am trying to install the Lucene engine for our wiki, but compiling it fails.

Ant prints a lot of error messages during compilation, such as:

   [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/SearchState.java:331: cannot find symbol
   [javac] symbol  : class Hits
   [javac] location: class org.wikimedia.lsearch.SearchState
   [javac]             Hits hits = searcher.search(new TermQuery(
   [javac]                 ^
   [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/SearchState.java:331: cannot find symbol
   [javac] symbol  : class TermQuery
   [javac] location: class org.wikimedia.lsearch.SearchState
   [javac]             Hits hits = searcher.search(new TermQuery(
   [javac]                                                 ^
   [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/SearchState.java:332: cannot find symbol
   [javac] symbol  : class Term
   [javac] location: class org.wikimedia.lsearch.SearchState
   [javac]                             new Term("key", key)));
   [javac]                                     ^
   [javac] Note: Some input files use unchecked or unsafe operations.
   [javac] Note: Recompile with -Xlint:unchecked for details.
   [javac] 85 errors

Can you help me to solve these error messages or provide a binary?

Many thanks in advance. --Phaidros 12 January 2008

Are you compiling with a Sun Java 1.5+ compiler? If so, can you provide the beginning of the error log? --Rainman 00:55, 13 January 2008 (UTC)

Yes, I am using the openSUSE 10.3 distribution and javac 1.5.0_13. I hope the error messages I provide below are enough; sorry for my limited experience with Java build processes.

I'll provide the first part and the last messages here:

Apache Ant version 1.7.0 compiled on September 22 2007
Buildfile: build.xml
Detected Java version: 1.5 in: /usr/lib/jvm/java-1.5.0-sun-1.5.0_update13-sr2/jre
Detected OS: Linux
parsing buildfile /root/lucene/lucene-search/build.xml with URI = file:/root/lucene/lucene-search/build.xml
Project base dir set to: /root/lucene/lucene-search
[antlib:org.apache.tools.ant] Could not load definitions from resource org/apache/tools/ant/antlib.xml. It could not be found.
 [property] Loading /root/lucene-search.build.properties
 [property] Unable to find property file: /root/lucene-search.build.properties
 [property] Loading /root/build.properties
 [property] Unable to find property file: /root/build.properties
 [property] Loading /root/lucene/lucene-search/build.properties
 [property] Unable to find property file: /root/lucene/lucene-search/build.properties
Property "current.year" has not been set
Build sequence for target(s) `default' is [init, compile-core, compile, default]
Complete build sequence is [init, compile-core, compile, default, package-tgz-src, jar-core, javadocs, package, package-zip, package-tgz, package-all-binary, dist, package-zip-src, package-all-src, dist-src, dist-all, jar, jar-src, clean, ]

init:
    [mkdir] Skipping /root/lucene/lucene-search/bin because it already exists.
    [mkdir] Skipping /root/lucene/lucene-search/dist because it already exists.

compile-core:
    [mkdir] Skipping /root/lucene/lucene-search/bin because it already exists.
    [javac] wikimedia/lsearch/Article.java added as wikimedia/lsearch/Article.class doesn't exist.
    [javac] wikimedia/lsearch/ArticleList.java added as wikimedia/lsearch/ArticleList.class doesn't exist.
    [javac] wikimedia/lsearch/Configuration.java added as wikimedia/lsearch/Configuration.class doesn't exist.
    [javac] wikimedia/lsearch/DatabaseConnection.java added as wikimedia/lsearch/DatabaseConnection.class doesn't exist.
    [javac] wikimedia/lsearch/EnglishAnalyzer.java added as wikimedia/lsearch/EnglishAnalyzer.class doesn't exist.
    [javac] wikimedia/lsearch/EsperantoAnalyzer.java added as wikimedia/lsearch/EsperantoAnalyzer.class doesn't exist.
    [javac] wikimedia/lsearch/EsperantoStemFilter.java added as wikimedia/lsearch/EsperantoStemFilter.class doesn't exist.
    [javac] wikimedia/lsearch/MWDaemon.java added as wikimedia/lsearch/MWDaemon.class doesn't exist.
    [javac] wikimedia/lsearch/MWSearch.java added as wikimedia/lsearch/MWSearch.class doesn't exist.
    [javac] wikimedia/lsearch/NamespaceFilter.java added as wikimedia/lsearch/NamespaceFilter.class doesn't exist.
    [javac] wikimedia/lsearch/QueryStringMap.java added as wikimedia/lsearch/QueryStringMap.class doesn't exist.
    [javac] wikimedia/lsearch/SearchClientReader.java added as wikimedia/lsearch/SearchClientReader.class doesn't exist.
    [javac] wikimedia/lsearch/SearchDbException.java added as wikimedia/lsearch/SearchDbException.class doesn't exist.
    [javac] wikimedia/lsearch/SearchState.java added as wikimedia/lsearch/SearchState.class doesn't exist.
    [javac] wikimedia/lsearch/Title.java added as wikimedia/lsearch/Title.class doesn't exist.
    [javac] wikimedia/lsearch/TitlePrefixMatcher.java added as wikimedia/lsearch/TitlePrefixMatcher.class doesn't exist.
    [javac] Compiling 16 source files to /root/lucene/lucene-search/bin
    [javac] Using modern compiler
dropping /root/lucene/lucene-search/bin/bin from path as it doesn't exist
    [javac] Compilation arguments:
    [javac] '-deprecation'
    [javac] '-d'
    [javac] '/root/lucene/lucene-search/bin'
    [javac] '-classpath'
    [javac] '/root/lucene/lucene-search/bin:/usr/share/java/ant.jar:/usr/share/java/ant-launcher.jar:/usr/share/java/jaxp_parser_impl.jar:/usr/share/java/xml-commons-apis.jar:/usr/share/java/ant/ant-antlr.jar:/usr/share/java/bcel.jar:/usr/share/java/ant/ant-apache-bcel.jar:/usr/share/java/bsf.jar:/usr/share/java/ant/ant-apache-bsf.jar:/usr/share/java/log4j.jar:/usr/share/java/ant/ant-apache-log4j.jar:/usr/share/java/oro.jar:/usr/share/java/ant/ant-apache-oro.jar:/usr/share/java/regexp.jar:/usr/share/java/ant/ant-apache-regexp.jar:/usr/share/java/xml-commons-resolver.jar:/usr/share/java/ant/ant-apache-resolver.jar:/usr/share/java/jakarta-commons-logging.jar:/usr/share/java/ant/ant-commons-logging.jar:/usr/share/java/javamail.jar:/usr/share/java/jaf.jar:/usr/share/java/ant/ant-javamail.jar:/usr/share/java/jdepend.jar:/usr/share/java/ant/ant-jdepend.jar:/usr/share/java/ant/ant-jmf.jar:/usr/share/java/junit.jar:/usr/share/java/ant/ant-junit.jar:/usr/share/java/ant/ant-nodeps.jar:/usr/lib/jvm/java/lib/tools.jar:/usr/share/ant/lib/ant-apache-resolver-1.7.0.jar:/usr/share/ant/lib/ant-apache-bsf.jar:/usr/share/ant/lib/ant-nodeps.jar:/usr/share/ant/lib/ant-commons-logging.jar:/usr/share/ant/lib/ant-junit.jar:/usr/share/ant/lib/ant-javamail-1.7.0.jar:/usr/share/ant/lib/ant-junit-1.7.0.jar:/usr/share/ant/lib/ant-launcher.jar:/usr/share/ant/lib/ant-apache-log4j.jar:/usr/share/ant/lib/ant-apache-oro-1.7.0.jar:/usr/share/ant/lib/ant-javamail.jar:/usr/share/ant/lib/ant-apache-log4j-1.7.0.jar:/usr/share/ant/lib/ant-apache-bcel-1.7.0.jar:/usr/share/ant/lib/ant-nodeps-1.7.0.jar:/usr/share/ant/lib/ant-jmf.jar:/usr/share/ant/lib/ant-jmf-1.7.0.jar:/usr/share/ant/lib/ant-commons-logging-1.7.0.jar:/usr/share/ant/lib/ant-jdepend-1.7.0.jar:/usr/share/ant/lib/ant-1.7.0.jar:/usr/share/ant/lib/ant-apache-regexp.jar:/usr/share/ant/lib/ant-apache-oro.jar:/usr/share/ant/lib/ant-apache-resolver.jar:/usr/share/ant/lib/ant-jdepend.jar:/usr/share/ant/lib/ant-antlr.jar:/usr/share/ant/lib/ant-antlr-1.7.0.jar:/usr/share/ant/lib/ant-apache-regexp-1.7.0.jar:/usr/share/ant/lib/ant-apache-bcel.jar:/usr/share/ant/lib/ant-apache-bsf-1.7.0.jar:/usr/share/ant/lib/ant-launcher-1.7.0.jar:/usr/share/ant/lib/ant.jar'
    [javac] '-sourcepath'
    [javac] '/root/lucene/lucene-search/org'
    [javac] '-encoding'
    [javac] 'utf-8'
    [javac] '-g'
    [javac]
    [javac] The ' characters around the executable and arguments are
    [javac] not part of the command.
    [javac] Files to be compiled:
    [javac]     /root/lucene/lucene-search/org/wikimedia/lsearch/Article.java
    [javac]     /root/lucene/lucene-search/org/wikimedia/lsearch/ArticleList.java
    [javac]     /root/lucene/lucene-search/org/wikimedia/lsearch/Configuration.java
    [javac]     /root/lucene/lucene-search/org/wikimedia/lsearch/DatabaseConnection.java
    [javac]     /root/lucene/lucene-search/org/wikimedia/lsearch/EnglishAnalyzer.java
    [javac]     /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoAnalyzer.java
    [javac]     /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoStemFilter.java
    [javac]     /root/lucene/lucene-search/org/wikimedia/lsearch/MWDaemon.java
    [javac]     /root/lucene/lucene-search/org/wikimedia/lsearch/MWSearch.java
    [javac]     /root/lucene/lucene-search/org/wikimedia/lsearch/NamespaceFilter.java
    [javac]     /root/lucene/lucene-search/org/wikimedia/lsearch/QueryStringMap.java
    [javac]     /root/lucene/lucene-search/org/wikimedia/lsearch/SearchClientReader.java
    [javac]     /root/lucene/lucene-search/org/wikimedia/lsearch/SearchDbException.java
    [javac]     /root/lucene/lucene-search/org/wikimedia/lsearch/SearchState.java
    [javac]     /root/lucene/lucene-search/org/wikimedia/lsearch/Title.java
    [javac]     /root/lucene/lucene-search/org/wikimedia/lsearch/TitlePrefixMatcher.java
    [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EnglishAnalyzer.java:28: package org.apache.lucene.analysis does not exist
    [javac] import org.apache.lucene.analysis.Analyzer;
    [javac]                                   ^
    [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EnglishAnalyzer.java:29: package org.apache.lucene.analysis does not exist
    [javac] import org.apache.lucene.analysis.LowerCaseTokenizer;
    [javac]                                   ^
    [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EnglishAnalyzer.java:30: package org.apache.lucene.analysis does not exist
    [javac] import org.apache.lucene.analysis.PorterStemFilter;
    [javac]                                   ^
    [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EnglishAnalyzer.java:31: package org.apache.lucene.analysis does not exist
    [javac] import org.apache.lucene.analysis.TokenStream;
    [javac]                                   ^
    [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EnglishAnalyzer.java:37: cannot find symbol
    [javac] symbol: class Analyzer
    [javac] public class EnglishAnalyzer extends Analyzer {
    [javac]                                      ^
    [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EnglishAnalyzer.java:38: cannot find symbol
    [javac] symbol  : class TokenStream
    [javac] location: class org.wikimedia.lsearch.EnglishAnalyzer
    [javac]     public final TokenStream tokenStream(String fieldName, Reader reader) {
    [javac]                      ^
    [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoAnalyzer.java:31: package org.apache.lucene.analysis does not exist
    [javac] import org.apache.lucene.analysis.Analyzer;
    [javac]                                   ^
    [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoAnalyzer.java:32: package org.apache.lucene.analysis does not exist
    [javac] import org.apache.lucene.analysis.LowerCaseTokenizer;
    [javac]                                   ^
    [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoAnalyzer.java:33: package org.apache.lucene.analysis does not exist
    [javac] import org.apache.lucene.analysis.Token;
    [javac]                                   ^
    [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoAnalyzer.java:34: package org.apache.lucene.analysis does not exist
    [javac] import org.apache.lucene.analysis.TokenStream;
    [javac]                                   ^
    [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoAnalyzer.java:36: cannot find symbol
    [javac] symbol: class Analyzer
    [javac] public class EsperantoAnalyzer extends Analyzer{
    [javac]                                        ^
    [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoAnalyzer.java:37: cannot find symbol
    [javac] symbol  : class TokenStream
    [javac] location: class org.wikimedia.lsearch.EsperantoAnalyzer
    [javac]     public final TokenStream tokenStream(String fieldName, Reader reader) {
    [javac]                      ^
    [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoStemFilter.java:31: package org.apache.lucene.analysis does not exist
    [javac] import org.apache.lucene.analysis.Token;
    [javac]                                   ^
    [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoStemFilter.java:32: package org.apache.lucene.analysis does not exist
    [javac] import org.apache.lucene.analysis.TokenStream;
    [javac]                                   ^
    [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoStemFilter.java:33: package org.apache.lucene.analysis does not exist
    [javac] import org.apache.lucene.analysis.TokenFilter;
    [javac]                                   ^
    [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoStemFilter.java:36: cannot find symbol
    [javac] symbol: class TokenFilter
    [javac] public class EsperantoStemFilter extends TokenFilter {
    [javac]                                          ^
    [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoStemFilter.java:37: cannot find symbol
    [javac] symbol  : class TokenStream
    [javac] location: class org.wikimedia.lsearch.EsperantoStemFilter
    [javac]     public EsperantoStemFilter(TokenStream tokenizer) {

--- snip ---
(some lines cut here)
--- snip ---

    [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/SearchState.java:332: cannot find symbol
    [javac] symbol  : class Term
    [javac] location: class org.wikimedia.lsearch.SearchState
    [javac]                             new Term("key", key)));
    [javac]                                     ^
    [javac] Note: Some input files use unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.
    [javac] 85 errors

BUILD FAILED
/root/lucene/lucene-search/build.xml:55: Compile failed; see the compiler error output for details.
        at org.apache.tools.ant.taskdefs.Javac.compile(Javac.java:999)
        at org.apache.tools.ant.taskdefs.Javac.execute(Javac.java:820)
        at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:288)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:105)
        at org.apache.tools.ant.Task.perform(Task.java:348)
        at org.apache.tools.ant.Target.execute(Target.java:357)
        at org.apache.tools.ant.Target.performTasks(Target.java:385)
        at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1329)
        at org.apache.tools.ant.Project.executeTarget(Project.java:1298)
        at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
        at org.apache.tools.ant.Project.executeTargets(Project.java:1181)
        at org.apache.tools.ant.Main.runBuild(Main.java:698)
        at org.apache.tools.ant.Main.startAnt(Main.java:199)
        at org.apache.tools.ant.launch.Launcher.run(Launcher.java:257)
        at org.apache.tools.ant.launch.Launcher.main(Launcher.java:104)

--Phaidros 24 January 2008

It looks like your Ant setup is broken and cannot find the relevant libraries. I've compiled the package and put it here. --Rainman 11:01, 26 January 2008 (UTC)
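
The "package org.apache.lucene.analysis does not exist" errors are consistent with the classpath printed in the log above, which lists only Ant and tool jars and no Lucene jar. As a rough way to confirm that kind of problem, the following Python sketch scans a colon-separated classpath string for jars that contain a given class. The function name `jars_providing` is made up for illustration and is not part of Ant or lucene-search.

```python
import zipfile
from pathlib import Path


def jars_providing(class_name, classpath, sep=":"):
    """Return the classpath entries (jar files) that contain class_name.

    Hypothetical helper for illustration only.
    """
    # org.apache.lucene.analysis.Analyzer -> org/apache/lucene/analysis/Analyzer.class
    member = class_name.replace(".", "/") + ".class"
    hits = []
    for entry in classpath.split(sep):
        path = Path(entry)
        # Skip directories, missing files, and anything that is not a zip/jar.
        if not path.is_file() or not zipfile.is_zipfile(path):
            continue
        with zipfile.ZipFile(path) as jar:
            if member in jar.namelist():
                hits.append(entry)
    return hits
```

Calling `jars_providing("org.apache.lucene.analysis.Analyzer", classpath)` with the classpath string from the javac arguments would return an empty list if no jar on it ships that class, which is what the compiler errors suggest.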

Cannot bind RMIMessenger exception: non-JRMP server at remote endpoint

Hello everyone,

I'm quite new to Lucene and I have a problem. I can't get Lucene Java working on one of my servers. I've set it up on another server for MediaWiki and it works fine.

It's a GNU/Linux Ubuntu Edgy i686 machine with kernel 2.6.17-11-server, running Apache 2.0 with PHP5 for MediaWiki, plus some other things like Tomcat and JBoss. Java packages installed: j2re1.4, j2sdk1.4, java-common, libgcj-common, sun-java5-bin, sun-java5-demo, sun-java5-jdk and sun-java5-jre.

On the first server (a fresh 64-bit Ubuntu Gutsy with almost nothing running) it worked fine, and I can use Lucene to search my wiki. On the second server, here is the error I get when I try to start the engine:

www-data@myserver:/usr/local/search/ls2$ ./lsearchd
.
Trying config file at path /var/www/.lsearch.conf
Trying config file at path /usr/local/search/ls2/lsearch.conf
0 [main] INFO  org.wikimedia.lsearch.util.UnicodeDecomposer  - Loaded unicode decomposer
java.rmi.ConnectIOException: non-JRMP server at remote endpoint
 at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:217)
 at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:171)
 at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:306)
 at sun.rmi.registry.RegistryImpl_Stub.rebind(Unknown Source)
 at org.wikimedia.lsearch.interoperability.RMIServer.register(RMIServer.java:24)
 at org.wikimedia.lsearch.interoperability.RMIServer.bindRMIObjects(RMIServer.java:60)
 at org.wikimedia.lsearch.config.StartupManager.main(StartupManager.java:52)
76   [main] WARN  org.wikimedia.lsearch.interoperability.RMIServer  - Cannot bind RMIMessenger exception:non-JRMP server at remote endpoint

But NOTHING is using port 8321. I tried another port and got the same problem. Any ideas how to solve this, please?

Thanks, LMJ 15 January 2008

First verify that JBoss, Tomcat and lsearchd all run under sun-java5-bin (and not j2re1.4). If they do, the RMI registry may be colliding with JBoss, so try stopping JBoss if you can. If that turns out to be the cause, you can either configure JBoss not to use port 1099, or edit RMIRegistry.java to use a different port (replace 1099 there with your port, and pass the port as a parameter to the getRegistry() calls in RMIRegistry.java and RMIMessengerClient.java). --Rainman 15:05, 15 January 2008 (UTC)
Indeed Rainman, thanks for your help! Look at this:
# lsof -i :1099
COMMAND   PID    USER   FD   TYPE    DEVICE SIZE NODE NAME
java    20832 syncron    7u  IPv4 149877937       TCP *:rmiregistry (LISTEN)

The port is used by the JBoss rmiregistry :-/ I need some extra help to change that port. Can we exchange emails about it, Rainman? I tried to contact you via your personal page, but I only read English and French ;) --16 January 2008

I've edited /usr/local/jboss-3.2.7/server/default/conf/jboss-service.xml and changed the port to 10999. It seems to work better ;) I've got another problem, but it seems to be an lsearch.conf-related issue. --22 January 2008
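
For anyone hitting the same "non-JRMP server at remote endpoint" message, a quick first step is to check whether something else is already listening on the RMI registry port (1099 in this thread). The Python sketch below only tests whether a TCP connection to the port succeeds; it does not identify the process (that is what lsof did above). The helper name `port_in_use` is an assumption made for this example.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper for illustration only.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

If `port_in_use(1099)` is true before lsearchd has been started, another service (JBoss in this case) probably owns the port, and one of the two has to be moved to a different one.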

Daemon status

On the German Wikipedia, I am often irritated because changes in content are not reflected immediately by the full-text search and, at the moment, I cannot see whether the changes have already been processed by the daemon or when they will be. Therefore, I would like to know:

  • whether the daemon processes the changes chronologically so one could be certain that if one's changes were made at time T and the daemon has processed all changes up to T + 1, they will be reflected in the full text search, and
  • whether there is any way to obtain the daemon status (all changes up to T, n articles in queue, etc.) from a current or future Wikipedia installation.

Thanks, Tim Landscheidt 19:52, 7 February 2008 (UTC)

The index is updated around 5 am GMT every day on Wikimedia projects (when nothing goes wrong, which is most of the time). About 1): yes, it processes the changes chronologically. About 2): this interface is available, but only to system admins; everybody else just has to wait until tomorrow for changes to be applied. --Rainman 10:07, 8 February 2008 (UTC)
Hmmm. If I search for "Lassithi" (note the double "s") now, I see that changes in de:Panagia i Kera (8 days ago), de:Kritsa (7 days ago), de:Ierapetra (10 days ago), de:Kera Kardiotissa (11 days ago), de:Griechische Toponyme (11 days ago), de:Venezianische Kolonien (9 days ago) and de:Sitia (11 days ago) have not been processed. Is that what you mean by "when nothing goes wrong"? :-) Would it be technically feasible to include in the result page the last time a change was successfully worked into the index, i.e. "All changes until T considered."? Tim Landscheidt 17:24, 8 February 2008 (UTC)
Yes, this seems to be a case of "if nothing is broken" :) One of the dewiki search servers (srv21) is broken: it stopped updating its index and seems to have a broken logrotate and possibly some other problems. We'll fix it when a sysadmin becomes available. Whenever you see changes not going in for more than a couple of days, you should report it. --Rainman 18:07, 8 February 2008 (UTC)
OK, we tracked this down to a hard drive failure on srv21. Now one just needs to wait for the cache to expire (~12h) and you should get fresh results. Thanks for the report! --Rainman 18:57, 8 February 2008 (UTC)
Thanks for the information :-). What would be the proper place to report such things in the future? Tim Landscheidt 21:32, 8 February 2008 (UTC)
Technical issues are usually reported via the IRC channel #wikimedia-tech, where all of the sysadmins are. If there's no one online to fix the problem, you could submit a bug. You could also send me an e-mail via this wiki or leave a message on my talk page, since I'm more or less in charge of maintaining the search subsystem. --Rainman 21:44, 8 February 2008 (UTC)
Okay, I'll keep that in mind. Thanks again, Tim Landscheidt 22:53, 8 February 2008 (UTC)

Query String Syntax

Please document the subset of Lucene query string syntax that has been implemented.
-- 216.143.51.66 22:52, 8 February 2008 (UTC)

Error running the Daemon

#  . lsearchd
RMI registry started.
Trying config file at path /root/.lsearch.conf
Trying config file at path /var/www/vhosts/kidneycancerknol.com/httpdocs/Lucene/ls2-bin/lsearch.conf
0    [main] INFO  org.wikimedia.lsearch.util.Localization  - Reading localization for En
530  [main] INFO  org.wikimedia.lsearch.util.UnicodeDecomposer  - Loaded unicode decomposer
602  [main] INFO  org.wikimedia.lsearch.interoperability.RMIServer  - RMIMessenger bound
619  [main] ERROR org.wikimedia.lsearch.search.SearcherCache  - I/O Error opening index at path /var/www/vhosts/kidneycancerknol.com/httpdocs/Lucene/ls2-bin/indexes/search/kck_wiki : /var/www/vhosts/kidneycancerknol.com/httpdocs/Lucene/ls2-bin/indexes/search/kck_wiki/segments (No such file or directory)
621  [main] ERROR org.wikimedia.lsearch.search.SearcherCache  - I/O Error opening index at path /var/www/vhosts/kidneycancerknol.com/httpdocs/Lucene/ls2-bin/indexes/search/kck_wiki : /var/www/vhosts/kidneycancerknol.com/httpdocs/Lucene/ls2-bin/indexes/search/kck_wiki/segments (No such file or directory)
621  [main] WARN  org.wikimedia.lsearch.search.SearcherCache  - I/O error warming index for kck_wiki
621  [Thread-3] INFO  org.wikimedia.lsearch.frontend.SearchServer  - Binding server to port 8123
623  [Thread-2] INFO  org.wikimedia.lsearch.frontend.HTTPIndexServer  - Started server at port 8321

I'm getting this error saying "No such file or directory". The directory exists; however, I don't know where the "segments" file comes from.

I ran this to create the indexes

   php maintenance/dumpBackup.php --current --quiet > wikidb.xml &&
   java -cp LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s wikidb.xml kck_wiki

The wikidb.xml file exists in the httpdocs directory

...and then I started the daemon

Am I missing a trick?

Thanks

Andy Andy.thomas 19 February 2008

And what is the output from the importer? It should give you success messages saying that it created the indexes and successfully made a snapshot. --Rainman 01:30, 20 February 2008 (UTC)

I'm most likely doing something dumb (being a bit of a newbie), but this is what I get when I just run the import command:

java -cp LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s wikidb.xml kck_wiki
Exception in thread "main" java.lang.NoClassDefFoundError: org/wikimedia/lsearch/importer/Importer

--Andy 17:00, 20 February 2008 (GMT)

The java command you're running assumes that LuceneSearch.jar is in your current directory. The full command would be:
java -cp /full/path/to/LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s wikidb.xml kck_wiki
--Rainman 18:04, 20 February 2008 (UTC)

I'm getting further, thanks, that helped. Sorry, I know I'm being dumb, and I apologise for asking you to hold my hand like this, but I now get this:

Trying config file at path /root/.lsearch.conf
Trying config file at path /var/www/vhosts/kidneycancerknol.com/httpdocs/lsearch.conf
0    [main] INFO  org.wikimedia.lsearch.util.UnicodeDecomposer  - Loaded unicode decomposer
3    [main] INFO  org.wikimedia.lsearch.util.Localization  - Reading localization for En
60   [main] INFO  org.wikimedia.lsearch.ranks.RankBuilder  - First pass, getting a list of valid articles...
175  [main] FATAL org.wikimedia.lsearch.ranks.RankBuilder  - I/O error reading dump while getting titles from wikidb.xml
175  [main] INFO  org.wikimedia.lsearch.ranks.RankBuilder  - Second pass, calculating article links...
179  [main] FATAL org.wikimedia.lsearch.ranks.RankBuilder  - I/O error reading dump while calculating ranks for from wikidb.xml
Exception in thread "main" java.lang.NullPointerException
	at org.wikimedia.lsearch.importer.Importer.main(Importer.java:114)

Do I need to set the OAI settings in the global config? I've just kept them as the default. --Andy 18:30, 20 February 2008 (GMT)

No, you don't need OAI. It seems to me that something is wrong with the XML file. It sure would be helpful if the exception weren't suppressed :\ Unfortunately I cannot help you much more than that. Is wikidb.xml a valid XML file? Did you give the full path to it? --Rainman 01:00, 21 February 2008 (UTC)
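
Since the importer hides the underlying exception, one way to answer the "is wikidb.xml a valid XML file?" question independently is to stream-parse the dump and see whether the parser reaches the end. The Python sketch below does that and counts `<page>` elements along the way. It is a generic well-formedness check, not a reimplementation of what the importer does, and `check_dump` is a name invented for this example. An empty or missing file (for example if dumpBackup.php failed while running with --quiet) is reported as a failure too.

```python
import xml.etree.ElementTree as ET


def check_dump(path):
    """Stream-parse an XML dump; return (ok, detail).

    Hypothetical helper for illustration only.
    """
    pages = 0
    try:
        for _event, elem in ET.iterparse(path, events=("end",)):
            # Tags may carry a namespace like '{http://...}page'; compare the local name.
            if elem.tag.rsplit("}", 1)[-1] == "page":
                pages += 1
            # Free memory for elements we are done with.
            elem.clear()
    except (OSError, ET.ParseError) as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, f"{pages} <page> elements"
```

Running `check_dump("/full/path/to/wikidb.xml")` with the same path that is passed to the importer also catches the case where the relative path simply does not point at the file.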

Exception in thread "main" java.lang.UnsupportedClassVersionError

Hi, I use the following configuration:

  • MediaWiki: 1.11.0
  • PHP: 5.2.5 (apache2handler)
  • MySQL: 5.0.51

If I call this:

java -cp /usr/local/search/ls2/ls2-bin/LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s basiswikidb.xml basiswiki

I get the error:

Exception in thread "main" java.lang.UnsupportedClassVersionError: org/wikimedia/lsearch/importer/Importer (Unsupported major.minor version 49.0)
       at java.lang.ClassLoader.defineClass0(Native Method)
       at java.lang.ClassLoader.defineClass(ClassLoader.java:539)
       at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:123)
       at java.net.URLClassLoader.defineClass(URLClassLoader.java:251)
       at java.net.URLClassLoader.access$100(URLClassLoader.java:55)
       at java.net.URLClassLoader$1.run(URLClassLoader.java:194)
       at java.security.AccessController.doPrivileged(Native Method)
       at java.net.URLClassLoader.findClass(URLClassLoader.java:187)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:289)
       at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:274)
       at java.lang.ClassLoader.loadClass(ClassLoader.java:235)
       at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:302)

My Configuration

  • all Files are in /usr/local/search/ls2/
  • MWConfig.global=file:///usr/local/search/ls2/lsearch-global.conf
  • MWConfig.lib=/usr/local/search/ls2/lib
  • Indexes.path=/usr/local/search/indexes
  • Localization.url=file:///opt/lampp/htdocs/basiswiki/languages/messages
  • Logging.logconfig=/usr/local/search/ls2/lsearch.log4j
  • mwdumper.jar => /usr/local/search/ls2/lib
  • lsearch.conf: Storage.lib=/usr/local/search/ls2/sql

lsearch-global.conf

[Database]
#wikilucene : (single) (language,en) (warmup,0)
wikidev : (single) (language,sr)
wikilucene : (nssplit,3) (nspart1,[0]) (nspart2,[4,5,12,13]), (nspart3,[])
wikilucene : (language,en) (warmup,10)
basiswiki : (single) (language,en) (warmup,10)
# Search groups
# Index parts of a split index are always taken from the node's group
# host : db1.part db2.part
# Multiple hosts can search multiple dbs (N-N mapping)
[Search-Group]
<my host> : wikilucene wikidev
<my host> : basiswiki

Please can you help me?!

85.158.226.1 11:03, 31 March 2008 (UTC)

Run java -version. You probably have an old Java; you need to update to 1.5 or later. --Rainman 11:57, 31 March 2008 (UTC)
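
For background: "major.minor version 49.0" in the error is the class-file version of the compiled Importer class, and major version 49 corresponds to Java 5 (1.5), so an older runtime refuses to load it. The Python sketch below reads that version straight from the class-file header (magic number, then minor and major version as big-endian 16-bit values). The function names and the small lookup table are illustrative assumptions, not part of lucene-search.

```python
import struct
import zipfile

# Class-file major versions for the Java releases relevant to this thread.
MAJOR_TO_JAVA = {48: "1.4", 49: "5 (1.5)", 50: "6"}


def class_version(data):
    """Return (major, minor) from the first 8 bytes of a .class file."""
    magic, minor, major = struct.unpack(">IHH", data[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major, minor


def jar_class_version(jar_path, class_name):
    """Read the class-file version of one class stored inside a jar."""
    member = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return class_version(jar.read(member))
```

For example, `jar_class_version("/usr/local/search/ls2/ls2-bin/LuceneSearch.jar", "org.wikimedia.lsearch.importer.Importer")` should report major version 49 for the jar from this thread, given the error message above. Comparing that with the output of java -version shows whether the runtime is too old.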

MediaWiki+Lucene-Search+MWSearch = ZERO search results ??!@#?!

Can someone please assist me? =)

  • Slackware 12.0, on i686 Pentium III [Linux 2.6.21.5]
  • MediaWiki: 1.9.1
  • PHP: 5.2.5 (apache2handler)
  • MySQL: 5.0.37
  • MediaWiki Extension(s): MWSearch SVN (05122008), and Lucene-search SVN (05122008), + I downloaded & installed mwdumper.jar into the Lucene2 lib dir.
  • other tools: jre-6u2-i586-1, jdk-1_5_0_09-i586-1, apache-ant-1.7.0-i586-1bj, rsync-2.6.9-i486-1

I've followed the steps per Extension:Lucene-search and Extension:MWSearch pages, to the T - I've gone over and over them several times, I've been to MediaWiki Forums, and the MediaWiki-L mailing list ... please help me! =)

My Local LuceneSearch configuration

  • LuceneSearch SVN Install dir: /usr/local/search/lucene-search-2svn05112008
  • Indexes stored: /usr/local/search/indexes

/etc/lsearch.conf

MWConfig.global=file:///etc/lsearch-global.conf
MWConfig.lib=/usr/local/search/lucene-search-2svn05112008/lib
Indexes.path=/usr/local/search/indexes
Search.updateinterval=1
Search.updatedelay=0
Search.checkinterval=30
Index.snapshotinterval=5
Index.maxqueuecount=5000
Index.maxqueuetimeout=12
Storage.master=localhost
Storage.username=wikiuser
Storage.password=mypass
Storage.useSeparateDBs=false
Storage.defaultDB=wikidb
Storage.lib=/usr/local/search/lucene-search-2svn05112008/sql
Localization.url=file:///var/www/htdocs/wiki/languages/messages
Logging.logconfig=/etc/lsearch.log4j
Logging.debug=true

/etc/lsearch-global.conf

[Database]
wikidb : (single) (language,en) (warmup,10)
[Search-Group]
nen-tftp : wikidb
[Index]
nen-tftp : wikidb
[Index-Path]
<default> : /usr/local/search/indexes
[OAI]
wiktionary : http://$lang.wiktionary.org/w/index.php
wikilucene : http://localhost/wiki-lucene/phase3/index.php
<default> : http://$lang.wikipedia.org/w/index.php
[Properties]
Database.suffix=wiki wiktionary wikidb
KeywordScoring.suffix=wikidb wiki wikilucene wikidev
ExactCase.suffix=wikidb wiktionary wikilucene
[Namespace-Prefix]
all : <all>
[0] : 0
[1] : 1
[2] : 2
[3] : 3
[4] : 4
[5] : 5
[6] : 6
[7] : 7
[8] : 8
[9] : 9
[10] : 10
[11] : 11
[12] : 12
[13] : 13
[14] : 14
[15] : 15

/etc/lsearch.log4j

log4j.rootLogger=INFO, A1
log4j.appender.A1=org.apache.log4j.ConsoleAppender
log4j.appender.A1.layout=org.apache.log4j.PatternLayout
log4j.appender.A1.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n

relevant /var/www/htdocs/wiki/LocalSettings.php settings

$wgSearchType = 'LuceneSearch';
$wgLuceneHost = 'localhost';
$wgLucenePort = 8123;
require_once("extensions/MWSearch/MWSearch.php");

building the index works running dumpBackup(Init).php

> php maintenance/dumpBackupInit.php --current --quiet > wikidb.xml && java -cp /usr/local/search/lucene-search-2svn05112008/LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s /var/www/htdocs/wiki/wikidb.xml wikidb
MediaWiki Lucene search indexer - index builder from xml database dumps.

Trying config file at path /root/.lsearch.conf
Trying config file at path /var/www/htdocs/wiki/lsearch.conf
Trying config file at path /etc/lsearch.conf
log4j: Trying to find [log4j.xml] using context classloader sun.misc.Launcher$AppClassLoader@133056f.
log4j: Trying to find [log4j.xml] using sun.misc.Launcher$AppClassLoader@133056f class loader.
log4j: Trying to find [log4j.xml] using ClassLoader.getSystemResource().
log4j: Trying to find [log4j.properties] using context classloader sun.misc.Launcher$AppClassLoader@133056f.
log4j: Trying to find [log4j.properties] using sun.misc.Launcher$AppClassLoader@133056f class loader.
log4j: Trying to find [log4j.properties] using ClassLoader.getSystemResource().
log4j: Could not find resource: [null].
log4j: Parsing for [root] with value=[INFO, A1].
log4j: Level token is [INFO].
log4j: Category root set to INFO
log4j: Parsing appender named "A1".
log4j: Parsing layout options for "A1".
log4j: Setting property [conversionPattern] to [%-4r [%t] %-5p %c %x - %m%n].
log4j: End of parsing for "A1".
log4j: Parsed "A1" options.
log4j: Finished configuring.
0    [main] INFO  org.wikimedia.lsearch.util.UnicodeDecomposer  - Loaded unicode decomposer
18   [main] INFO  org.wikimedia.lsearch.util.Localization  - Reading localization for En
434  [main] INFO  org.wikimedia.lsearch.ranks.RankBuilder  - First pass, getting a list of valid articles...
94 pages (99.576/sec), 94 revs (99.576/sec)
1527 [main] INFO  org.wikimedia.lsearch.ranks.RankBuilder  - Second pass, calculating article links...
94 pages (326.389/sec), 94 revs (326.389/sec)
1928 [main] INFO  org.wikimedia.lsearch.importer.Importer  - Third pass, indexing articles...
94 pages (24.588/sec), 94 revs (24.588/sec)
6005 [main] INFO  org.wikimedia.lsearch.importer.Importer  - Closing/optimizing index...
Finished indexing in 5s, with final index optimization in 0s
Total time: 6s
6530 [main] INFO  org.wikimedia.lsearch.index.IndexThread  - Making snapshot for wikidb
6582 [main] INFO  org.wikimedia.lsearch.index.IndexThread  - Made snapshot /usr/local/search/indexes/snapshot/wikidb/20080512024654

That creates a 277 KB file at /var/www/htdocs/wiki/wikidb.xml, which looks just fine to me.

Starting the lsearch daemon works

When I run my script /usr/local/search/lucene-search-2svn05112008/lsearchd, which starts the lsearch daemon, I get the following, which ALSO looks fine:

java -Djava.rmi.server.codebase=file:///usr/local/search/lucene-search-2svn05112008/LuceneSeah.jar -Djava.rmi.server.hostname=nen-tftp -jar /usr/local/search/lucene-search-2svn05112008/LuceneSearch.jar $*
RMI registry started.
Trying config file at path /root/.lsearch.conf
Trying config file at path /usr/local/search/lucene-search-2svn05112008/lsearch.conf
log4j: Parsing for [root] with value=[INFO, A1].
log4j: Level token is [INFO].
log4j: Category root set to INFO
log4j: Parsing appender named "A1".
log4j: Parsing layout options for "A1".
log4j: Setting property [conversionPattern] to [%-4r [%t] %-5p %c %x - %m%n].
log4j: End of parsing for "A1".
log4j: Parsed "A1" options.
log4j: Finished configuring.
0    [main] INFO  org.wikimedia.lsearch.util.Localization  - Reading localization for En
2351 [main] INFO  org.wikimedia.lsearch.util.UnicodeDecomposer  - Loaded unicode decomposer
2600 [main] INFO  org.wikimedia.lsearch.interoperability.RMIServer  - RMIMessenger bound
2882 [main] INFO  org.wikimedia.lsearch.interoperability.RMIServer  - RemoteSearchable<wikidb> bound
2914 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index wikidb ...
2928 [Thread-2] INFO  org.wikimedia.lsearch.frontend.HTTPIndexServer  - Started server at port 8321
2929 [Thread-3] INFO  org.wikimedia.lsearch.frontend.SearchServer  - Binding server to port 8123
4246 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warmed up wikidb in 1331 ms
4246 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index wikidb ...
5079 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warmed up wikidb in 833 ms
5079 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index wikidb ...
5861 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warmed up wikidb in 782 ms

From here, I pull up my normal wiki, which has been working fine ALL along, but now I get ZERO search results, no matter what I do! I know I am searching correctly: I just type in a single word that I know is on several pages in the wiki. I've even tried editing the file before and after building the index, and starting/stopping the lsearch daemon, yet I get this on my MediaWiki search results page:

Search results
From AgentDcooper's Wiki

You searched for wiki

For more information about searching AgentDcooper's Wiki, see Searching AgentDcooper's Wiki.

Showing below 0 results starting with #1.
No page text matches

Note: Unsuccessful searches are often caused by searching for common words like "have" and "from", which are not indexed, or by specifying more than one search term (only pages containing all of the search terms will appear in the result). 

I notice that the lsearch daemon console output scrolls the following; right after doing a search within the wiki

293744 [pool-2-thread-1] INFO  org.wikimedia.lsearch.frontend.HttpHandler  - query:/search/wikidb/wiki?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15&offset=0&limit=100&version=2&iwlimit=10 what:search dbname:wikidb term:wiki
293759 [pool-2-thread-1] INFO  org.wikimedia.lsearch.search.SearchEngine  - Using NamespaceFilterWrapper wrap: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
293786 [pool-2-thread-1] INFO  org.wikimedia.lsearch.search.SearchEngine  - search wikidb: query=[wiki] parsed=[contents:wiki (title:wiki^6.0 stemtitle:wiki^2.0) (alttitle1:wiki^4.0 alttitle2:wiki^4.0 alttitle3:wiki^4.0) (keyword1:wiki^0.02 keyword2:wiki^0.01 keyword3:wiki^0.0066666664 keyword4:wiki^0.0050 keyword5:wiki^0.0039999997)] hit=[27] in 16ms using IndexSearcherMul:1210585609666

With MediaWiki debugging enabled, my /var/log/mediawiki/debug_log.txt shows this:

Start request
GET /wiki/index.php/Special:Search?search=wiki&fulltext=Search
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://nen-tftp.techiekb.com/wiki/index.php/Special:Version
Cookie: wikidb_session=3jptdli2pf3nkuq924tq1ihlt0
Authorization: Basic ZGNvb3Blcjp0ZXN0cGFzcw==

Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
Unstubbing $wgParser on call of $wgParser->setHook from require_once
Fully initialised
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Language::loadLocalisation(): got localisation for en from source
Unstubbing $wgUser on call of $wgUser->isAllowed from Title::userCanRead
Cache miss for user 2
Unstubbing $wgLoadBalancer on call of $wgLoadBalancer->getConnection from wfGetDB
Logged in from session
Unstubbing $wgMessageCache on call of $wgMessageCache->getTransform from wfMsgGetKey
Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
MessageCache::load(): got from global cache
Unstubbing $wgOut on call of $wgOut->setPageTitle from SpecialSearch::setupPage
Fetching search data from http://localhost:8123/search/wikidb/wiki?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15&offset=0&limit=100&version=2&iwlimit=10
total [0] hits
OutputPage::sendCacheControl: private caching;  **
Request ended normally

Now get this: if I go to the link from the debug log above (http://localhost:8123/search/wikidb/wiki?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15&offset=0&limit=100&version=2&iwlimit=10), I get this page:

3
1.0 0 Main_Page
0.9577699303627014 0 EFFICIENT%2FCISCO%2FNETSCREEN%2FNETOPIA_Router_Command_Matrix
0.7121278643608093 0 DBU_-_DialBackUp

Which leads me to my question: what am I doing wrong?? I have tried everything I can think of, and I just cannot get search within my MediaWiki to work properly. The search itself seems to work when going to the link directly above, yet somehow the "total hits" in the log, as well as in the wiki, show ZERO. Manually going to the link from the debug log shows what appears to be a result indicating that 3 PAGES were found, with corresponding result data. Why is MediaWiki not showing this? Any help would be kindly appreciated, or even a link for reference! -peace- --Agentdcooper 12 May 2008
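
One way to narrow this kind of mismatch down is to fetch the same URL from a script on the wiki host and parse the reply, which separates "the daemon returns no hits" from "MediaWiki fails to read the reply". The following is only a minimal sketch, assuming the reply format pasted above (a first line that looks like a hit count, then one `score namespace title` line per result, with URL-encoded titles). `parse_lsearch_reply` and `fetch_hits` are hypothetical helper names, not part of lucene-search or MWSearch.

```python
from urllib.parse import unquote
from urllib.request import urlopen


def parse_lsearch_reply(text):
    """Parse a reply whose first line is assumed to be a hit count and whose
    remaining lines look like '<score> <namespace> <url-encoded title>'."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return 0, []
    count = int(lines[0])
    results = []
    for ln in lines[1:]:
        score, namespace, title = ln.split(None, 2)
        results.append((float(score), int(namespace), unquote(title)))
    return count, results


def fetch_hits(url, timeout=10):
    """Fetch a search URL (for example the one printed in the debug log)."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_lsearch_reply(resp.read().decode("utf-8"))


# Reply text copied from the post above.
sample = """3
1.0 0 Main_Page
0.9577699303627014 0 EFFICIENT%2FCISCO%2FNETSCREEN%2FNETOPIA_Router_Command_Matrix
0.7121278643608093 0 DBU_-_DialBackUp
"""
count, results = parse_lsearch_reply(sample)
print(count, [title for _, _, title in results])
```

If `fetch_hits` returns results for the URL from the debug log while the wiki still reports zero hits, the daemon and the index are probably fine and the problem is more likely on the PHP side (how the reply is fetched or parsed).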

I would suspect the problem is the MW version. The search front-end has been heavily refactored in MediaWiki 1.13, and MWSearch is designed to run with the latest MediaWiki, so there might be some compatibility issues. Note that MW 1.13 has not been released yet and is still in development. Try using Extension:LuceneSearch instead. --Rainman 13:20, 12 May 2008 (UTC)
Thanks a TON, I will try this out in just a few, I half suspected it was a MediaWiki versioning issue, I really need to upgrade! =) --Agentdcooper 20:16, 12 May 2008 (UTC)
I moved to LuceneSearch and am getting a strange error. I removed the MWSearch extension entirely, then downloaded Extension:LuceneSearch from SVN today and moved the LuceneSearch directory to /var/www/htdocs/wiki, chmod'd to 755 recursively to make sure it isn't a permissions issue. Then I commented out the MWSearch code in LocalSettings.php:
#$wgSearchType = 'LuceneSearch'; 
#$wgLuceneHost = 'localhost';
#$wgLucenePort = 8123;
#require_once("extensions/MWSearch/MWSearch.php");
I've tried different settings for Extension:LuceneSearch, but ended up with this config for LuceneSearch ;
$wgDisableInternalSearch = true;
$wgDisableSearchUpdate = true;
$wgSearchType = 'LuceneSearch';
$wgLuceneHost = 'localhost';
$wgLucenePort = 8123;
require_once("extensions/LuceneSearch/LuceneSearch.php");
$wgLuceneSearchVersion = 2;
$wgLuceneDisableSuggestions = true;
$wgLuceneDisableTitleMatches = true;

I then ran the indexer, which seemed to go great ;

> php maintenance/dumpBackupInit.php --current --quiet > wikidb.xml && java -cp /usr/local/search/lucene-search-2svn05112008/LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s /var/www/htdocs/wiki/wikidb.xml wikidb
MediaWiki Lucene search indexer - index builder from xml database dumps.

Trying config file at path /root/.lsearch.conf
Trying config file at path /var/www/htdocs/wiki/lsearch.conf
Trying config file at path /etc/lsearch.conf
log4j: Trying to find [log4j.xml] using context classloader sun.misc.Launcher$AppClassLoader@133056f.
log4j: Trying to find [log4j.xml] using sun.misc.Launcher$AppClassLoader@133056f class loader.
log4j: Trying to find [log4j.xml] using ClassLoader.getSystemResource().
log4j: Trying to find [log4j.properties] using context classloader sun.misc.Launcher$AppClassLoader@133056f.
log4j: Trying to find [log4j.properties] using sun.misc.Launcher$AppClassLoader@133056f class loader.
log4j: Trying to find [log4j.properties] using ClassLoader.getSystemResource().
log4j: Could not find resource: [null].
log4j: Parsing for [root] with value=[INFO, A1].
log4j: Level token is [INFO].
log4j: Category root set to INFO
log4j: Parsing appender named "A1".
log4j: Parsing layout options for "A1".
log4j: Setting property [conversionPattern] to [%-4r [%t] %-5p %c %x - %m%n].
log4j: End of parsing for "A1".
log4j: Parsed "A1" options.
log4j: Finished configuring.
0    [main] INFO  org.wikimedia.lsearch.util.UnicodeDecomposer  - Loaded unicode decomposer
17   [main] INFO  org.wikimedia.lsearch.util.Localization  - Reading localization for En
432  [main] INFO  org.wikimedia.lsearch.ranks.RankBuilder  - First pass, getting a list of valid articles...
94 pages (98.739/sec), 94 revs (98.739/sec)
1532 [main] INFO  org.wikimedia.lsearch.ranks.RankBuilder  - Second pass, calculating article links...
94 pages (325.26/sec), 94 revs (325.26/sec)
1934 [main] INFO  org.wikimedia.lsearch.importer.Importer  - Third pass, indexing articles...
94 pages (24.691/sec), 94 revs (24.691/sec)
5996 [main] INFO  org.wikimedia.lsearch.importer.Importer  - Closing/optimizing index...
Finished indexing in 5s, with final index optimization in 0s
Total time: 6s
6515 [main] INFO  org.wikimedia.lsearch.index.IndexThread  - Making snapshot for wikidb
6566 [main] INFO  org.wikimedia.lsearch.index.IndexThread  - Made snapshot /usr/local/search/indexes/snapshot/wikidb/20080512134828

And then, started lsearch daemon via console ;

> java -Djava.rmi.server.codebase=file:///usr/local/search/lucene-search-2svn05112008/LuceneSeach.jar -Djava.rmi.server.hostname=nen-tftp -jar /usr/local/search/lucene-search-2svn05112008/LuceneSearch.jar $*
RMI registry started.
Trying config file at path /root/.lsearch.conf
Trying config file at path /root/lsearch.conf
Trying config file at path /etc/lsearch.conf
log4j: Parsing for [root] with value=[INFO, A1].
log4j: Level token is [INFO].
log4j: Category root set to INFO
log4j: Parsing appender named "A1".
log4j: Parsing layout options for "A1".
log4j: Setting property [conversionPattern] to [%-4r [%t] %-5p %c %x - %m%n].
log4j: End of parsing for "A1".
log4j: Parsed "A1" options.
log4j: Finished configuring.
0    [main] INFO  org.wikimedia.lsearch.util.Localization  - Reading localization for En
2353 [main] INFO  org.wikimedia.lsearch.util.UnicodeDecomposer  - Loaded unicode decomposer
2603 [main] INFO  org.wikimedia.lsearch.interoperability.RMIServer  - RMIMessenger bound
2885 [main] INFO  org.wikimedia.lsearch.interoperability.RMIServer  - RemoteSearchable<wikidb> bound
2929 [Thread-2] INFO  org.wikimedia.lsearch.frontend.HTTPIndexServer  - Started server at port 8321
2930 [Thread-3] INFO  org.wikimedia.lsearch.frontend.SearchServer  - Binding server to port 8123
2935 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index wikidb ...
4265 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warmed up wikidb in 1329 ms
4266 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index wikidb ...
5110 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warmed up wikidb in 844 ms
5110 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index wikidb ...
5922 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warmed up wikidb in 811 ms

My mediawiki's Special:Version page shows LuceneSearch (version 2.0) is installed properly. Yet, when I do any type of search in my MediaWiki, the page comes up displaying the following error;

Fatal error: Call to undefined function wfLoadExtensionMessages() in /var/www/htdocs/wiki/extensions/LuceneSearch/LuceneSearch_body.php on line 85

The lsearch daemon console output shows nothing new since I started it! To me that indicates the search isn't being passed to the lsearch daemon. Reviewing the debug log at /var/log/mediawiki/debug_log.txt, I'm seeing this:

Start request
GET /wiki/index.php/Special:Search?search=wiki&fulltext=Search
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://nen-tftp.techiekb.com/wiki/index.php/Main_Page
Cookie: wikidbUserName=Rprior; wikidb_session=buvigq1obd1nd5ulbk1l8d83s7; wikidbUserID=2; wikidbToken=dd6c9b732dba0c94b04ad72044d46d79
Authorization: Basic ZGNvb3Blcjp0ZXN0cGFzcw==

Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
Unstubbing $wgParser on call of $wgParser->setHook from require_once
Fully initialised
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Language::loadLocalisation(): got localisation for en from source
Unstubbing $wgUser on call of $wgUser->isAllowed from Title::userCanRead
Cache miss for user 2
Unstubbing $wgLoadBalancer on call of $wgLoadBalancer->getConnection from wfGetDB
Logged in from session

Just as an added note here, the file /var/www/htdocs/wiki/extensions/LuceneSearch/LuceneSearch_body.php includes the following on line #85 thru #89 ;

                wfLoadExtensionMessages( 'LuceneSearch' );
                $fname = 'LuceneSearch::execute';
                wfProfileIn( $fname );
                $this->setHeaders();
                $wgOut->addHTML('<!-- titlens = '. $wgTitle->getNamespace() . '-->');

Any chance you've got an idea on how to fix this issue? =) I am thinking I may just have to update to MediaWiki SVN and try MWSearch if I cannot get this going on my current MediaWiki install, yet I'd LOVE to fix this if possible. Please help me! =) --Agentdcooper 21:12, 12 May 2008 (UTC)

2008-05-12 :: Installed Mediawiki SVN + Lucene-Search SVN & MWSearch SVN, still getting ZERO search results

I flat-out installed a new version of MediaWiki from SVN, along with Lucene-search SVN and MWSearch SVN version r34306 (all Subversion downloads from 05.12.2008, with lucene-search-2 SVN being from 05.11.2008).

My config files
/etc/lsearch.conf

MWConfig.global=file:///etc/lsearch-global.conf
MWConfig.lib=/usr/local/search/lucene-search-2svn05112008/lib
Indexes.path=/usr/local/search/indexes
Search.updateinterval=1
Search.updatedelay=0
Search.checkinterval=30
Index.snapshotinterval=5
Index.maxqueuecount=5000
Index.maxqueuetimeout=12
Storage.master=localhost
Storage.username=newwikiuser
Storage.password=testpass
Storage.useSeparateDBs=false
Storage.defaultDB=wikidbnew
Storage.lib=/usr/local/search/lucene-search-2svn05112008/sql
SearcherPool.size=3
Localization.url=file:///var/www/htdocs/wiki-test/languages/messages
Logging.logconfig=/etc/lsearch.log4j
Logging.debug=true

/etc/lsearch-global.conf

[Database]
wikidbnew : (single) (language,en) (warmup,10)
[Index]
nen-tftp : wikidbnew
[Index-Path]
<default> : /usr/local/search/indexes
[OAI]
wiktionary : http://$lang.wiktionary.org/w/index.php
wikilucene : http://localhost/wiki-lucene/phase3/index.php
<default> : http://$lang.wikipedia.org/w/index.php
[Properties]
Database.suffix=wiki wiktionary wikidbnew
KeywordScoring.suffix=wikidbnew wiki wikilucene wikidev
ExactCase.suffix=wikidbnew wiktionary wikilucene
[Namespace-Prefix]
all : <all>
[0] : 0
[1] : 1
[2] : 2
[3] : 3
[4] : 4
[5] : 5
[6] : 6
[7] : 7
[8] : 8
[9] : 9
[10] : 10
[11] : 11
[12] : 12
[13] : 13
[14] : 14
[15] : 15

/etc/lsearch.log4j

log4j.rootLogger=INFO, A1
log4j.appender.A1=org.apache.log4j.ConsoleAppender
log4j.appender.A1.layout=org.apache.log4j.PatternLayout
log4j.appender.A1.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n

command-line for indexing my wiki (now in a script called /var/www/htdocs/wiki-test/dumpBackup.sh)

php maintenance/dumpBackupInit.php --current --quiet > wikidbnew.xml && java -cp /usr/local/search/lucene-search-2svn05112008/LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s /var/www/htdocs/wiki-test/wikidbnew.xml wikidbnew

command-line to start lsearch daemon (now in a script called /usr/local/search/lucene-search-2svn05112008/lsearchd)

java -Djava.rmi.server.codebase=file:///usr/local/search/lucene-search-2svn05112008/LuceneSeach.jar -Djava.rmi.server.hostname=nen-tftp -jar /usr/local/search/lucene-search-2svn05112008/LuceneSearch.jar $*

PHP Version 5.2.5 was configured with command line that enabled curl

switches used '--with-curl=shared' '--with-curlwrappers'
cURL support = enabled
cURL Information = libcurl/7.16.2 OpenSSL/0.9.8e zlib/1.2.3 libidn/0.6.10
  • the mySQL DB wikidbnew does show a table called searchindex sized 20.5 KiB, which appears to be populated correctly with search info from my wikidb.

config/install of new mediawiki SVN
I ran through the basic config/install of MediaWiki and put some data into the basic wiki, something I knew could be searched easily. I built the index, and it seems to build without error; everything just works. But when I issue a search from the main wiki page, I get ZERO search results, even though the original MediaWiki search DID find these terms when it was just a basic MediaWiki install, before I installed the Lucene-Search and MWSearch extensions.

mediawiki search results = ZERO
What seems strange here is that everything seems to work, up to the point of searching through my wiki! When I search in the wiki, I get the following ZERO results message:

No page text matches

Note: Only some namespaces are searched by default. Try prefixing your query with all: to search all content (including talk pages, templates, etc), or use the desired namespace as prefix. 

mediawiki debug file
When I look at the MediaWiki debug file, /var/log/mediawiki/debug_mediawiki-wiki-test_log.txt, it shows the following when a search is submitted for 'wiki' (which exists in multiple locations on the main page of the wiki):

Start request
GET /wiki-test/index.php?title=Special%3ASearch&search=wiki&ns0=1&ns1=1&ns2=1&ns3=1&ns4=1&ns5=1&ns6=1&ns7=1&ns8=1&ns9=1&ns10=1&ns11=1&ns12=1&ns13=1&ns14=1&ns15=1&fulltext=Search
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://nen-tftp.techiekb.com/wiki-test/index.php/Special:Search?search=wiki&fulltext=Search
Cookie: wikidbUserName=Rprior; wikidb_session=buvigq1obd1nd5ulbk1l8d83s7; wikidbUserID=2; wikidbToken=dd6c9b732dba0c94b04ad72044d46d79; wikidbnew_session=gvchrcs1cf12uvdukl1odpapk7; wikidbnewUserID=1; wikidbnewUserName=Rprior; wikidbnewToken=ef9b27fc68ffacb8c7362b31ea27e292
Authorization: Basic ZGNvb3Blcjp0ZXN0cGFzcw==

Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
session_set_cookie_params: "0", "/", "", "", "1"
Fully initialised
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Language::loadLocalisation(): got localisation for en from source
Unstubbing $wgOut on call of $wgOut->setArticleRelated from SpecialPage::setHeaders
Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey
Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
Unstubbing $wgUser on call of $wgUser->getOption from StubUserLang::_newObject
Cache miss for user 1
Connecting to localhost wikidbnew...
Connected
Logged in from session
MessageCache::load(): got from global cache
Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform
Preprocessor_Hash::preprocessToObj
$1 - {{SITENAME}}
Preprocessor_Hash::preprocessToObj
$1 - {{SITENAME}}
Preprocessor_Hash::preprocessToObj
You searched for '''[[:wiki]]'''
Preprocessor_Hash::preprocessToObj
For more information about searching {{SITENAME}}, see [[{{MediaWiki:Helppage}}|{{int:help}}]].
Preprocessor_Hash::preprocessToObj
Help:Contents
Fetching search data from http://nen-tftp.techiekb.com:8123/search/wikidbnew/wiki?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15&offset=0&limit=100&version=2&iwlimit=10
Http::request: GET http://nen-tftp.techiekb.com:8123/search/wikidbnew/wiki?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15&offset=0&limit=100&version=2&iwlimit=10
total [0] hits
Preprocessor_Hash::preprocessToObj
==No page text matches==

Preprocessor_Hash::preprocessToObj
'''Note''': Only some namespaces are searched by default. Try prefixing your query with ''all:'' to search all content (including talk pages, templates, etc), or use the desired namespace as prefix.
Preprocessor_Hash::preprocessToObj
Search in namespaces:<br />$1<br />
Preprocessor_Hash::preprocessToObj

Preprocessor_Hash::preprocessToObj

Preprocessor_Hash::preprocessToObj
Search for $1 $2
Preprocessor_Hash::preprocessToObj
{{SITENAME}} ({{CONTENTLANGUAGE}})
Preprocessor_Hash::preprocessToObj
About {{SITENAME}}
Preprocessor_Hash::preprocessToObj
About {{SITENAME}}
Preprocessor_Hash::preprocessToObj
From {{SITENAME}}
Preprocessor_Hash::preprocessToObj
Search {{SITENAME}}
OutputPage::sendCacheControl: private caching;  **
Request ended normally

pointing a browser at the link in debug file
Here's the deal though: if I go to the link from the debug log in lynx or a browser ("http://nen-tftp.techiekb.com:8123/search/wikidbnew/wiki?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15&offset=0&limit=100&version=2&iwlimit=10"), I get this output:

1
1.0 0 Main_Page

HELP :: where am I going wrong??
MediaWiki gives me no results, and the debug log above shows total [0] hits. Why am I getting zero hits? No matter what I do, I get zero hits! Can you see anything I am doing wrong here? Please help =) --Agentdcooper 00:41, 13 May 2008 (UTC)

just to note: if I grep the file /var/www/htdocs/wiki-test/wikidbnew.xml for the same word I am searching for, I get MANY hits!? --Agentdcooper 00:51, 13 May 2008 (UTC)
OK then, try adding wfDebug($data); somewhere around line 564 in MWSearch.php. This should print to the MediaWiki debug log the same data you're seeing when you directly access the search URL. If it doesn't print anything, then something is wrong with your curl. --Rainman 09:06, 13 May 2008 (UTC)
Well, I think you are on to something there! So here's the deal: I put wfDebug($data); on line #565, by itself. I then re-ran the index command and restarted the lsearch daemon so I could watch the console output via an SSH session. I loaded up the main wiki page and did a basic search for the word "wiki". Here's what happens:
After pushing the search button within the wiki, it takes me to a blank page. My browser's address bar shows "http://<mydomain.com>/wiki-test/index.php/Special:Search?search=wiki&fulltext=Search", yet the page is completely blank. Watching the console output from the lsearch daemon, it shows the following:
629776 [pool-1-thread-5] INFO  org.wikimedia.lsearch.frontend.HttpHandler  - query:/search/wikidbnew/wiki?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15&offset=0&limit=100&version=2&iwlimit=10 what:search dbname:wikidbnew term:wiki
629780 [pool-1-thread-5] INFO  org.wikimedia.lsearch.search.SearchEngine  - Using NamespaceFilterWrapper wrap: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
629784 [pool-1-thread-5] INFO  org.wikimedia.lsearch.search.SearchEngine  - search wikidbnew: query=[wiki] parsed=[contents:wiki (title:wiki^6.0 stemtitle:wiki^2.0) (alttitle1:wiki^4.0 alttitle2:wiki^4.0 alttitle3:wiki^4.0) (keyword1:wiki^0.02 keyword2:wiki^0.01 keyword3:wiki^0.0066666664 keyword4:wiki^0.0050 keyword5:wiki^0.0039999997)] hit=[1] in 5ms using IndexSearcherMul:1210691193858
my debug log @ /var/log/mediawiki/debug_mediawiki-wiki-test_log.txt scrolls the following by, right when I do that "wiki" search ;;
Start request
GET /wiki-test/index.php/Special:Search?search=wiki&fulltext=Search
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://nen-tftp.techiekb.com/wiki-test/index.php/Main_Page
Cookie: wikidbUserName=Rprior; wikidb_session=buvigq1obd1nd5ulbk1l8d83s7; wikidbUserID=2; wikidbToken=dd6c9b732dba0c94b04ad72044d46d79; wikidbnew_session=gvchrcs1cf12uvdukl1odpapk7; wikidbnewUserID=1; wikidbnewUserName=Rprior; wikidbnewToken=ef9b27fc68ffacb8c7362b31ea27e292
Authorization: Basic ZGNvb3Blcjp0ZXN0cGFzcw==

Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
session_set_cookie_params: "0", "/", "", "", "1"
Fully initialised
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Language::loadLocalisation(): got localisation for en from source
Unstubbing $wgOut on call of $wgOut->setArticleRelated from SpecialPage::setHeaders
Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey
Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
Unstubbing $wgUser on call of $wgUser->getOption from StubUserLang::_newObject
Cache miss for user 1
Connecting to localhost wikidbnew...
Connected
Logged in from session
MessageCache::load(): got from global cache
Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform
Preprocessor_Hash::preprocessToObj
$1 - {{SITENAME}}
Preprocessor_Hash::preprocessToObj
$1 - {{SITENAME}}
Preprocessor_Hash::preprocessToObj
You searched for '''[[:wiki]]'''
Preprocessor_Hash::preprocessToObj
For more information about searching {{SITENAME}}, see [[{{MediaWiki:Helppage}}|{{int:help}}]].
Preprocessor_Hash::preprocessToObj
Help:Contents
Fetching search data from http://nen-tftp.techiekb.com:8123/search/wikidbnew/wiki?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15&offset=0&limit=100&version=2&iwlimit=10
Http::request: GET http://nen-tftp.techiekb.com:8123/search/wikidbnew/wiki?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15&offset=0&limit=100&version=2&iwlimit=10
If I go to that link at the bottom of the debug log, the following is displayed in my browser:
1
1.0 0 Main_Page
So, what are you thinking, boss? Is it my curl install? If that's the case: Slackware v12.1 just came out, and it appears they updated Apache to v2.2.8 and PHP to v5.2.6, yet Slackware v12.1 still uses the curl v7.16.2 package, which is the same version I'm running now, although it has been repackaged. Hmmmm... what do you think, Rainman? BTW, thanks a million for your assistance! I really can't wait to get this Lucene search functionality working for my MediaWiki project! --Agentdcooper 15:33, 13 May 2008 (UTC)
Any ideas, anyone? I am stuck... please help. --Agentdcooper 03:38, 15 May 2008 (UTC)
I am going to install Slackware v12.1 as a FRESH install on a new computer and try this all over again, to see if it may be something I messed up along the way. I will report back with my results... In case someone ends up reading the above and can make a suggestion, I'm all ears! I will be keeping the Slackware 12.0 install separate, and would love to hear from someone on how I might go about fixing it! -peace- --Agentdcooper 20:52, 15 May 2008 (UTC)

Currently I'm updating to a newer OS, but is that REALLY necessary?
I am downloading the Slackware 12.1 ISOs right now, but it bewilders me why I would need the latest and greatest OS to run MediaWiki. As I understood it, MediaWiki can run on all sorts of Linux-based OSes/distributions and doesn't necessarily need the best hardware. I've detailed my problems heavily above, and I am hoping someone can help me before my new, rather large 2.0 GB OS download completes (it'll take a couple of days, due to my slow net connection right now). I'd really like to fix what's broken before updating my entire OS. Thanks for all the help so far! --Agentdcooper 03:24, 19 May 2008 (UTC)

Lucene-search wrecks Special:ListUsers

When using Lucene-search version 2.0.2 (the current version as of this date) under mediawiki 1.10.x, I found that the special page Special:ListUsers stays blank. Turning on error reporting revealed a fatal error:

Fatal error: Class 'ApiQueryGeneratorBase' not found in /srv/www/htdocs/mediawiki/extensions/LuceneSearch/ApiQueryLuceneSearch.php on line 33 

I found that this can be solved by adding the line

require_once($IP.'/includes/api/ApiQueryBase.php');

into the file LuceneSearch_body.php (right below the require statement which is already there).

Lexw 12:38, 17 July 2008 (UTC)

Exception resolution

If you have an error such as

Exception in thread "main" java.lang.NullPointerException
        at java.io.File.<init>(Unknown Source)
        at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:117)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:204)
        at org.wikimedia.lsearch.importer.SimpleIndexWriter.openIndex(SimpleIndexWriter.java:67)
        at org.wikimedia.lsearch.importer.SimpleIndexWriter.<init>(SimpleIndexWriter.java:49)
        at org.wikimedia.lsearch.importer.DumpImporter.<init>(DumpImporter.java:39)
        at org.wikimedia.lsearch.importer.Importer.main(Importer.java:128)

when running the index creation, it may be because your host name changed (check $HOSTNAME on the command line). In that case, update the host name in lsearch-global.conf.

Darkoneko m'écrire 13:22, 23 July 2008 (UTC)

LuceneSearch is not available anymore?

The LuceneSearch extension was developed for MediaWiki version 1.12, which IS the current version. But the box at the top of the page says it is not to be used with the current version, and the extension is not available in SVN any more. WHY is that? Am I missing something? Oduvan 13:37, 7 August 2008 (UTC)

Seems like someone moved some extensions around. I've updated the link on Extension:LuceneSearch to point to the right location. --Rainman 19:49, 7 August 2008 (UTC)

Running multiple lsearch daemons

Hi, I am setting up a server which hosts several wikis. We want to use the Lucene search for some of them, so I have to configure several lsearch daemons.

Although I changed the Search.port variable in the lsearch.conf file (Search.port=8124), after starting the first lsearch, the second lsearch daemon complains that port 8123 is already in use.

Log from first lsearch:

452  [main] INFO  org.wikimedia.lsearch.interoperability.RMIServer  - RMIMessenger bound
493  [main] INFO  org.wikimedia.lsearch.interoperability.RMIServer  - RemoteSearchable<hiflydb> bound
495  [Thread-1] INFO  org.wikimedia.lsearch.frontend.HTTPIndexServer  - Started server at port 8321
495  [Thread-2] INFO  org.wikimedia.lsearch.frontend.SearchServer  - Binding server to port 8123
497  [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index hiflydb ...

Log from second lsearch:

471  [main] INFO  org.wikimedia.lsearch.interoperability.RMIServer  - RMIMessenger bound
511  [main] INFO  org.wikimedia.lsearch.interoperability.RMIServer  - RemoteSearchable<sgidb> bound
514  [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index sgidb ...
565  [Thread-1] INFO  org.wikimedia.lsearch.frontend.HTTPIndexServer  - Started server at port 8322
565  [Thread-2] INFO  org.wikimedia.lsearch.frontend.SearchServer  - Binding server to port 8123
565  [Thread-2] FATAL org.wikimedia.lsearch.frontend.SearchServer  - Error: bind error: Address already in use

What am I doing wrong?

Thanks for your help. --2 October 2008


Hi,

I ran into the same problem, and I found out that the SearchServer class does not parse the configuration for Search.port.

The HTTPIndexServer, on the other hand, parses the configuration for Index.port.

I suggest that there should be code like the following in the SearchServer class as well.

[...]
public class HTTPIndexServer extends Thread {
[...]
 int port = config.getInt("Index","port",8321);
[...]

I will try this out and hopefully post successful results afterwards.

Regards, -- Voglerp 14:21, 20 October 2008 (UTC)

So here are my test results:
I added the following two lines into the SearchServer class

[...]
public class SearchServer extends Thread {
[...]
  org.apache.log4j.Logger log = Logger.getLogger(SearchServer.class);
		
  // Read port setting from the config file; if not found, use the default
  port = config.getInt("Search","port",8123);

  log.info("Searcher started on port " + port);
[...]

Now the Searcher listens on the port specified in the configuration, or on the default port 8123. But a new problem is that it is no longer possible to specify the port on the command line with -port.

Is it possible to change the code so that both options work?

Kind regards, Peter --Voglerp 08:06, 23 October 2008 (UTC)
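
One possible way to make both options work is to give the command-line flag precedence and fall back to the configuration file, then to the default. The following is only a minimal sketch of that precedence rule, not the actual SearchServer code: the class PortResolver and the method resolvePort are hypothetical names, and java.util.Properties with a "Search.port" key stands in for the lucene-search configuration object, whose real interface is the config.getInt("Search","port",8123) call quoted above.

```java
import java.util.Properties;

// Hypothetical helper, not part of lucene-search: illustrates the
// precedence "-port on the command line > Search.port in config > default".
final class PortResolver {
    static final int DEFAULT_PORT = 8123;

    static int resolvePort(String[] args, Properties config) {
        // 1. A "-port <n>" pair on the command line wins if present.
        for (int i = 0; i < args.length - 1; i++) {
            if ("-port".equals(args[i])) {
                return Integer.parseInt(args[i + 1]);
            }
        }
        // 2. Otherwise use the configured value (assumed key: Search.port).
        String fromConfig = config.getProperty("Search.port");
        if (fromConfig != null) {
            return Integer.parseInt(fromConfig.trim());
        }
        // 3. Otherwise fall back to the default port.
        return DEFAULT_PORT;
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("Search.port", "8124");
        System.out.println("Searcher would bind to port " + resolvePort(args, config));
    }
}
```

In the real class, the same idea would presumably mean reading the configured value only when no -port argument was given, rather than overwriting the port unconditionally.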

Error when trying to run lsearch daemon

Every time I run lsearchd I get the following error:

 RMI registry started.
     [java] Trying config file at path /root/.lsearch.conf
     [java] Trying config file at path /usr/local/search/ls2-bin/lsearch.conf
     [java] Ignoring a line up to first section heading...
     [java] Ignoring a line up to first section heading...
     [java] Ignoring a line up to first section heading...
     [java] Ignoring a line up to first section heading...
     [java] Ignoring a line up to first section heading...
     [java] Ignoring a line up to first section heading...
     [java] Ignoring a line up to first section heading...
     [java] Ignoring a line up to first section heading...
     [java] Ignoring a line up to first section heading...
     [java] ERROR in GlobalConfiguration: Default path for index absent. Check section [Index-Path].

and this is what the [Index-Path] section of the global config looks like:

# Rsync path where indexes are on hosts, after default value put
# hosts where the location differs
# Syntax: host : <path>
[Index-Path]
<default> : /mwsearch

Any suggestions?

--Dgat16 20:28, 12 October 2008 (UTC)

Try the following:

# Path where indexes are on hosts, after default value put hosts where 
# the location differs
[Index-Path]
<default> : /mwsearch
127.0.0.1 : mwsearch2

--Bachenberg 13:15, 27 August 2009 (UTC)

Need help for small wiki farm

I have a small wiki farm with two wikis, mywiki-en and mywiki-de, running on the same wiki software and sharing the same MySQL database, wikidb.

The mysql tables for both wikis are prefixed, with en_ or with de_ respectively.

mywiki-en is in English

mywiki-de is in German.

I know how to make two separate dump files, wikidb_en.xml and wikidb_de.xml, by using the commands

export REQUEST_URI=/wiki/en && php /wwd/wiki/maintenance/dumpBackup.php --current --quiet > wikidb_en.xml
export REQUEST_URI=/wiki/de && php /wwd/wiki/maintenance/dumpBackup.php --current --quiet > wikidb_de.xml

My question is: how do I configure Lucene and mwsearch, so that

- for searches in /wiki/de it uses the indexes created from wikidb_de.xml,

- for searches in /wiki/en it uses the indexes created from wikidb_en.xml

I do not want hits from the English wiki to show up as search results for queries in /wiki/de, or the other way around.

I also need to know how to configure lsearch-global.conf. So far I have written there

[Database]
wikidb : (single) (language,en) (warmup,10)

but this is of course not correct: the database wikidb contains two wikis, one in German and one in English.

I hope that somebody can help me a bit.

Thank you, Alois 16:06, 29 October 2008 (UTC)

Searching what the user sees or searching what's behind the scenes

It seems to make no sense to search the unrendered wikitext rather than the final product. I don't see why wiki comments ( <!-- such as this --> ) should be included in the search while the contents of included templates are not. It really should be the other way round.
Wikis using the Semantic MediaWiki extension also find that the results of inline queries are excluded from the search; that also seems like something that needs to change.
Perhaps there is a place for a search that looks behind the scenes. It may be of interest to a wiki-site manager, but for a standard user the search really needs to be of the actual page contents.
Pnelnik 17:41, 28 November 2008 (UTC)

Agreed that it doesn't. However, it is not a matter of whether it makes sense, but of whether it is difficult or easy to do. There is no easy way to reconstruct articles with templates from very large XML dumps, and no advanced way to integrate updates from OAI with templates, queues and such. This is one of those places where the flexibility of MW in one regard (e.g. syntax and caching) comes at a huge cost in another (the ability to have a decent search). --Rainman 02:04, 30 November 2008 (UTC)

Lack of sane defaults

This extension suffers from a lack of sane defaults, which makes setting it up unnecessarily confusing. I will give some examples from the instructions.

  • mwdumper.jar: should be IN subversion. There is no reason to have to checkout the code for the extension and then get another file
  • speaking of subversion, the root should be moved up a level. The root should not be 'lucene-search-2' if you are going to ask them to put that in a parent directory called 'search'. The root should be 'search', and it should already contain the 'indexes' subdirectory. The instructions should then read 'svn co http://svn.wikimedia.org/svnroot/mediawiki/trunk/search /usr/local/search'.
  • MWConfig.global: specifically asks for a "URL", which has a very specific meaning, and gives an example of only a URL. That's great for a multi-host configuration, which most MediaWiki installations are not. The default path to this file should be /usr/local/search/ls2/lsearch-global.conf. If this is not an acceptable path, you should say so. The file:/// prefix that is used in these wiki instructions is not what people expect to see.
  • MWConfig.lib: Here you use a standard path, which people normally expect. But this is NOT what they expect since you have told them to use 'file:///' in the previous instructions on the wiki (but not in the configuration file). This is confusing!!!!
  • Localization.url: Back to the file:/// prefix. AGHHHH. There is no need to specify that it is a file. File paths are unambiguous without file:///.
  • Logging.logconfig: There is no reason to prompt the user for the location of this file if you put it in the ls2 directory by default, and make that the default location.

I believe that, up to this point, every single configuration step could have been avoided if there had been sane defaults in place. I don't have the energy to do the rest. --Alterego 18:47, 5 January 2009 (UTC)

I agree that the configuration is overly complicated; that is why the devel branch has a one-step script that will generate and connect all of the configuration in single-host installs. As for the URL/local file distinction, it follows a simple rule: everything that is global and shared across the search cluster (e.g. global config and MW files) is a URL, and everything local (e.g. local config, indexes path, local log4j config and library files...) is a local path, although that is probably not obvious from the variable names... --Rainman 19:20, 5 January 2009 (UTC)
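
To illustrate the distinction, here is a small sketch of how a setting could accept either form, so that a plain path and a file:/// URL for the same file are treated alike. This is not how lucene-search reads its configuration; ConfigLocation, isUrl and toLocalPath are hypothetical names, and the scheme check is a simplified assumption that does not handle Windows drive letters.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of lucene-search: normalises a setting that
// may be written either as a plain local path or as a file:/// URL.
final class ConfigLocation {

    // True if the value starts with a URL scheme such as "file:/" or "http:/".
    static boolean isUrl(String value) {
        return value.matches("[a-zA-Z][a-zA-Z0-9+.-]*:/.*");
    }

    // Returns the local path for a plain path or a file: URL;
    // rejects other schemes (e.g. http), which are not local files.
    static Path toLocalPath(String value) {
        if (value.startsWith("file:")) {
            return Paths.get(URI.create(value));
        }
        if (isUrl(value)) {
            throw new IllegalArgumentException("Not a local file: " + value);
        }
        return Paths.get(value);
    }

    public static void main(String[] args) {
        System.out.println(toLocalPath("file:///usr/local/search/ls2/lsearch-global.conf"));
        System.out.println(toLocalPath("/usr/local/search/ls2/lsearch-global.conf"));
    }
}
```

With something like this, a single-host install could write a bare path for the global settings, while a cluster could still point them at a shared URL.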

LSEARCH Daemon init script for SUSE

  • From Pierre Boisvert.
  • This is our init script for the daemon. It is simple but works for us, so it could help others as well.
#!/bin/bash
# chkconfig: 2345 80 20
# description: Apache Lucene is a high-performance, full-featured text \
#              search engine library written entirely in Java
# processname: lsearchd
# config: /etc/lsearch.conf
# pidfile: /var/run/lsearchd.pid

# Source function library.
. /etc/rc.status

JAVA=/usr/bin/java
PROG=lsearchd
BASEDIR=/usr/local/bin/ls2-bin
LOG_FILE=/var/log/lsearchd.log
PID_FILE=/var/run/lsearchd.pid
PROG_BIN="$JAVA -Djava.rmi.server.codebase=file://$BASEDIR/LuceneSearch.jar -Djava.rmi.server.hostname=$HOSTNAME -jar $BASEDIR/LuceneSearch.jar"
CHECK_PROC=`ps -ef | grep $JAVA | grep -v grep | wc -l`

rc_reset

start() {
    echo -n $"Starting $PROG: "
    if [ ! -f $PID_FILE ]
    then
        $PROG_BIN >$LOG_FILE $* 2>&1 &  echo $! > $PID_FILE
    else
        if [ $CHECK_PROC -gt 0 ]
        then
                echo "The LSEARCHD Daemon already started"
                rc_failed
        else
                echo "Removing old Pid file..."
                rm $PID_FILE
                $PROG_BIN $* >$LOG_FILE 2>&1 &  echo $! > $PID_FILE
        fi
    fi
    rc_status -v

}
stop() {
    echo -n $"Stopping $PROG: "
    /sbin/killproc -p $PID_FILE -v $JAVA
    rc_status -v
}
status(){
        echo -n "Checking for Lsearchd daemon "
        checkproc -p $PID_FILE $JAVA
        rc_status -v
}
usage() {
    echo $"Usage: ${PROG} {start|stop|status|restart}"
    exit 1
}

# See how we were called.
case "$1" in
    start)      start;;
    stop)       stop;;
    status)     status;;
    restart)    stop && start;;
    *)          usage;;
esac
rc_exit