Extension talk:Lucene-search/archive/2008

=2008=

==Compiling to create lucenesearch.jar failed==
I am trying to install the Lucene engine for our wiki, but compiling lucene-search fails.

Ant reports many errors during compilation, such as:

[javac] /root/lucene/lucene-search/org/wikimedia/lsearch/SearchState.java:331: cannot find symbol
[javac] symbol : class Hits
[javac] location: class org.wikimedia.lsearch.SearchState
[javac]            Hits hits = searcher.search(new TermQuery(
[javac]                ^
[javac] /root/lucene/lucene-search/org/wikimedia/lsearch/SearchState.java:331: cannot find symbol
[javac] symbol : class TermQuery
[javac] location: class org.wikimedia.lsearch.SearchState
[javac]            Hits hits = searcher.search(new TermQuery(
[javac]                                                ^
[javac] /root/lucene/lucene-search/org/wikimedia/lsearch/SearchState.java:332: cannot find symbol
[javac] symbol : class Term
[javac] location: class org.wikimedia.lsearch.SearchState
[javac]                            new Term("key", key)));
[javac]                                    ^
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 85 errors

Can you help me to solve these error messages or provide a binary?

Many thanks in advance. --Phaidros 12 January 2008


 * Are you compiling with a Sun Java 1.5+ compiler? If so, can you provide the beginning of the error log? --Rainman 00:55, 13 January 2008 (UTC)

Yes, I am using the openSUSE 10.3 distribution and javac 1.5.0_13. I hope the error messages below are enough; sorry for my limited experience with Java build processes.

I'll provide the first part and the last messages here:

Apache Ant version 1.7.0 compiled on September 22 2007
Buildfile: build.xml
Detected Java version: 1.5 in: /usr/lib/jvm/java-1.5.0-sun-1.5.0_update13-sr2/jre
Detected OS: Linux
parsing buildfile /root/lucene/lucene-search/build.xml with URI = file:/root/lucene/lucene-search/build.xml
Project base dir set to: /root/lucene/lucene-search
[antlib:org.apache.tools.ant] Could not load definitions from resource org/apache/tools/ant/antlib.xml. It could not be found.
[property] Loading /root/lucene-search.build.properties
[property] Unable to find property file: /root/lucene-search.build.properties
[property] Loading /root/build.properties
[property] Unable to find property file: /root/build.properties
[property] Loading /root/lucene/lucene-search/build.properties
[property] Unable to find property file: /root/lucene/lucene-search/build.properties
Property "current.year" has not been set
Build sequence for target(s) `default' is [init, compile-core, compile, default]
Complete build sequence is [init, compile-core, compile, default, package-tgz-src, jar-core, javadocs, package, package-zip, package-tgz, package-all-binary, dist, package-zip-src, package-all-src, dist-src, dist-all, jar, jar-src, clean, ]

init:
[mkdir] Skipping /root/lucene/lucene-search/bin because it already exists.
[mkdir] Skipping /root/lucene/lucene-search/dist because it already exists.

compile-core: [mkdir] Skipping /root/lucene/lucene-search/bin because it already exists. [javac] wikimedia/lsearch/Article.java added as wikimedia/lsearch/Article.class doesn't exist. [javac] wikimedia/lsearch/ArticleList.java added as wikimedia/lsearch/ArticleList.class doesn't exist. [javac] wikimedia/lsearch/Configuration.java added as wikimedia/lsearch/Configuration.class doesn't exist. [javac] wikimedia/lsearch/DatabaseConnection.java added as wikimedia/lsearch/DatabaseConnection.class doesn't exist. [javac] wikimedia/lsearch/EnglishAnalyzer.java added as wikimedia/lsearch/EnglishAnalyzer.class doesn't exist. [javac] wikimedia/lsearch/EsperantoAnalyzer.java added as wikimedia/lsearch/EsperantoAnalyzer.class doesn't exist. [javac] wikimedia/lsearch/EsperantoStemFilter.java added as wikimedia/lsearch/EsperantoStemFilter.class doesn't exist. [javac] wikimedia/lsearch/MWDaemon.java added as wikimedia/lsearch/MWDaemon.class doesn't exist. [javac] wikimedia/lsearch/MWSearch.java added as wikimedia/lsearch/MWSearch.class doesn't exist. [javac] wikimedia/lsearch/NamespaceFilter.java added as wikimedia/lsearch/NamespaceFilter.class doesn't exist. [javac] wikimedia/lsearch/QueryStringMap.java added as wikimedia/lsearch/QueryStringMap.class doesn't exist. [javac] wikimedia/lsearch/SearchClientReader.java added as wikimedia/lsearch/SearchClientReader.class doesn't exist. [javac] wikimedia/lsearch/SearchDbException.java added as wikimedia/lsearch/SearchDbException.class doesn't exist. [javac] wikimedia/lsearch/SearchState.java added as wikimedia/lsearch/SearchState.class doesn't exist. [javac] wikimedia/lsearch/Title.java added as wikimedia/lsearch/Title.class doesn't exist. [javac] wikimedia/lsearch/TitlePrefixMatcher.java added as wikimedia/lsearch/TitlePrefixMatcher.class doesn't exist. 
[javac] Compiling 16 source files to /root/lucene/lucene-search/bin [javac] Using modern compiler dropping /root/lucene/lucene-search/bin/bin from path as it doesn't exist [javac] Compilation arguments: [javac] '-deprecation' [javac] '-d' [javac] '/root/lucene/lucene-search/bin' [javac] '-classpath' [javac] '/root/lucene/lucene-search/bin:/usr/share/java/ant.jar:/usr/share/java/ant-launcher.jar:/usr/share/java/jaxp_parser_impl.jar:/usr/share/java/xml-commons-apis.jar:/usr/share/java/ant/ant-antlr.jar:/usr/share/java/bcel.jar:/usr/share/java/ant/ant-apache-bcel.jar:/usr/share/java/bsf.jar:/usr/share/java/ant/ant-apache-bsf.jar:/usr/share/java/log4j.jar:/usr/share/java/ant/ant-apache-log4j.jar:/usr/share/java/oro.jar:/usr/share/java/ant/ant-apache-oro.jar:/usr/share/java/regexp.jar:/usr/share/java/ant/ant-apache-regexp.jar:/usr/share/java/xml-commons-resolver.jar:/usr/share/java/ant/ant-apache-resolver.jar:/usr/share/java/jakarta-commons-logging.jar:/usr/share/java/ant/ant-commons-logging.jar:/usr/share/java/javamail.jar:/usr/share/java/jaf.jar:/usr/share/java/ant/ant-javamail.jar:/usr/share/java/jdepend.jar:/usr/share/java/ant/ant-jdepend.jar:/usr/share/java/ant/ant-jmf.jar:/usr/share/java/junit.jar:/usr/share/java/ant/ant-junit.jar:/usr/share/java/ant/ant-nodeps.jar:/usr/lib/jvm/java/lib/tools.jar:/usr/share/ant/lib/ant-apache-resolver-1.7.0.jar:/usr/share/ant/lib/ant-apache-bsf.jar:/usr/share/ant/lib/ant-nodeps.jar:/usr/share/ant/lib/ant-commons-logging.jar:/usr/share/ant/lib/ant-junit.jar:/usr/share/ant/lib/ant-javamail-1.7.0.jar:/usr/share/ant/lib/ant-junit-1.7.0.jar:/usr/share/ant/lib/ant-launcher.jar:/usr/share/ant/lib/ant-apache-log4j.jar:/usr/share/ant/lib/ant-apache-oro-1.7.0.jar:/usr/share/ant/lib/ant-javamail.jar:/usr/share/ant/lib/ant-apache-log4j-1.7.0.jar:/usr/share/ant/lib/ant-apache-bcel-1.7.0.jar:/usr/share/ant/lib/ant-nodeps-1.7.0.jar:/usr/share/ant/lib/ant-jmf.jar:/usr/share/ant/lib/ant-jmf-1.7.0.jar:/usr/share/ant/lib/ant-commons-logging-1.7.0.jar
:/usr/share/ant/lib/ant-jdepend-1.7.0.jar:/usr/share/ant/lib/ant-1.7.0.jar:/usr/share/ant/lib/ant-apache-regexp.jar:/usr/share/ant/lib/ant-apache-oro.jar:/usr/share/ant/lib/ant-apache-resolver.jar:/usr/share/ant/lib/ant-jdepend.jar:/usr/share/ant/lib/ant-antlr.jar:/usr/share/ant/lib/ant-antlr-1.7.0.jar:/usr/share/ant/lib/ant-apache-regexp-1.7.0.jar:/usr/share/ant/lib/ant-apache-bcel.jar:/usr/share/ant/lib/ant-apache-bsf-1.7.0.jar:/usr/share/ant/lib/ant-launcher-1.7.0.jar:/usr/share/ant/lib/ant.jar' [javac] '-sourcepath' [javac] '/root/lucene/lucene-search/org' [javac] '-encoding' [javac] 'utf-8' [javac] '-g' [javac] [javac] The ' characters around the executable and arguments are [javac] not part of the command. [javac] Files to be compiled: [javac]    /root/lucene/lucene-search/org/wikimedia/lsearch/Article.java [javac]    /root/lucene/lucene-search/org/wikimedia/lsearch/ArticleList.java [javac]    /root/lucene/lucene-search/org/wikimedia/lsearch/Configuration.java [javac]    /root/lucene/lucene-search/org/wikimedia/lsearch/DatabaseConnection.java [javac]    /root/lucene/lucene-search/org/wikimedia/lsearch/EnglishAnalyzer.java [javac]    /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoAnalyzer.java [javac]    /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoStemFilter.java [javac]    /root/lucene/lucene-search/org/wikimedia/lsearch/MWDaemon.java [javac]    /root/lucene/lucene-search/org/wikimedia/lsearch/MWSearch.java [javac]    /root/lucene/lucene-search/org/wikimedia/lsearch/NamespaceFilter.java [javac]    /root/lucene/lucene-search/org/wikimedia/lsearch/QueryStringMap.java [javac]    /root/lucene/lucene-search/org/wikimedia/lsearch/SearchClientReader.java [javac]    /root/lucene/lucene-search/org/wikimedia/lsearch/SearchDbException.java [javac]    /root/lucene/lucene-search/org/wikimedia/lsearch/SearchState.java [javac]    /root/lucene/lucene-search/org/wikimedia/lsearch/Title.java [javac]    
/root/lucene/lucene-search/org/wikimedia/lsearch/TitlePrefixMatcher.java [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EnglishAnalyzer.java:28: package org.apache.lucene.analysis does not exist [javac] import org.apache.lucene.analysis.Analyzer; [javac]                                  ^ [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EnglishAnalyzer.java:29: package org.apache.lucene.analysis does not exist [javac] import org.apache.lucene.analysis.LowerCaseTokenizer; [javac]                                  ^ [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EnglishAnalyzer.java:30: package org.apache.lucene.analysis does not exist [javac] import org.apache.lucene.analysis.PorterStemFilter; [javac]                                  ^ [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EnglishAnalyzer.java:31: package org.apache.lucene.analysis does not exist [javac] import org.apache.lucene.analysis.TokenStream; [javac]                                  ^ [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EnglishAnalyzer.java:37: cannot find symbol [javac] symbol: class Analyzer [javac] public class EnglishAnalyzer extends Analyzer { [javac]                                     ^ [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EnglishAnalyzer.java:38: cannot find symbol [javac] symbol : class TokenStream [javac] location: class org.wikimedia.lsearch.EnglishAnalyzer [javac]    public final TokenStream tokenStream(String fieldName, Reader reader) { [javac]                     ^ [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoAnalyzer.java:31: package org.apache.lucene.analysis does not exist [javac] import org.apache.lucene.analysis.Analyzer; [javac]                                  ^ [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoAnalyzer.java:32: package org.apache.lucene.analysis does not exist [javac] import org.apache.lucene.analysis.LowerCaseTokenizer; [javac]                     
             ^ [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoAnalyzer.java:33: package org.apache.lucene.analysis does not exist [javac] import org.apache.lucene.analysis.Token; [javac]                                  ^ [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoAnalyzer.java:34: package org.apache.lucene.analysis does not exist [javac] import org.apache.lucene.analysis.TokenStream; [javac]                                  ^ [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoAnalyzer.java:36: cannot find symbol [javac] symbol: class Analyzer [javac] public class EsperantoAnalyzer extends Analyzer{ [javac]                                       ^ [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoAnalyzer.java:37: cannot find symbol [javac] symbol : class TokenStream [javac] location: class org.wikimedia.lsearch.EsperantoAnalyzer [javac]    public final TokenStream tokenStream(String fieldName, Reader reader) { [javac]                     ^ [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoStemFilter.java:31: package org.apache.lucene.analysis does not exist [javac] import org.apache.lucene.analysis.Token; [javac]                                  ^ [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoStemFilter.java:32: package org.apache.lucene.analysis does not exist [javac] import org.apache.lucene.analysis.TokenStream; [javac]                                  ^ [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoStemFilter.java:33: package org.apache.lucene.analysis does not exist [javac] import org.apache.lucene.analysis.TokenFilter; [javac]                                  ^ [javac] /root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoStemFilter.java:36: cannot find symbol [javac] symbol: class TokenFilter [javac] public class EsperantoStemFilter extends TokenFilter { [javac]                                         ^ [javac] 
/root/lucene/lucene-search/org/wikimedia/lsearch/EsperantoStemFilter.java:37: cannot find symbol [javac] symbol : class TokenStream [javac] location: class org.wikimedia.lsearch.EsperantoStemFilter [javac]    public EsperantoStemFilter(TokenStream tokenizer) {

--- snip --- some lines cut here --- snip ---

[javac] /root/lucene/lucene-search/org/wikimedia/lsearch/SearchState.java:332: cannot find symbol
[javac] symbol : class Term
[javac] location: class org.wikimedia.lsearch.SearchState
[javac]                            new Term("key", key)));
[javac]                                    ^
[javac] Note: Some input files use unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 85 errors

BUILD FAILED
/root/lucene/lucene-search/build.xml:55: Compile failed; see the compiler error output for details.
    at org.apache.tools.ant.taskdefs.Javac.compile(Javac.java:999)
    at org.apache.tools.ant.taskdefs.Javac.execute(Javac.java:820)
    at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:288)
    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:105)
    at org.apache.tools.ant.Task.perform(Task.java:348)
    at org.apache.tools.ant.Target.execute(Target.java:357)
    at org.apache.tools.ant.Target.performTasks(Target.java:385)
    at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1329)
    at org.apache.tools.ant.Project.executeTarget(Project.java:1298)
    at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
    at org.apache.tools.ant.Project.executeTargets(Project.java:1181)
    at org.apache.tools.ant.Main.runBuild(Main.java:698)
    at org.apache.tools.ant.Main.startAnt(Main.java:199)
    at org.apache.tools.ant.launch.Launcher.run(Launcher.java:257)
    at org.apache.tools.ant.launch.Launcher.main(Launcher.java:104)

--Phaidros 24 January 2008
 * Looks like your Ant setup is broken and cannot find the relevant libraries. I've compiled the package and put it here. --Rainman 11:01, 26 January 2008 (UTC)
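
The "package org.apache.lucene.analysis does not exist" errors in the log above suggest that the Lucene classes were not on the compile classpath (the classpath printed by Ant lists only Ant's own jars). A minimal sketch for checking this from a given classpath follows; the class name ClasspathCheck is hypothetical and not part of lucene-search, and the jar path in the usage note is an assumption.

```java
// Hypothetical helper, not part of lucene-search: reports whether the
// Lucene classes named in the compiler errors are visible on the classpath.
public class ClasspathCheck {

    // Returns true if the class loader can locate the named class.
    static boolean isAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.lucene.analysis.Analyzer",
            "org.apache.lucene.search.TermQuery",
            "org.apache.lucene.index.Term"
        };
        for (String name : names) {
            System.out.println(name + ": " + (isAvailable(name) ? "found" : "MISSING"));
        }
    }
}
```

Run it with the same classpath the build is supposed to use, for example java -cp /path/to/lucene-core.jar:. ClasspathCheck (the jar location is a placeholder). If all three classes are reported missing, the build will fail the same way.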

==Cannot bind RMIMessenger exception: non-JRMP server at remote endpoint==
Hello everyone,

I'm quite new to Lucene and I have a problem: I can't get Lucene Java working on one of my servers. I've set it up on another server for MediaWiki and it works fine there.

The failing server is a GNU/Linux Ubuntu Edgy i686 with kernel 2.6.17-11-server, running Apache 2.0 with PHP5 for MediaWiki and some other things such as Tomcat and JBoss. Installed Java packages: j2re1.4, j2sdk1.4, java-common, libgcj-common, sun-java5-bin, sun-java5-demo, sun-java5-jdk and sun-java5-jre.

On the first server (a fresh Ubuntu Gutsy 64-bit with almost nothing running) it worked fine, and I can use Lucene to search my wiki. On the second server, here is the error I get when I try to start the engine:

But nothing is using port 8321. I've tried another port, with the same result. Any ideas how to solve this problem, please? Here is my contact:

Thanks, LMJ 15 January 2008
 * First verify that jboss, tomcat and lsearchd all run under sun-java5-bin (and not j2re1.4). If this is the case then maybe the RMI registry is colliding with jboss (so try stopping it if you can). If this appears to be the case, then you can either configure jboss not to use the port 1099, or edit RMIRegistry.java to use a different port (replace 1099 there with your port, and provide the port as param to getRegistry calls in RMIRegistry.java and RMIMessengerClient.java). --Rainman 15:05, 15 January 2008 (UTC)

The port is used by Jboss rmiregistry :-/ I need some extra help to change that port. Can we exchange emails about it Rainman? I tried to contact you via your personal page but I just read English & French ;) --16 January 2008
 * Indeed Rainman, thanks for your help! look at this :


 * I've edited  and changed the port to 10999. It seems to work better ;) I have another problem, but it seems to be an lsearch.conf-related issue. --22 January 2008
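
For anyone hitting the same collision: a quick way to see which of the ports mentioned in this thread are already taken (1099 for the RMI registry, 8123 and 8321 for lsearchd, 10999 as the replacement) is to try binding a listening socket on each. This is only a sketch; the class name PortCheck is hypothetical, and a port reported as taken still has to be traced to its owner (JBoss in this case) with the usual system tools.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper, not part of lucene-search: checks whether a TCP port
// can be bound on this machine right now.
public class PortCheck {

    // Returns true if a listening socket can currently be bound to the port.
    static boolean isFree(int port) {
        try (ServerSocket socket = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            // Typically a BindException: the address is already in use.
            return false;
        }
    }

    public static void main(String[] args) {
        // Ports taken from the discussion above.
        int[] ports = {1099, 8123, 8321, 10999};
        for (int port : ports) {
            System.out.println("port " + port + ": " + (isFree(port) ? "free" : "in use"));
        }
    }
}
```

Run it while JBoss and Tomcat are up but lsearchd is stopped; any port shown as "in use" would collide with the daemon.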

==Daemon status==
On the German Wikipedia, I am often irritated because changes in content are not reflected immediately by the full text search and – at the moment – I cannot see whether and when the changes have already or will be processed by the daemon. Therefore, I would like to know:


 * whether the daemon processes the changes chronologically so one could be certain that if one's changes were made at time T and the daemon has processed all changes up to T + 1, they will be reflected in the full text search, and
 * whether there is any way to obtain the daemon status (all changes up to T, n articles in queue, etc.) from a current or future Wikipedia installation.

Thanks, Tim Landscheidt 19:52, 7 February 2008 (UTC)


 * The index is updated around 5 am GMT every day on wikimedia projects (when nothing goes wrong which is most of the time). About 1) - yes, it processes the changes chronologically. 2) - this interface is available but only for system admins, for everybody else - just wait till tomorrow for changes to be applied. --Rainman 10:07, 8 February 2008 (UTC)


 * Hmmm. If I search for "Lassithi" (note the double "s") now, I see that changes in de:Panagia i Kera (8 days ago), de:Kritsa (7 days ago), de:Ierapetra (10 days ago), de:Kera Kardiotissa (11 days ago), de:Griechische Toponyme (11 days ago), de:Venezianische Kolonien (9 days ago) and de:Sitia (11 days ago) have not been processed. Is that what you mean by "when nothing goes wrong"? :-) Would it be technically feasible to include the last time a change was successfully worked into the index in the result page, i. e. "All changes until T considered."? Tim Landscheidt 17:24, 8 February 2008 (UTC)
 * Yes, this seems to be a case of "if nothing is broken" :) One of the dewiki search servers (srv21) is broken: it stopped updating its index and seems to have a broken logrotate and possibly some other problems. We'll fix it when a sysadmin becomes available. Whenever you see changes not going in for more than a couple of days, you should report it. --Rainman 18:07, 8 February 2008 (UTC)
 * Ok, we tracked this down to a hard drive failure on srv21, now one just needs to wait for cache to expire (~12h) and you should get fresh results - thanks for the report! --Rainman 18:57, 8 February 2008 (UTC)
 * Thanks for the information :-). What would be the proper place to report such things in the future? Tim Landscheidt 21:32, 8 February 2008 (UTC)
 * Technical issues are usually reported via the IRC channel #wikimedia-tech, where all of the sysadmins are. If there's no one online to fix the problem, you could submit a bug. You could also send me an e-mail via this wiki or leave a message on my talk page, since I'm more or less in charge of maintaining the search subsystem. --Rainman 21:44, 8 February 2008 (UTC)
 * Okay, I'll keep that in mind. Thanks again, Tim Landscheidt 22:53, 8 February 2008 (UTC)

==Query String Syntax==
Please document the subset of Lucene query string syntax that has been implemented. -- 216.143.51.66 22:52, 8 February 2008 (UTC)

==Error running the Daemon==
RMI registry started.
Trying config file at path /root/.lsearch.conf
Trying config file at path /var/www/vhosts/kidneycancerknol.com/httpdocs/Lucene/ls2-bin/lsearch.conf
0   [main] INFO  org.wikimedia.lsearch.util.Localization  - Reading localization for En
530  [main] INFO  org.wikimedia.lsearch.util.UnicodeDecomposer  - Loaded unicode decomposer
602 [main] INFO  org.wikimedia.lsearch.interoperability.RMIServer  - RMIMessenger bound
619 [main] ERROR org.wikimedia.lsearch.search.SearcherCache  - I/O Error opening index at path /var/www/vhosts/kidneycancerknol.com/httpdocs/Lucene/ls2-bin/indexes/search/kck_wiki : /var/www/vhosts/kidneycancerknol.com/httpdocs/Lucene/ls2-bin/indexes/search/kck_wiki/segments (No such file or directory)
621 [main] ERROR org.wikimedia.lsearch.search.SearcherCache  - I/O Error opening index at path /var/www/vhosts/kidneycancerknol.com/httpdocs/Lucene/ls2-bin/indexes/search/kck_wiki : /var/www/vhosts/kidneycancerknol.com/httpdocs/Lucene/ls2-bin/indexes/search/kck_wiki/segments (No such file or directory)
621 [main] WARN  org.wikimedia.lsearch.search.SearcherCache  - I/O error warming index for kck_wiki
621 [Thread-3] INFO  org.wikimedia.lsearch.frontend.SearchServer  - Binding server to port 8123
623 [Thread-2] INFO  org.wikimedia.lsearch.frontend.HTTPIndexServer  - Started server at port 8321
 * 1)  . lsearchd

I'm getting this error saying "No such file or directory". The directory exists; however, I don't know where the "segments" file is supposed to come from.

I ran this to create the indexes

php maintenance/dumpBackup.php --current --quiet > wikidb.xml && java -cp LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s wikidb.xml kck_wiki

The wikidb.xml file exists in the httpdocs directory

...and then I started the daemon

Am I missing a trick?

Thanks

Andy Andy.thomas 19 February 2008


 * And what is the output from the importer? It should give you a success message saying that it created the indexes and successfully made a snapshot. --Rainman 01:30, 20 February 2008 (UTC)

I'm most likely doing something dumb (being a bit of a newbie), but this is what I get when I just run java -cp LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s wikidb.xml kck_wiki

Exception in thread "main" java.lang.NoClassDefFoundError: org/wikimedia/lsearch/importer/Importer

--Andy 17:00, 20 February 2008 (GMT)


 * The java command you're running assumes that LuceneSearch.jar is in your current directory, the full command would be java -cp /full/path/to/LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s wikidb.xml kck_wiki
 * --Rainman 18:04, 20 February 2008 (UTC)

I'm getting further, thanks, that helped. Sorry, I know I'm being dumb and I apologise for asking you to hold my hand like this, but I now get this:

Trying config file at path /root/.lsearch.conf
Trying config file at path /var/www/vhosts/kidneycancerknol.com/httpdocs/lsearch.conf
0   [main] INFO  org.wikimedia.lsearch.util.UnicodeDecomposer  - Loaded unicode decomposer
3   [main] INFO  org.wikimedia.lsearch.util.Localization  - Reading localization for En
60   [main] INFO  org.wikimedia.lsearch.ranks.RankBuilder  - First pass, getting a list of valid articles...
175 [main] FATAL org.wikimedia.lsearch.ranks.RankBuilder  - I/O error reading dump while getting titles from wikidb.xml
175 [main] INFO  org.wikimedia.lsearch.ranks.RankBuilder  - Second pass, calculating article links...
179 [main] FATAL org.wikimedia.lsearch.ranks.RankBuilder  - I/O error reading dump while calculating ranks for from wikidb.xml
Exception in thread "main" java.lang.NullPointerException
    at org.wikimedia.lsearch.importer.Importer.main(Importer.java:114)

Do I need to set the OAI settings in the global config? I've just kept them as the default. --Andy 18:30, 20 February 2008 (GMT)


 * No, you don't need OAI. It seems to me that something is wrong with the XML file. It sure would be helpful if the exceptions weren't suppressed :\ Unfortunately I cannot help you much more than that. Is wikidb.xml a valid XML file? Did you give the full path to it? --Rainman 01:00, 21 February 2008 (UTC)
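
Both questions in the reply above (does the file exist at the path given, and is it well-formed XML?) can be checked outside the importer. The following is only a sketch using the standard JAXP SAX parser; the class name DumpCheck is hypothetical, and it tests well-formedness only, not whether the file is a valid MediaWiki dump.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of lucene-search: checks that a dump file
// can be opened and parsed as well-formed XML.
public class DumpCheck {

    // Returns null if the stream is well-formed XML, otherwise a description
    // of the first problem the parser reported.
    static String check(InputStream in) {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(in, new DefaultHandler());
            return null;
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return e.toString();
        }
    }

    public static void main(String[] args) {
        // Defaults to the file name used in this thread; pass a full path to override.
        File dump = new File(args.length > 0 ? args[0] : "wikidb.xml");
        System.out.println("Checking " + dump.getAbsolutePath());
        try (InputStream in = new FileInputStream(dump)) {
            String problem = check(in);
            System.out.println(problem == null ? "well-formed" : "problem: " + problem);
        } catch (IOException e) {
            // A relative path resolved against the wrong directory ends up here.
            System.out.println("cannot open file: " + e.getMessage());
        }
    }
}
```

Printing the absolute path first makes it obvious when a relative name such as wikidb.xml resolves against a different working directory than expected.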

==Exception in thread "main" java.lang.UnsupportedClassVersionError==
Hi, I use the following configuration:


 * MediaWiki: 1.11.0
 * PHP: 5.2.5 (apache2handler)
 * MySQL: 5.0.51

If I call this:

java -cp /usr/local/search/ls2/ls2-bin/LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s basiswikidb.xml basiswiki

I get the error:

Exception in thread "main" java.lang.UnsupportedClassVersionError: org/wikimedia/lsearch/importer/Importer (Unsupported major.minor version 49.0)
    at java.lang.ClassLoader.defineClass0(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:539)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:123)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:251)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:55)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:194)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:187)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:289)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:274)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:235)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:302)

My Configuration


 * all Files are in /usr/local/search/ls2/
 * MWConfig.global=file:///usr/local/search/ls2/lsearch-global.conf
 * MWConfig.lib=/usr/local/search/ls2/lib
 * Indexes.path=/usr/local/search/indexes
 * Localization.url=file:///opt/lampp/htdocs/basiswiki/languages/messages
 * Logging.logconfig=/usr/local/search/ls2/lsearch.log4j
 * mwdumper.jar => /usr/local/search/ls2/lib
 * lsearch.conf: Storage.lib=/usr/local/search/ls2/sql

lsearch-global.conf

[Database]
wikidev : (single) (language,sr)
wikilucene : (nssplit,3) (nspart1,[0]) (nspart2,[4,5,12,13]), (nspart3,[])
wikilucene : (language,en) (warmup,10)
basiswiki : (single) (language,en) (warmup,10)
[Search-Group]
 : wikilucene wikidev
 : basiswiki
#wikilucene : (single) (language,en) (warmup,0)
# Search groups
# Index parts of a split index are always taken from the node's group
# host : db1.part db2.part
# Mulitple hosts can search multiple dbs (N-N mapping)

Please can you help me?!

85.158.226.1 11:03, 31 March 2008 (UTC)


 * Run java -version. You probably have an old Java; you need to update to 1.5 or later. --Rainman 11:57, 31 March 2008 (UTC)
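
As background to the reply above: "major.minor version 49.0" is the class-file format version of the compiled jar, and major version 49 corresponds to Java 1.5, which is why an older runtime refuses to load it. The sketch below shows the mapping (release = major - 44); the class name ClassVersion is hypothetical and not part of lucene-search.

```java
// Hypothetical helper: translates a class-file major version into the
// minimum Java release able to load it.
public class ClassVersion {

    // Class-file major versions map to Java releases as (major - 44):
    // 48 -> 1.4, 49 -> 1.5, 50 -> 1.6, ... 52 -> 1.8, 53 -> 9, and so on.
    static String requiredJava(int major) {
        if (major < 45) {
            throw new IllegalArgumentException("not a valid class-file major version: " + major);
        }
        int release = major - 44;
        // Releases before 9 were conventionally written as 1.x.
        return release < 9 ? "1." + release : String.valueOf(release);
    }

    public static void main(String[] args) {
        System.out.println("Class-file version 49.0 needs Java " + requiredJava(49) + " or later");
        // What the JVM running this program supports:
        System.out.println("This JVM: java.version=" + System.getProperty("java.version")
                + ", highest class-file version=" + System.getProperty("java.class.version"));
    }
}
```

If the java.class.version printed by the runtime is lower than 49.0, that runtime is too old for LuceneSearch.jar.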

==MediaWiki+Lucene-Search+MWSearch = ZERO search results ??!@#?!==
Can someone please assist me? =)
 * Slackware 12.0, on i686 Pentium III [Linux 2.6.21.5]
 * MediaWiki: 1.9.1
 * PHP: 5.2.5 (apache2handler)
 * MySQL: 5.0.37
 * MediaWiki Extension(s): MWSearch SVN (05122008), and Lucene-search SVN (05122008), + I downloaded & installed mwdumper.jar into the Lucene2 lib dir.
 * other tools: jre-6u2-i586-1, jdk-1_5_0_09-i586-1, apache-ant-1.7.0-i586-1bj, rsync-2.6.9-i486-1

I've followed the steps per Extension:Lucene-search and Extension:MWSearch pages, to the T - I've gone over and over them several times, I've been to MediaWiki Forums, and the MediaWiki-L mailing list ... please help me! =)

My Local LuceneSearch configuration

/etc/lsearch.conf
MWConfig.global=file:///etc/lsearch-global.conf
MWConfig.lib=/usr/local/search/lucene-search-2svn05112008/lib
Indexes.path=/usr/local/search/indexes
Search.updateinterval=1
Search.updatedelay=0
Search.checkinterval=30
Index.snapshotinterval=5
Index.maxqueuecount=5000
Index.maxqueuetimeout=12
Storage.master=localhost
Storage.username=wikiuser
Storage.password=mypass
Storage.useSeparateDBs=false
Storage.defaultDB=wikidb
Storage.lib=/usr/local/search/lucene-search-2svn05112008/sql
Localization.url=file:///var/www/htdocs/wiki/languages/messages
Logging.logconfig=/etc/lsearch.log4j
Logging.debug=true

/etc/lsearch-global.conf
[Database]
wikidb : (single) (language,en) (warmup,10)
[Search-Group]
nen-tftp : wikidb
[Index]
nen-tftp : wikidb
[Index-Path]
 : /usr/local/search/indexes
[OAI]
wiktionary : http://$lang.wiktionary.org/w/index.php
wikilucene : http://localhost/wiki-lucene/phase3/index.php
 : http://$lang.wikipedia.org/w/index.php
[Properties]
Database.suffix=wiki wiktionary wikidb
KeywordScoring.suffix=wikidb wiki wikilucene wikidev
ExactCase.suffix=wikidb wiktionary wikilucene
[Namespace-Prefix]
all :
[0] : 0
[1] : 1
[2] : 2
[3] : 3
[4] : 4
[5] : 5
[6] : 6
[7] : 7
[8] : 8
[9] : 9
[10] : 10
[11] : 11
[12] : 12
[13] : 13
[14] : 14
[15] : 15

/etc/lsearch.log4j
log4j.rootLogger=INFO, A1
log4j.appender.A1=org.apache.log4j.ConsoleAppender
log4j.appender.A1.layout=org.apache.log4j.PatternLayout
log4j.appender.A1.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n
 * LuceneSearch SVN Install dir: /usr/local/search/lucene-search-2svn05112008
 * Indexes stored: /usr/local/search/indexes

relevant /var/www/htdocs/wiki/LocalSettings.php settings
$wgSearchType = 'LuceneSearch';
$wgLuceneHost = 'localhost';
$wgLucenePort = 8123;
require_once("extensions/MWSearch/MWSearch.php");

building the index works

running dumpBackup(Init).php
> php maintenance/dumpBackupInit.php --current --quiet > wikidb.xml && java -cp /usr/local/search/lucene-search-2svn05112008/LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s /var/www/htdocs/wiki/wikidb.xml wikidb
MediaWiki Lucene search indexer - index builder from xml database dumps.

Trying config file at path /root/.lsearch.conf
Trying config file at path /var/www/htdocs/wiki/lsearch.conf
Trying config file at path /etc/lsearch.conf
log4j: Trying to find [log4j.xml] using context classloader sun.misc.Launcher$AppClassLoader@133056f.
log4j: Trying to find [log4j.xml] using sun.misc.Launcher$AppClassLoader@133056f class loader.
log4j: Trying to find [log4j.xml] using ClassLoader.getSystemResource.
log4j: Trying to find [log4j.properties] using context classloader sun.misc.Launcher$AppClassLoader@133056f.
log4j: Trying to find [log4j.properties] using sun.misc.Launcher$AppClassLoader@133056f class loader.
log4j: Trying to find [log4j.properties] using ClassLoader.getSystemResource.
log4j: Could not find resource: [null].
log4j: Parsing for [root] with value=[INFO, A1].
log4j: Level token is [INFO].
log4j: Category root set to INFO
log4j: Parsing appender named "A1".
log4j: Parsing layout options for "A1".
log4j: Setting property [conversionPattern] to [%-4r [%t] %-5p %c %x - %m%n].
log4j: End of parsing for "A1".
log4j: Parsed "A1" options.
log4j: Finished configuring.
0   [main] INFO  org.wikimedia.lsearch.util.UnicodeDecomposer  - Loaded unicode decomposer
18  [main] INFO  org.wikimedia.lsearch.util.Localization  - Reading localization for En
434  [main] INFO  org.wikimedia.lsearch.ranks.RankBuilder  - First pass, getting a list of valid articles...
94 pages (99.576/sec), 94 revs (99.576/sec)
1527 [main] INFO org.wikimedia.lsearch.ranks.RankBuilder  - Second pass, calculating article links...
94 pages (326.389/sec), 94 revs (326.389/sec)
1928 [main] INFO org.wikimedia.lsearch.importer.Importer  - Third pass, indexing articles...
94 pages (24.588/sec), 94 revs (24.588/sec)
6005 [main] INFO org.wikimedia.lsearch.importer.Importer  - Closing/optimizing index...
Finished indexing in 5s, with final index optimization in 0s
Total time: 6s
6530 [main] INFO org.wikimedia.lsearch.index.IndexThread  - Making snapshot for wikidb
6582 [main] INFO org.wikimedia.lsearch.index.IndexThread  - Made snapshot /usr/local/search/indexes/snapshot/wikidb/20080512024654

That creates a 277KB file @ /var/www/htdocs/wiki/wikidb.xml, which looks just fine to me...

Starting the lsearch daemon is working When I run my script /usr/local/search/lucene-search-2svn05112008/lsearchd - which starts the lsearch deamon, I get the following, which ALSO looks fine ; java -Djava.rmi.server.codebase=file:///usr/local/search/lucene-search-2svn05112008/LuceneSeah.jar -Djava.rmi.server.hostname=nen-tftp -jar /usr/local/search/lucene-search-2svn05112008/LuceneSearch.jar $* RMI registry started. Trying config file at path /root/.lsearch.conf Trying config file at path /usr/local/search/lucene-search-2svn05112008/lsearch.conf log4j: Parsing for [root] with value=[INFO, A1]. log4j: Level token is [INFO]. log4j: Category root set to INFO log4j: Parsing appender named "A1". log4j: Parsing layout options for "A1". log4j: Setting property [conversionPattern] to [%-4r [%t] %-5p %c %x - %m%n]. log4j: End of parsing for "A1". log4j: Parsed "A1" options. log4j: Finished configuring. 0   [main] INFO  org.wikimedia.lsearch.util.Localization  - Reading localization for En 2351 [main] INFO  org.wikimedia.lsearch.util.UnicodeDecomposer  - Loaded unicode decomposer 2600 [main] INFO org.wikimedia.lsearch.interoperability.RMIServer  - RMIMessenger bound 2882 [main] INFO org.wikimedia.lsearch.interoperability.RMIServer  - RemoteSearchable bound 2914 [main] INFO org.wikimedia.lsearch.search.Warmup  - Warming up index wikidb ... 2928 [Thread-2] INFO org.wikimedia.lsearch.frontend.HTTPIndexServer  - Started server at port 8321 2929 [Thread-3] INFO org.wikimedia.lsearch.frontend.SearchServer  - Binding server to port 8123 4246 [main] INFO org.wikimedia.lsearch.search.Warmup  - Warmed up wikidb in 1331 ms 4246 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index wikidb ... 5079 [main] INFO org.wikimedia.lsearch.search.Warmup  - Warmed up wikidb in 833 ms 5079 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index wikidb ... 
 5861 [main] INFO org.wikimedia.lsearch.search.Warmup  - Warmed up wikidb in 782 ms

From here, I pull up my normal wiki, which has been working fine ALL along, but now I get ZERO search results, no matter what I do! I know I am searching correctly: I just type in a single word that I know is on several pages in the wiki. I've even tried to edit the file before and after building the index, and starting/stopping the lsearch daemon, yet I get this on my MediaWiki search results page:

Search results From AgentDcooper's Wiki

You searched for wiki

For more information about searching AgentDcooper's Wiki, see Searching AgentDcooper's Wiki.

Showing below 0 results starting with #1. No page text matches

Note: Unsuccessful searches are often caused by searching for common words like "have" and "from", which are not indexed, or by specifying more than one search term (only pages containing all of the search terms will appear in the result).

I notice that the lsearch daemon console output scrolls the following right after I do a search within the wiki:

 293744 [pool-2-thread-1] INFO org.wikimedia.lsearch.frontend.HttpHandler  - query:/search/wikidb/wiki?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15&offset=0&limit=100&version=2&iwlimit=10 what:search dbname:wikidb term:wiki
 293759 [pool-2-thread-1] INFO org.wikimedia.lsearch.search.SearchEngine  - Using NamespaceFilterWrapper wrap: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
 293786 [pool-2-thread-1] INFO org.wikimedia.lsearch.search.SearchEngine  - search wikidb: query=[wiki] parsed=[contents:wiki (title:wiki^6.0 stemtitle:wiki^2.0) (alttitle1:wiki^4.0 alttitle2:wiki^4.0 alttitle3:wiki^4.0) (keyword1:wiki^0.02 keyword2:wiki^0.01 keyword3:wiki^0.0066666664 keyword4:wiki^0.0050 keyword5:wiki^0.0039999997)] hit=[27] in 16ms using IndexSearcherMul:1210585609666

With MediaWiki debugging enabled, my /var/log/mediawiki/debug_log.txt shows this:

 Start request
 GET /wiki/index.php/Special:Search?search=wiki&fulltext=Search
 Host: nen-tftp.techiekb.com
 User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
 Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
 Accept-Language: en-us,en;q=0.5
 Accept-Encoding: gzip,deflate
 Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
 Keep-Alive: 300
 Connection: keep-alive
 Referer: http://nen-tftp.techiekb.com/wiki/index.php/Special:Version
 Cookie: wikidb_session=3jptdli2pf3nkuq924tq1ihlt0
 Authorization: Basic ZGNvb3Blcjp0ZXN0cGFzcw==

 Main cache: FakeMemCachedClient
 Message cache: MediaWikiBagOStuff
 Parser cache: MediaWikiBagOStuff
 Unstubbing $wgParser on call of $wgParser->setHook from require_once
 Fully initialised
 Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
 Language::loadLocalisation: got localisation for en from source
 Unstubbing $wgUser on call of $wgUser->isAllowed from Title::userCanRead
 Cache miss for user 2
 Unstubbing $wgLoadBalancer on call of $wgLoadBalancer->getConnection from wfGetDB
 Logged in from session
 Unstubbing $wgMessageCache on call of $wgMessageCache->getTransform from wfMsgGetKey
 Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
 MessageCache::load: got from global cache
 Unstubbing $wgOut on call of $wgOut->setPageTitle from SpecialSearch::setupPage
 Fetching search data from http://localhost:8123/search/wikidb/wiki?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15&offset=0&limit=100&version=2&iwlimit=10
 total [0] hits
 OutputPage::sendCacheControl: private caching; **
 Request ended normally

Now get this: if I go to the link from the debug log above, http://localhost:8123/search/wikidb/wiki?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15&offset=0&limit=100&version=2&iwlimit=10, I get this page:

 3
 1.0 0 Main_Page
 0.9577699303627014 0 EFFICIENT%2FCISCO%2FNETSCREEN%2FNETOPIA_Router_Command_Matrix
 0.7121278643608093 0 DBU_-_DialBackUp

Which leads me to my question: what am I doing wrong?? I have tried everything I can think of, and I just cannot get search within my MediaWiki to work properly. The search itself seems to work when I go to the link directly, yet the "total hits" in the log and in the wiki both show ZERO. Manually going to the link from the debug log shows me what appears to be a result indicating that 3 PAGES were found, with the corresponding result data. Why is MediaWiki not showing this?
Any help would be greatly appreciated, or even a link for reference! -peace- --Agentdcooper 12 May 2008
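For anyone comparing the daemon's raw reply with what MediaWiki reports, the plain-text response quoted in the post can be read mechanically. The following is a minimal Java sketch, assuming the reply carries the hit count on its first line and then one result per line as score, namespace number, and URL-encoded title (the layout the pasted output suggests). The class and method names are hypothetical and are not part of lucene-search or MWSearch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.List;

final class SearchResponseParser {
    /** One result line: relevance score, namespace number, decoded title. */
    static final class Hit {
        final double score;
        final int namespace;
        final String title;

        Hit(double score, int namespace, String title) {
            this.score = score;
            this.namespace = namespace;
            this.title = title;
        }
    }

    /** The whole reply: the count from the first line plus the parsed result lines. */
    static final class Response {
        final int totalHits;
        final List<Hit> hits;

        Response(int totalHits, List<Hit> hits) {
            this.totalHits = totalHits;
            this.hits = hits;
        }
    }

    /** Parses a reply body; throws NumberFormatException if the first line is not a number. */
    static Response parse(String body) {
        String[] lines = body.trim().split("\\r?\\n");
        int total = Integer.parseInt(lines[0].trim());
        List<Hit> hits = new ArrayList<>();
        for (int i = 1; i < lines.length; i++) {
            String[] parts = lines[i].trim().split("\\s+", 3);
            if (parts.length < 3) {
                continue; // skip blank or unrecognized lines
            }
            hits.add(new Hit(Double.parseDouble(parts[0]),
                             Integer.parseInt(parts[1]),
                             decode(parts[2])));
        }
        return new Response(total, hits);
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }
}
```

Feeding it the reply quoted above gives a total of 3 and three hits, which is one way to confirm that the daemon itself answered with results and that the zero count appears only after the reply leaves the daemon.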


 * I would suspect the problem is the MW version. The search front-end has been heavily refactored in MediaWiki 1.13, and MWSearch is designed to run with the latest MediaWiki, so there might be some compatibility issues. Note that MW 1.13 has not been released yet and is still in development. Try using Extension:LuceneSearch instead. --Rainman 13:20, 12 May 2008 (UTC)


 * Thanks a TON, I will try this out in just a few, I half suspected it was a MediaWiki versioning issue, I really need to upgrade! =) --Agentdcooper 20:16, 12 May 2008 (UTC)


 * I moved to LuceneSearch and am now getting a strange error. I removed the MWSearch extension entirely, then downloaded Extension:LuceneSearch from SVN today, moved the LuceneSearch directory to /var/www/htdocs/wiki, and chmod'd it to 755 recursively to make sure it isn't a permissions issue. Then I commented out the MWSearch code in LocalSettings.php:

 
 #$wgSearchType = 'LuceneSearch';
 #$wgLuceneHost = 'localhost';
 #$wgLucenePort = 8123;
 #require_once("extensions/MWSearch/MWSearch.php");
 * I've tried different settings for Extension:LuceneSearch, but ended up with this config for LuceneSearch ;

 $wgDisableInternalSearch = true;
 $wgDisableSearchUpdate = true;
 $wgSearchType = 'LuceneSearch';
 $wgLuceneHost = 'localhost';
 $wgLucenePort = 8123;
 require_once("extensions/LuceneSearch/LuceneSearch.php");
 $wgLuceneSearchVersion = 2;
 $wgLuceneDisableSuggestions = true;
 $wgLuceneDisableTitleMatches = true;

I then ran the indexer, which seemed to go great:

 > php maintenance/dumpBackupInit.php --current --quiet > wikidb.xml && java -cp /usr/local/search/lucene-search-2svn05112008/LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s /var/www/htdocs/wiki/wikidb.xml wikidb
 MediaWiki Lucene search indexer - index builder from xml database dumps.

Trying config file at path /root/.lsearch.conf Trying config file at path /var/www/htdocs/wiki/lsearch.conf Trying config file at path /etc/lsearch.conf log4j: Trying to find [log4j.xml] using context classloader sun.misc.Launcher$AppClassLoader@133056f. log4j: Trying to find [log4j.xml] using sun.misc.Launcher$AppClassLoader@133056f class loader. log4j: Trying to find [log4j.xml] using ClassLoader.getSystemResource. log4j: Trying to find [log4j.properties] using context classloader sun.misc.Launcher$AppClassLoader@133056f. log4j: Trying to find [log4j.properties] using sun.misc.Launcher$AppClassLoader@133056f class loader. log4j: Trying to find [log4j.properties] using ClassLoader.getSystemResource. log4j: Could not find resource: [null]. log4j: Parsing for [root] with value=[INFO, A1]. log4j: Level token is [INFO]. log4j: Category root set to INFO log4j: Parsing appender named "A1". log4j: Parsing layout options for "A1". log4j: Setting property [conversionPattern] to [%-4r [%t] %-5p %c %x - %m%n]. log4j: End of parsing for "A1". log4j: Parsed "A1" options. log4j: Finished configuring. 0   [main] INFO  org.wikimedia.lsearch.util.UnicodeDecomposer  - Loaded unicode decomposer 17  [main] INFO  org.wikimedia.lsearch.util.Localization  - Reading localization for En 432  [main] INFO  org.wikimedia.lsearch.ranks.RankBuilder  - First pass, getting a list of valid articles... 94 pages (98.739/sec), 94 revs (98.739/sec) 1532 [main] INFO org.wikimedia.lsearch.ranks.RankBuilder  - Second pass, calculating article links... 94 pages (325.26/sec), 94 revs (325.26/sec) 1934 [main] INFO org.wikimedia.lsearch.importer.Importer  - Third pass, indexing articles... 94 pages (24.691/sec), 94 revs (24.691/sec) 5996 [main] INFO org.wikimedia.lsearch.importer.Importer  - Closing/optimizing index... 
Finished indexing in 5s, with final index optimization in 0s Total time: 6s 6515 [main] INFO org.wikimedia.lsearch.index.IndexThread  - Making snapshot for wikidb 6566 [main] INFO org.wikimedia.lsearch.index.IndexThread  - Made snapshot /usr/local/search/indexes/snapshot/wikidb/20080512134828  And then, started lsearch daemon via console ;  > java -Djava.rmi.server.codebase=file:///usr/local/search/lucene-search-2svn05112008/LuceneSeach.jar -Djava.rmi.server.hostname=nen-tftp -jar /usr/local/search/lucene-search-2svn05112008/LuceneSearch.jar $* RMI registry started. Trying config file at path /root/.lsearch.conf Trying config file at path /root/lsearch.conf Trying config file at path /etc/lsearch.conf log4j: Parsing for [root] with value=[INFO, A1]. log4j: Level token is [INFO]. log4j: Category root set to INFO log4j: Parsing appender named "A1". log4j: Parsing layout options for "A1". log4j: Setting property [conversionPattern] to [%-4r [%t] %-5p %c %x - %m%n]. log4j: End of parsing for "A1". log4j: Parsed "A1" options. log4j: Finished configuring. 0   [main] INFO  org.wikimedia.lsearch.util.Localization  - Reading localization for En 2353 [main] INFO  org.wikimedia.lsearch.util.UnicodeDecomposer  - Loaded unicode decomposer 2603 [main] INFO org.wikimedia.lsearch.interoperability.RMIServer  - RMIMessenger bound 2885 [main] INFO org.wikimedia.lsearch.interoperability.RMIServer  - RemoteSearchable bound 2929 [Thread-2] INFO org.wikimedia.lsearch.frontend.HTTPIndexServer  - Started server at port 8321 2930 [Thread-3] INFO org.wikimedia.lsearch.frontend.SearchServer  - Binding server to port 8123 2935 [main] INFO org.wikimedia.lsearch.search.Warmup  - Warming up index wikidb ... 4265 [main] INFO org.wikimedia.lsearch.search.Warmup  - Warmed up wikidb in 1329 ms 4266 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index wikidb ... 
 5110 [main] INFO org.wikimedia.lsearch.search.Warmup  - Warmed up wikidb in 844 ms
 5110 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index wikidb ...
 5922 [main] INFO org.wikimedia.lsearch.search.Warmup  - Warmed up wikidb in 811 ms

My MediaWiki's Special:Version page shows that LuceneSearch (version 2.0) is installed properly. Yet when I do any type of search in my MediaWiki, the page comes up displaying the following error:

 Fatal error: Call to undefined function wfLoadExtensionMessages in /var/www/htdocs/wiki/extensions/LuceneSearch/LuceneSearch_body.php on line 85

The lsearch daemon console output shows nothing new since I started it! That to me indicates that the search isn't being passed to the lsearch daemon. In reviewing the debug log at /var/log/mediawiki/debug_log.txt, I'm seeing this:

 Start request
 GET /wiki/index.php/Special:Search?search=wiki&fulltext=Search
 Host: nen-tftp.techiekb.com
 User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
 Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
 Accept-Language: en-us,en;q=0.5
 Accept-Encoding: gzip,deflate
 Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
 Keep-Alive: 300
 Connection: keep-alive
 Referer: http://nen-tftp.techiekb.com/wiki/index.php/Main_Page
 Cookie: wikidbUserName=Rprior; wikidb_session=buvigq1obd1nd5ulbk1l8d83s7; wikidbUserID=2; wikidbToken=dd6c9b732dba0c94b04ad72044d46d79
 Authorization: Basic ZGNvb3Blcjp0ZXN0cGFzcw==

 Main cache: FakeMemCachedClient
 Message cache: MediaWikiBagOStuff
 Parser cache: MediaWikiBagOStuff
 Unstubbing $wgParser on call of $wgParser->setHook from require_once
 Fully initialised
 Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
 Language::loadLocalisation: got localisation for en from source
 Unstubbing $wgUser on call of $wgUser->isAllowed from Title::userCanRead
 Cache miss for user 2
 Unstubbing $wgLoadBalancer on call of $wgLoadBalancer->getConnection from wfGetDB
 Logged in from session

Just as an added note, the file /var/www/htdocs/wiki/extensions/LuceneSearch/LuceneSearch_body.php includes the following on lines 85 through 89:

 wfLoadExtensionMessages( 'LuceneSearch' );
 $fname = 'LuceneSearch::execute';
 wfProfileIn( $fname );
 $this->setHeaders;
 $wgOut->addHTML('getNamespace . '- ->');

Any chance you have an idea on how to fix this issue? =) I am thinking I may just have to update to MediaWiki SVN and try MWSearch if I cannot get this going on my current MediaWiki install, yet I'd LOVE to fix this if possible. Please help me! =) --Agentdcooper 21:12, 12 May 2008 (UTC)

2008-05-12 :: Installed Mediawiki SVN + Lucene-Search SVN & MWSearch SVN, still getting ZERO search results
I flat-out installed everything fresh: a new MediaWiki SVN checkout, Lucene-search SVN, and MWSearch SVN (version r34306). All are Subversion downloads from 05.12.2008, except lucene-search-2, which is from 05.11.2008.
 * Base-system: is Slackware 12.0, on i686 Pentium III [Linux 2.6.21.5]
 * Mediawiki 1.13alpha (r34693)
 * PHP: 5.2.5
 * MySQL: 5.0.37
 * packages: jre-6u2-i586-1, jdk-1_5_0_09-i586-1, apache-ant-1.7.0-i586-1bj, rsync-2.6.9-i486-1
 * mwdumper.jar is installed in the /usr/local/search/lucene-search-2svn05112008/lib directory.
 * ExtensionFunctions.php installed @ /var/www/htdocs/wiki-test/extensions
 * Special:Version shows MWSearch (Version r34306) is installed properly...

My config files

/etc/lsearch.conf

 MWConfig.global=file:///etc/lsearch-global.conf
 MWConfig.lib=/usr/local/search/lucene-search-2svn05112008/lib
 Indexes.path=/usr/local/search/indexes
 Search.updateinterval=1
 Search.updatedelay=0
 Search.checkinterval=30
 Index.snapshotinterval=5
 Index.maxqueuecount=5000
 Index.maxqueuetimeout=12
 Storage.master=localhost
 Storage.username=newwikiuser
 Storage.password=testpass
 Storage.useSeparateDBs=false
 Storage.defaultDB=wikidbnew
 Storage.lib=/usr/local/search/lucene-search-2svn05112008/sql
 SearcherPool.size=3
 Localization.url=file:///var/www/htdocs/wiki-test/languages/messages
 Logging.logconfig=/etc/lsearch.log4j
 Logging.debug=true

/etc/lsearch-global.conf

 [Database]
 wikidbnew : (single) (language,en) (warmup,10)
 [Index]
 nen-tftp : wikidbnew
 [Index-Path]
 <default> : /usr/local/search/indexes
 [OAI]
 wiktionary : http://$lang.wiktionary.org/w/index.php
 wikilucene : http://localhost/wiki-lucene/phase3/index.php
 <default> : http://$lang.wikipedia.org/w/index.php
 [Properties]
 Database.suffix=wiki wiktionary wikidbnew
 KeywordScoring.suffix=wikidbnew wiki wikilucene wikidev
 ExactCase.suffix=wikidbnew wiktionary wikilucene
 [Namespace-Prefix]
 all : <all>
 [0] : 0
 [1] : 1
 [2] : 2
 [3] : 3
 [4] : 4
 [5] : 5
 [6] : 6
 [7] : 7
 [8] : 8
 [9] : 9
 [10] : 10
 [11] : 11
 [12] : 12
 [13] : 13
 [14] : 14
 [15] : 15

/etc/lsearch.log4j

 log4j.rootLogger=INFO, A1
 log4j.appender.A1=org.apache.log4j.ConsoleAppender
 log4j.appender.A1.layout=org.apache.log4j.PatternLayout
 log4j.appender.A1.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n

command-line for indexing my wiki (now in a script called /var/www/htdocs/wiki-test/dumpBackup.sh)

 php maintenance/dumpBackupInit.php --current --quiet > wikidbnew.xml && java -cp /usr/local/search/lucene-search-2svn05112008/LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s /var/www/htdocs/wiki-test/wikidbnew.xml wikidbnew

command-line to start lsearch daemon (now in a script called /usr/local/search/lucene-search-2svn05112008/lsearchd)

 java -Djava.rmi.server.codebase=file:///usr/local/search/lucene-search-2svn05112008/LuceneSeach.jar -Djava.rmi.server.hostname=nen-tftp -jar /usr/local/search/lucene-search-2svn05112008/LuceneSearch.jar $*

PHP Version 5.2.5 was configured with a command line that enabled curl
 * switches used '--with-curl=shared' '--with-curlwrappers'
 * cURL support = enabled
 * cURL Information = libcurl/7.16.2 OpenSSL/0.9.8e zlib/1.2.3 libidn/0.6.10


 * the mySQL DB wikidbnew does show a table called searchindex sized 20.5 KiB, which appears to be populated correctly with search info from my wikidb.

config/install of new mediawiki SVN

I ran through the basic config/install of MediaWiki and put some data into the basic wiki, something I knew would be easy to search for. I built the index, and it seems to build without error; everything just works. But when I issue a search from the main wiki page, I get ZERO search results, even though MediaWiki's original search DID find these terms when it was just a basic MediaWiki install, before I installed the Lucene-Search and MWSearch extensions.

mediawiki search results = ZERO

What seems strange here is that everything seems to work, up to the point of searching through my wiki! When I search in the wiki, I get the following ZERO results message:

 No page text matches

 Note: Only some namespaces are searched by default. Try prefixing your query with all: to search all content (including talk pages, templates, etc), or use the desired namespace as prefix.

mediawiki debug file

When I look at the MediaWiki debug file, /var/log/mediawiki/debug_mediawiki-wiki-test_log.txt, it shows the following when a search is submitted for 'wiki' (which exists in multiple locations on the main page of the wiki):

 Start request
 GET /wiki-test/index.php?title=Special%3ASearch&search=wiki&ns0=1&ns1=1&ns2=1&ns3=1&ns4=1&ns5=1&ns6=1&ns7=1&ns8=1&ns9=1&ns10=1&ns11=1&ns12=1&ns13=1&ns14=1&ns15=1&fulltext=Search
 Host: nen-tftp.techiekb.com
 User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
 Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
 Accept-Language: en-us,en;q=0.5
 Accept-Encoding: gzip,deflate
 Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
 Keep-Alive: 300
 Connection: keep-alive
 Referer: http://nen-tftp.techiekb.com/wiki-test/index.php/Special:Search?search=wiki&fulltext=Search
 Cookie: wikidbUserName=Rprior; wikidb_session=buvigq1obd1nd5ulbk1l8d83s7; wikidbUserID=2; wikidbToken=dd6c9b732dba0c94b04ad72044d46d79; wikidbnew_session=gvchrcs1cf12uvdukl1odpapk7; wikidbnewUserID=1; wikidbnewUserName=Rprior; wikidbnewToken=ef9b27fc68ffacb8c7362b31ea27e292
 Authorization: Basic ZGNvb3Blcjp0ZXN0cGFzcw==

Main cache: FakeMemCachedClient Message cache: MediaWikiBagOStuff Parser cache: MediaWikiBagOStuff session_set_cookie_params: "0", "/", "", "", "1" Fully initialised Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal Language::loadLocalisation: got localisation for en from source Unstubbing $wgOut on call of $wgOut->setArticleRelated from SpecialPage::setHeaders Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get Unstubbing $wgUser on call of $wgUser->getOption from StubUserLang::_newObject Cache miss for user 1 Connecting to localhost wikidbnew... Connected Logged in from session MessageCache::load: got from global cache Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform Preprocessor_Hash::preprocessToObj $1 - Preprocessor_Hash::preprocessToObj $1 - Preprocessor_Hash::preprocessToObj You searched for wiki Preprocessor_Hash::preprocessToObj For more information about searching, see |. Preprocessor_Hash::preprocessToObj Help:Contents Fetching search data from http://nen-tftp.techiekb.com:8123/search/wikidbnew/wiki?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15&offset=0&limit=100&version=2&iwlimit=10 Http::request: GET http://nen-tftp.techiekb.com:8123/search/wikidbnew/wiki?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15&offset=0&limit=100&version=2&iwlimit=10 total [0] hits Preprocessor_Hash::preprocessToObj

No page text matches
Preprocessor_Hash::preprocessToObj Note: Only some namespaces are searched by default. Try prefixing your query with all: to search all content (including talk pages, templates, etc), or use the desired namespace as prefix. Preprocessor_Hash::preprocessToObj Search in namespaces: $1

Preprocessor_Hash::preprocessToObj

Preprocessor_Hash::preprocessToObj

 Preprocessor_Hash::preprocessToObj Search for $1 $2 Preprocessor_Hash::preprocessToObj Preprocessor_Hash::preprocessToObj About Preprocessor_Hash::preprocessToObj About Preprocessor_Hash::preprocessToObj From Preprocessor_Hash::preprocessToObj Search
 OutputPage::sendCacheControl: private caching; **
 Request ended normally

pointing a browser at the link in debug file

Here's the deal though: if I go to the link from the debug log through lynx or a browser, "http://nen-tftp.techiekb.com:8123/search/wikidbnew/wiki?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15&offsett=0&limit=100&version=2&iwlimit=10", I get this output:

 1
 1.0 0 Main_Page

HELP :: where am I going wrong??

Mediawiki gives me no results, and the debug log file above, shows a total [0] hits, why am I getting zero hits? no matter what I do, I am getting zero hits!? can you see anything wrong I am doing here? please help =) --Agentdcooper 00:41, 13 May 2008 (UTC)
 * just to note: if I grep the file /var/www/htdocs/wiki-test/wikidbnew.xml for the same word I am searching for, I get MANY hits!? --Agentdcooper 00:51, 13 May 2008 (UTC)
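The request that MWSearch logs as "Fetching search data from ..." can be rebuilt by hand to test the daemon on its own. Below is a rough Java sketch that assumes only the URL layout visible in the debug log above: database name and search term in the path, and the namespaces passed as a comma-separated, percent-encoded list. The class name and parameters are illustrative, and the fixed `version=2&iwlimit=10` suffix is simply copied from the logged URL.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class SearchUrlBuilder {
    /** Builds a search URL in the shape seen in the MediaWiki debug log. */
    static String build(String host, int port, String dbname, String term,
                        int[] namespaces, int offset, int limit) {
        StringBuilder ns = new StringBuilder();
        for (int i = 0; i < namespaces.length; i++) {
            if (i > 0) {
                ns.append(',');
            }
            ns.append(namespaces[i]);
        }
        // The comma becomes %2C once encoded, matching "namespaces=0%2C1%2C2..."
        return "http://" + host + ":" + port + "/search/" + dbname + "/" + encode(term)
                + "?namespaces=" + encode(ns.toString())
                + "&offset=" + offset
                + "&limit=" + limit
                + "&version=2&iwlimit=10";
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        System.out.println(build("localhost", 8123, "wikidbnew", "wiki",
                new int[] {0, 1, 2}, 0, 100));
    }
}
```

Note that `URLEncoder` applies form encoding (a space becomes `+`), so this sketch is only meant for single-word test terms like the one used in this thread.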


 * OK then, try adding wfDebug($data); somewhere around line 564 in MWSearch.php. This should print to the MediaWiki debug log the same data you're seeing when you directly access the search URL. If it doesn't print anything, then something is wrong with your curl. --Rainman 09:06, 13 May 2008 (UTC)


 * Well, I think you are on to something there! so here's the deal, I put wfDebug($data); on line #565, by itself. I then re-ran the index command, and restarted the lsearch daemon so I could watch the console output via SSH session .... I loaded up the main wiki page, and did a basic search for the word "wiki" here's what happens ;;


 * After pushing the search button within the wiki, it takes me to a blank page. My browser's address bar shows "http://<mydomain.com>/wiki-test/index.php/Special:Search?search=wiki&fulltext=Search", yet the page is completely blank. Watching the console output from the lsearch daemon, it shows the following:

 629776 [pool-1-thread-5] INFO org.wikimedia.lsearch.frontend.HttpHandler  - query:/search/wikidbnew/wiki?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15&offset=0&limit=100&version=2&iwlimit=10 what:search dbname:wikidbnew term:wiki
 629780 [pool-1-thread-5] INFO org.wikimedia.lsearch.search.SearchEngine  - Using NamespaceFilterWrapper wrap: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
 629784 [pool-1-thread-5] INFO org.wikimedia.lsearch.search.SearchEngine  - search wikidbnew: query=[wiki] parsed=[contents:wiki (title:wiki^6.0 stemtitle:wiki^2.0) (alttitle1:wiki^4.0 alttitle2:wiki^4.0 alttitle3:wiki^4.0) (keyword1:wiki^0.02 keyword2:wiki^0.01 keyword3:wiki^0.0066666664 keyword4:wiki^0.0050 keyword5:wiki^0.0039999997)] hit=[1] in 5ms using IndexSearcherMul:1210691193858
 * my debug log @ /var/log/mediawiki/debug_mediawiki-wiki-test_log.txt scrolls the following by, right when I do that "wiki" search ;;

 Start request
 GET /wiki-test/index.php/Special:Search?search=wiki&fulltext=Search
 Host: nen-tftp.techiekb.com
 User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.14) Gecko/20080404 Firefox/2.0.0.14
 Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
 Accept-Language: en-us,en;q=0.5
 Accept-Encoding: gzip,deflate
 Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
 Keep-Alive: 300
 Connection: keep-alive
 Referer: http://nen-tftp.techiekb.com/wiki-test/index.php/Main_Page
 Cookie: wikidbUserName=Rprior; wikidb_session=buvigq1obd1nd5ulbk1l8d83s7; wikidbUserID=2; wikidbToken=dd6c9b732dba0c94b04ad72044d46d79; wikidbnew_session=gvchrcs1cf12uvdukl1odpapk7; wikidbnewUserID=1; wikidbnewUserName=Rprior; wikidbnewToken=ef9b27fc68ffacb8c7362b31ea27e292
 Authorization: Basic ZGNvb3Blcjp0ZXN0cGFzcw==

 Main cache: FakeMemCachedClient
 Message cache: MediaWikiBagOStuff
 Parser cache: MediaWikiBagOStuff
 session_set_cookie_params: "0", "/", "", "", "1"
 Fully initialised
 Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
 Language::loadLocalisation: got localisation for en from source
 Unstubbing $wgOut on call of $wgOut->setArticleRelated from SpecialPage::setHeaders
 Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey
 Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
 Unstubbing $wgUser on call of $wgUser->getOption from StubUserLang::_newObject
 Cache miss for user 1
 Connecting to localhost wikidbnew...
 Connected
 Logged in from session
 MessageCache::load: got from global cache
 Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform
 Preprocessor_Hash::preprocessToObj $1 -
 Preprocessor_Hash::preprocessToObj $1 -
 Preprocessor_Hash::preprocessToObj You searched for wiki
 Preprocessor_Hash::preprocessToObj For more information about searching, see |.
 Preprocessor_Hash::preprocessToObj Help:Contents
 Fetching search data from http://nen-tftp.techiekb.com:8123/search/wikidbnew/wiki?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15&offset=0&limit=100&version=2&iwlimit=10
 Http::request: GET http://nen-tftp.techiekb.com:8123/search/wikidbnew/wiki?namespaces=0%2C1%2C2%2C3%2C4%2C5%2C6%2C7%2C8%2C9%2C10%2C11%2C12%2C13%2C14%2C15&offset=0&limit=100&version=2&iwlimit=10
 * If I goto that link at the bottom of the debug log, the following is displayed in my browser;;

 1
 1.0 0 Main_Page
 * so, what are you thinking boss, is it my CURL install? if that's the case, a new slackware v12.1 just came out, and it appears they updated apache to v2.2.8, PHP to v5.2.6, yet slack v12.1 still is using curl v7.16.2 package, which is the same version I'm running now, but it has been repackaged ... hmmmm ... what do you think rainman?? BTW, thanks a million for your assistance! I really cant wait to get this lucene search functionality working for my mediawiki project! --Agentdcooper 15:33, 13 May 2008 (UTC)


 * Any ideas, anyone? I am stuck... please help. --Agentdcooper 03:38, 15 May 2008 (UTC)


 * I am going to install Slackware v12.1 as a FRESH install on a new computer and try this all over again, to see whether it may be something I messed up along the way. I will report back with my results... In case someone ends up reading the above and can make a suggestion, I'm all ears! I will be keeping the Slackware 12.0 install separate, and would love to hear from someone on how I might go about fixing it! -peace- --Agentdcooper 20:52, 15 May 2008 (UTC)

currently, i'm updating to newer OS, but is that necessary, REALLY?

I am downloading the Slackware 12.1 ISOs right now, but it just bewilders me why I would need the latest and greatest OS to run MediaWiki. As I understood it, MediaWiki can run on all sorts of Linux-based OSes/distributions and doesn't necessarily need the best hardware to run on... I've detailed my problems heavily above, and I am hoping someone can help me before my new, rather large 2.0 GB OS download completes (it'll take a couple of days, due to my slow net connection right now). I'd really like to fix what's broken before updating my entire OS. Thanks for all the help so far! --Agentdcooper 03:24, 19 May 2008 (UTC)

Lucene-search wrecks Special:ListUsers
When using Lucene-search version 2.0.2 (the current version as of this date) under MediaWiki 1.10.x, I found that the special page Special:ListUsers stays blank. Turning on error reporting revealed a fatal error:

 Fatal error: Class 'ApiQueryGeneratorBase' not found in /srv/www/htdocs/mediawiki/extensions/LuceneSearch/ApiQueryLuceneSearch.php on line 33

I found that this can be solved by adding the line

 require_once($IP.'/includes/api/ApiQueryBase.php');

to the file LuceneSearch_body.php (right below the require statement that is already there).

Lexw 12:38, 17 July 2008 (UTC)

Exception resolution
If you have an error such as

 Exception in thread "main" java.lang.NullPointerException
         at java.io.File.<init>(Unknown Source)
         at org.apache.lucene.store.FSDirectory.getDirectory(FSDirectory.java:117)
         at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:204)
         at org.wikimedia.lsearch.importer.SimpleIndexWriter.openIndex(SimpleIndexWriter.java:67)
         at org.wikimedia.lsearch.importer.SimpleIndexWriter.<init>(SimpleIndexWriter.java:49)
         at org.wikimedia.lsearch.importer.DumpImporter.<init>(DumpImporter.java:39)
         at org.wikimedia.lsearch.importer.Importer.main(Importer.java:128)

when running the index creation, it can be because your host name changed (check $HOSTNAME on the command line). In that case, update the host name in lsearch-global.conf.

Darkoneko m'écrire 13:22, 23 July 2008 (UTC)
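To check for this kind of mismatch, it may help to compare the host name Java reports with the hosts listed in the [Index] section of lsearch-global.conf. The sketch below is only an illustration: it assumes the [Index] section uses "host : dbname" lines, as in the configs quoted elsewhere on this page, and it assumes (not confirmed from the lucene-search source) that the importer's notion of the local host matches `InetAddress.getLocalHost()`. The class and method names are made up.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

final class IndexHostCheck {
    /** Collects the host names on the left of "host : dbname" lines in the [Index] section. */
    static List<String> indexHosts(String confText) {
        List<String> hosts = new ArrayList<>();
        boolean inIndex = false;
        for (String raw : confText.split("\\r?\\n")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip blanks and comments
            }
            if (line.startsWith("[")) {
                // Any bracketed line is treated as a section heading here
                inIndex = line.equals("[Index]");
                continue;
            }
            if (inIndex) {
                int colon = line.indexOf(':');
                if (colon > 0) {
                    hosts.add(line.substring(0, colon).trim());
                }
            }
        }
        return hosts;
    }

    public static void main(String[] args) throws UnknownHostException {
        String local = InetAddress.getLocalHost().getHostName();
        // Sample text in the shape of the configs quoted on this page
        String conf = "[Index]\nnen-tftp : wikidbnew\n";
        System.out.println("Local host name: " + local);
        System.out.println("Listed in [Index]: " + indexHosts(conf).contains(local));
    }
}
```

If the second line prints `false` for your real config text, the host name in lsearch-global.conf is a likely candidate for the update described above.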

LuceneSearch is not available anymore?
LuceneSearch extension was developed for MediaWiki version 1.12 which IS the current version. But the box on the top of the page says it is not to be used with the current version, and the extension is not available in SVN any more. WHY is that? Am I missing something? Oduvan 13:37, 7 August 2008 (UTC)
 * Seems like someone moved around some extensions. I've updated the link on Extension:LuceneSearch to point to the right location. --Rainman 19:49, 7 August 2008 (UTC)

Running multiple lsearch daemons
Hi, I am setting up a server which hosts several wikis. We want to use Lucene search for some of them, so I have to configure several lsearch daemons.

Although I changed the Search.Port variable in the lsearch.conf file (Search.port=8124), the second lsearch daemon, when started after the first one, complains that port 8123 is already in use.

Log from first lsearch:

 452 [main] INFO  org.wikimedia.lsearch.interoperability.RMIServer  - RMIMessenger bound
 493 [main] INFO  org.wikimedia.lsearch.interoperability.RMIServer  - RemoteSearchable bound
 495 [Thread-1] INFO  org.wikimedia.lsearch.frontend.HTTPIndexServer  - Started server at port 8321
 495 [Thread-2] INFO  org.wikimedia.lsearch.frontend.SearchServer  - Binding server to port 8123
 497 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index hiflydb ...

Log from second lsearch:

 471 [main] INFO  org.wikimedia.lsearch.interoperability.RMIServer  - RMIMessenger bound
 511 [main] INFO  org.wikimedia.lsearch.interoperability.RMIServer  - RemoteSearchable bound
 514 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index sgidb ...
 565 [Thread-1] INFO  org.wikimedia.lsearch.frontend.HTTPIndexServer  - Started server at port 8322
 565 [Thread-2] INFO  org.wikimedia.lsearch.frontend.SearchServer  - Binding server to port 8123
 565 [Thread-2] FATAL org.wikimedia.lsearch.frontend.SearchServer  - Error: bind error: Address already in use

What am I doing wrong?

Thanks for your help. --2 October 2008
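A quick way to confirm which port is actually taken before starting a second daemon is to try binding it. This is a minimal Java sketch using only the standard library; the class name is hypothetical, and 8123 and 8124 are simply the ports mentioned in this thread.

```java
import java.io.IOException;
import java.net.ServerSocket;

final class PortCheck {
    /** Returns true if a listening socket can be opened on the port right now. */
    static boolean isFree(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            return true; // bound successfully; closed again on exit
        } catch (IOException e) {
            return false; // typically "Address already in use"
        }
    }

    public static void main(String[] args) {
        for (int port : new int[] {8123, 8124}) {
            System.out.println("port " + port + (isFree(port) ? " is free" : " is in use"));
        }
    }
}
```

If 8123 shows as in use while the first daemon runs and the second daemon still logs "Binding server to port 8123", the configured value is not reaching the search server.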

Hi,

I ran into the same problem, and I found out that the SearchServer class does not parse the configuration for Search.Port.

The HTTPIndexServer, on the other hand, does parse the configuration for Index.port.

I suggest that there should be code like the following in the SearchServer class as well:

 [...]
 public class HTTPIndexServer extends Thread {
 [...]
 int port = config.getInt("Index","port",8321);
 [...]

I will try this out. Hopefully I will post successful results afterwards.

Regards, -- Voglerp 14:21, 20 October 2008 (UTC)

So here are my test results:

I added the following two lines (the comment and the port assignment) to the SearchServer class:

 [...]
 public class SearchServer extends Thread {
 [...]
 org.apache.log4j.Logger log = Logger.getLogger(SearchServer.class);
 // Read port setting from configfile, if not found set default
 port = config.getInt("Search","port",8123);

 log.info("Searcher started on port " + port);
 [...]

Now the Searcher listens on the port specified in the configuration, or on the default port 8123. A new problem, however, is that it is no longer possible to specify the port on the command line with -port.

Is it possible to change the code so that both options work?

Kind regards, Peter --Voglerp 08:06, 23 October 2008 (UTC)
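
One possible approach, sketched under assumptions: resolve the port in a fixed order of precedence, with a `-port` value from the command line first, then the value from the configuration file, then the default 8123. The class `PortResolver` and its methods are hypothetical and not part of lucene-search. How SearchServer really receives its arguments is not shown in this thread, so the `OptionalInt` parameter only stands in for the result of a lookup such as the `config.getInt("Search","port",8123)` call above.

```java
import java.util.OptionalInt;

/** Hypothetical helper, not part of lucene-search. */
public class PortResolver {

    /** Default search port mentioned in this thread. */
    static final int DEFAULT_PORT = 8123;

    /** Returns the number following "-port" in args, if present and numeric. */
    static OptionalInt portFromArgs(String[] args) {
        for (int i = 0; i < args.length - 1; i++) {
            if (args[i].equals("-port")) {
                try {
                    return OptionalInt.of(Integer.parseInt(args[i + 1]));
                } catch (NumberFormatException e) {
                    return OptionalInt.empty(); // not a number: treat as absent
                }
            }
        }
        return OptionalInt.empty();
    }

    /**
     * Precedence: command line, then configuration file, then default.
     * configPort is empty when the configuration does not set a port.
     */
    static int resolvePort(String[] args, OptionalInt configPort) {
        OptionalInt cli = portFromArgs(args);
        if (cli.isPresent()) {
            return cli.getAsInt();
        }
        return configPort.orElse(DEFAULT_PORT);
    }

    public static void main(String[] args) {
        // 8124 is an assumed configuration value, used only for this demo.
        System.out.println("port: " + resolvePort(args, OptionalInt.of(8124)));
    }
}
```

With this ordering, a wiki that only sets the port in the configuration file keeps working, and `-port` still overrides it when given.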

Error when trying to run lsearch daemon
Every time I run lsearchd I get the following error:
 RMI registry started.
 [java] Trying config file at path /root/.lsearch.conf
 [java] Trying config file at path /usr/local/search/ls2-bin/lsearch.conf
 [java] Ignoring a line up to first section heading...
 [java] Ignoring a line up to first section heading...
 [java] Ignoring a line up to first section heading...
 [java] Ignoring a line up to first section heading...
 [java] Ignoring a line up to first section heading...
 [java] Ignoring a line up to first section heading...
 [java] Ignoring a line up to first section heading...
 [java] Ignoring a line up to first section heading...
 [java] Ignoring a line up to first section heading...
 [java] ERROR in GlobalConfiguration: Default path for index absent. Check section [Index-Path].

This is what the [Index-Path] section of the global config looks like:

 # Rsync path where indexes are on hosts, after default value put
 # hosts where the location differs
 # Syntax: host :
 [Index-Path]
 : /mwsearch

Any suggestions?

--Dgat16 20:28, 12 October 2008 (UTC)

Try the following:

 # Path where indexes are on hosts, after default value put hosts where
 # the location differs
 [Index-Path]
 : /mwsearch
 127.0.0.1 : mwsearch2

--Bachenberg 13:15, 27 August 2009 (UTC)

need help for small wiki farm
I have a small wiki farm with two wikis, mywiki-en and mywiki-de, running on the same wiki software and sharing the same MySQL database, wikidb.

The mysql tables for both wikis are prefixed, with en_ or with de_ respectively.

mywiki-en is in English

mywiki-de is in German.

I know how to make two separate dump files, wikidb_en.xml and wikidb_de.xml, by using the commands
 export REQUEST_URI=/wiki/en && php /wwd/wiki/maintenance/dumpBackup.php --current --quiet > wikidb_en.xml
 export REQUEST_URI=/wiki/de && php /wwd/wiki/maintenance/dumpBackup.php --current --quiet > wikidb_de.xml

My question is: how do I configure Lucene and mwsearch, so that

- for searches in /wiki/de it uses the indexes created from wikidb_de.xml,

- for searches in /wiki/en it uses the indexes created from wikidb_en.xml

I do not want hits from the English wiki to show up as search results for queries in /wiki/de, or the other way around.

I also need to know how to configure lsearch-global.conf. So far I have written there
 [Database]
 wikidb : (single) (language,en) (warmup,10)
but this is of course not correct: the database wikidb contains two wikis, one in German and one in English.

I hope that somebody can help me a bit.

Thank you, Alois 16:06, 29 October 2008 (UTC)

Searching what the user sees or searching what's behind the scenes
It seems to make no sense to search the unrendered wiki-text rather than the final product. I don't see why wiki comments should be included in the search but the contents of included templates are not. It really should be the other way round. Wikis using the Semantic MediaWiki extension also find that the results of inline queries are excluded from the search; that also seems like something that needs to change. Perhaps there is a place for a search that looks behind the scenes. It may be of interest to a wiki-site manager, but for a standard user the search really needs to be of the actual page contents. Pnelnik 17:41, 28 November 2008 (UTC)
 * Agreed that it doesn't. However, it is not a matter of whether it makes sense, but of whether it is difficult or easy to do. There is no easy way to reconstruct articles with templates from very large XML dumps, and no advanced way to integrate updates from OAI with templates, queues and such. This is one of those places where the flexibility of MW in one regard (e.g. syntax and caching) comes at a large cost in another (the ability to have a decent search). --Rainman 02:04, 30 November 2008 (UTC)

Lack of sane defaults
This extension suffers from a lack of sane defaults, which makes setting it up unnecessarily confusing. I will give some examples from the instructions.


 * mwdumper.jar: should be IN subversion. There is no reason to have to checkout the code for the extension and then get another file
 * speaking of subversion, the root should be moved up a level. The root should not be 'lucene-search-2' if you are going to ask them to put that in a parent directory called 'search'. The root should be 'search', and it should already contain the 'indexes' subdirectory. The instructions should then read 'svn co http://svn.wikimedia.org/svnroot/mediawiki/trunk/search /usr/local/search'.
 * MWConfig.global: specifically asks for a "URL", which has a very specific meaning, and gives an example of only a URL. That's great for a multi-host configuration, which most MediaWiki installations are not. The default path to this file should be /usr/local/search/ls2/lsearch-global.conf. If this is not an acceptable path, you should say so. The file:/// prefix that is used in these wiki instructions is not what people expect to see.
 * MWConfig.lib: Here you use a standard path, which people normally expect. But this is NOT what they expect since you have told them to use 'file:///' in the previous instructions on the wiki (but not in the configuration file). This is confusing!!!!
 * Localization.url: Back to the file:/// prefix. AGHHHH. There is no need to specify that it is a file. File paths are unambiguous without file:///.
 * Logging.logconfig: There is no reason to prompt the user for the location of this file if you put it in the ls2 directory by default, and make that the default location.

I believe that, up to this point, every single configuration step could have been avoided if there had been sane defaults in place. I don't have the energy to do the rest. --Alterego 18:47, 5 January 2009 (UTC)


 * I agree that the configuration is overly complicated; that is why the devel branch has a one-step script that will generate and connect all of the configuration in single-host installs. As for the URL/local file distinction, it follows a simple rule: everything that is global and shared across the search cluster (e.g. global config and MW files) is a URL, everything local (e.g. local config, indexes path, local log4j config and library files...) is a local path, although that is probably not obvious from the variable names... --Rainman 19:20, 5 January 2009 (UTC)

LSEARCH Daemon init script for SUSE

 * from Pierre Boisvert.
 * This is our init script for the daemon. It is simple but works for us, so it could help others as well.
 #!/bin/bash
 # chkconfig: 2345 80 20
 # description: Apache Lucene is a high-performance, full-featured text \
 #              search engine library written entirely in Java
 # processname: lsearchd
 # config: /etc/lsearch.conf
 # pidfile: /var/run/lsearchd.pid

 # Source function library.
 . /etc/rc.status

 JAVA=/usr/bin/java
 PROG=lsearchd
 BASEDIR=/usr/local/bin/ls2-bin
 LOG_FILE=/var/log/lsearchd.log
 PID_FILE=/var/run/lsearchd.pid
 PROG_BIN="$JAVA -Djava.rmi.server.codebase=file://$BASEDIR/LuceneSearch.jar -Djava.rmi.server.hostname=$HOSTNAME -jar $BASEDIR/LuceneSearch.jar"
 # Number of running processes whose command line mentions $JAVA
 CHECK_PROC=`ps -ef | grep $JAVA | grep -v grep | wc -l`

 rc_reset

 start() {
     echo -n $"Starting $PROG: "
     if [ ! -f $PID_FILE ]
     then
         # No PID file: start the daemon and record its PID
         $PROG_BIN $* >$LOG_FILE 2>&1 &
         echo $! > $PID_FILE
     else
         if [ $CHECK_PROC -gt 0 ]
         then
             echo "The LSEARCHD Daemon already started"
             rc_failed
         else
             # Stale PID file: remove it and start again
             echo "Removing old Pid file..."
             rm $PID_FILE
             $PROG_BIN $* >$LOG_FILE 2>&1 &
             echo $! > $PID_FILE
         fi
     fi
     rc_status -v
 }

 stop() {
     echo -n $"Stopping $PROG: "
     /sbin/killproc -p $PID_FILE -v $JAVA
     rc_status -v
 }

 status() {
     echo -n "Checking for Lsearchd daemon "
     checkproc -p $PID_FILE $JAVA
     rc_status -v
 }

 usage() {
     echo $"Usage: ${PROG} {start|stop|restart|status}"
     exit 1
 }

 # See how we were called.
 case "$1" in
     start)      start;;
     stop)       stop;;
     status)     status;;
     restart)    stop && start;;
     *)          usage;;
 esac
 rc_exit