Extension talk:MWSearch

What are the advantages/disadvantages of MWSearch extension over Lucene extension ? Thank you.

MWSearch is being maintained and developed, while LuceneSearch is not. The differences are minor: MWSearch has a somewhat better and more consistent interface, e.g. it has galleries for Image-namespace hits. --Rainman 15:01, 24 April 2008 (UTC)

Does MWSearch support Windows + Apache / Windows + IIS ?
 * No, the search backend does not support windows. --Rainman 13:52, 12 May 2008 (UTC)
 * That's not true; I just managed to compile it under Ubuntu and am now running it on a Windows 2003 server --193.27.220.82 13:55, 28 May 2008 (UTC)

MWSearch over SSL?
Is it possible to use MWSearch over SSL? If so, is there a special configuration needed? (I ask because I've been unable to get it working and I'm not permitted to open additional ports.)
 * Sorry to answer my own question, but I was able to make this work by editing my hosts file to point the domain to 127.0.0.1 instead of the external IP. I don't really like this solution, so I'm hoping it isn't the preferred method. Does anyone have any suggestions?
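 * Since MWSearch talks to the search daemon over plain HTTP on its own port, another way around the SSL problem may be to point the extension straight at the loopback address instead of at the wiki's public (SSL) hostname, which avoids the hosts-file edit. A sketch of the relevant LocalSettings.php fragment -- the host and port values here are assumptions based on a default lucene-search setup, not something confirmed on this page:

```php
<?php
// Hypothetical LocalSettings.php fragment: let MWSearch reach the search
// daemon directly over the loopback interface, so the backend request never
// touches the SSL front end or the external IP.
$wgLuceneHost = '127.0.0.1'; // search daemon listening locally
$wgLucenePort = 8123;        // default lucene-search-2 search port
```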

MediaWiki_SVN+Lucene-Search2_SVN+MWSearch_SVN = ZERO search results
I've been trying for a very long time to get Lucene-search working on my MediaWiki site -- I've spent hours, days, weeks, and months troubleshooting this one issue. This is my last effort: I'm hoping someone can help me, because otherwise I'm thinking of moving away from MediaWiki entirely (which would make me a very sad panda) and painstakingly importing my wiki into Drupal to try its neat-sounding Search_Attachments module. I have put over a year's work into my MediaWiki-based wiki, and I truly need what Extension:Lucene-search + Extension:MWSearch offers, but it has been next to impossible for me to implement on my Slackware server. Of course, the trouble could be something I am doing wrong -- I'm human enough to admit that if it turns out to be the case. That's why I'm asking for assistance; I'm also thinking of posting to the MediaWiki forums and linking here (I hope that's OK), because I really, really want to get this working. I've been at it for over six months, posted a lot of my earlier issues on the Lucene-Search talk page, and never found a solution, even after trying the newer, more up-to-date SVN versions of both extensions.
I am back, now with a brand-new computer (well, older hardware, but a newly formatted hard drive), a freshly installed OS, and a very basic website with some known data to test search against. Below is my current overall setup. I've gone over the directions on the Extension:Lucene-search and Extension:MWSearch pages again and again, and I still cannot get this working. Now that I've tried on a clean install with everything new, I'm starting to think the problem isn't on my end -- but I could be wrong. I've documented everything I did from installation to now, so maybe by posting my logs and steps here from beginning to end, someone will "see something" I'm missing. Please help! =)
 * Slackware 12.1, on i686 Pentium III (Linux 2.6.24.5-smp = Slackware 12.1's generic-smp-2.6.24.5-smp kernel)
 * MediaWiki: 1.13alpha (SVN 06-25-2008)
 * PHP: 5.2.6 (I used Slackware 12.1's PHP v5.2.6 update package)
 * MySQL: 5.0.51b
 * MediaWiki extension(s): [[Extension:MWSearch|MWSearch]] SVN 06-25-2008 and Lucene-search2 SVN 06-25-2008; I also downloaded & installed mwdumper.jar into the Lucene-search2 "lib" dir = /usr/local/search/ls2
 * other tools: jre-6u6-i586-3, jdk-1_5_0_09-i586-1, apache-ant-1.7.0-i486, rsync-3.0.2-i486-1

SVN install of MediaWiki
> svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/phase3
 * Checked out revision 36630
 * I moved everything to my htdocs folder, into a NEW directory = /var/www/htdocs/wiki-svn060108

> chmod a+x /var/www/htdocs/wiki-svn060108/config
 * I ran the first-time config via http:// /wiki-svn060108/config and configured it as follows:


 * wikiname: NOC Archive
 * contact email: rprior@newedgenetworks.com
 * language: en - english
 * license:  GNU Free Documentation License 1.2 (Wikipedia-compatible)
 * admin username: rprior
 * admin password: xxxxxxxxx
 * wikiDB name: svnwikidb
 * DB username: svnwikiuser
 * DB password: xxxxxxxxx
 * database character set:  Experimental MySQL 4.1/5.0 UTF-8


 * I created the MySQL DB and gave myself permissions

> mysql -u root -p

mysql> create database svnwikidb character set utf8;
Query OK, 1 row affected (0.00 sec)

mysql> GRANT SELECT,INSERT,UPDATE,DELETE,CREATE,DROP
    -> ON svnwikidb.*
    -> TO 'svnwikiuser'@'localhost'
    -> IDENTIFIED BY 'xxxxxxxx';
Query OK, 0 rows affected (0.03 sec)

mysql> exit


 * moved /var/www/htdocs/wiki-svn060108/config/LocalSettings.php TO /var/www/htdocs/wiki-svn060108/

> chown root:apache LocalSettings.php
> chmod 700 LocalSettings.php
> rm -r /var/www/htdocs/wiki-svn060108/config


 * pulled up my new wiki via the page http://nen-tftp.techiekb.com/wiki-svn06252008/index.php/Main_Page = and IT WORKS!


 * I put some basic data that I knew would be searchable on the front/1st page

Installation of Lucene-search2 + MWSearch extensions
> cd /var/www/htdocs/wiki-svn06252008/extensions
> svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/MWSearch

> ln -s /usr/lib/jdk1.5.0_09/lib/tools.jar /usr/lib/java/lib

> ls -al /usr/lib/java/lib/
lrwxrwxrwx 1 root root 34 Jun 25 03:19 tools.jar -> /usr/lib/jdk1.5.0_09/lib/tools.jar

> cd /tmp
> mkdir lucene-search-2
> cd lucene-search-2/
> svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/lucene-search-2/
> mkdir /usr/local/search
> mkdir /usr/local/search/ls2
> cd /usr/local/search/ls2
> mv /tmp/lucene-search-2/lucene-search-2/* ./
> cd /usr/local/search/ls2/lib
> wget tools/mwdumper.jar
> mkdir /usr/local/search/indexes
> cd /usr/local/search/ls2

lsearch.conf - my configuration file build
MWConfig.global=file:///etc/lsearch-global.conf
MWConfig.lib=/usr/local/search/ls2/lib
Indexes.path=/usr/local/search/indexes
Search.updateinterval=1
Search.updatedelay=0
Search.checkinterval=30
Index.snapshotinterval=5
Index.maxqueuecount=5000
Index.maxqueuetimeout=12
Storage.master=localhost
Storage.useSeparateDBs=false
Storage.defaultDB=lsearch
Storage.lib=/usr/local/search/ls2/sql
SearcherPool.size=3
Localization.url=file:///var/www/htdocs/wiki-svn06252008/languages/messages
OAI.username=user
OAI.password=pass
OAI.maxqueue=5000
Logging.logconfig=/etc/lsearch.log4j
Logging.debug=true
 * I created a symlink for /etc/lsearch.conf that points to the actual file = /usr/local/search/ls2/lsearch.conf

ln -s /usr/local/search/ls2/lsearch.conf /etc

/etc/lsearch.log4j - my configuration file build
log4j.rootLogger=INFO, A1
log4j.appender.A1=org.apache.log4j.ConsoleAppender
log4j.appender.A1.layout=org.apache.log4j.PatternLayout
log4j.appender.A1.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n

/etc/lsearch-global.conf - my configuration file build
[Database]
svnwikidb : (single) (language,en) (warmup,10)

[Search-Group]
nen-tftp : svnwikidb

[Index]
Database.suffix=wiki wiktionary svnwikidb
KeywordScoring.suffix=svnwikidb wiki wikilucene wikidev
ExactCase.suffix=svnwikidb wiktionary wikilucene

[Namespace-Prefix]
all :
[0] : 0
[1] : 1
[2] : 2
[3] : 3
[4] : 4
[5] : 5
[6] : 6
[7] : 7
[8] : 8
[9] : 9
[10] : 10
[11] : 11
[12] : 12
[13] : 13
[14] : 14
[15] : 15

built LuceneSearch.jar via ANT
> ln -s /opt/apache-ant/bin/ant /bin
> ant
Buildfile: build.xml
build:
    [mkdir] Created dir: /usr/local/search/ls2/bin
    [javac] Compiling 101 source files to /usr/local/search/ls2/bin
    [javac] Note: Some input files use or override a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] Note: Some input files use unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.
alljar:
    [jar] Building jar: /usr/local/search/ls2/LuceneSearch.jar
BUILD SUCCESSFUL
Total time: 24 seconds

LuceneSearch extension added to LocalSettings.php

 * I added the following to my /var/www/htdocs/wiki-svn060108/LocalSettings.php file ;

$wgSearchType = 'LuceneSearch';
$wgLuceneHost = 'localhost';
$wgLucenePort = 8123;
require_once("extensions/MWSearch/MWSearch.php");

created a dumpBackup.sh script to automate building of my index
php /var/www/htdocs/wiki-svn06252008/maintenance/dumpBackupInit.php --current --quiet > wikidb.xml \
  && java -cp /usr/local/search/ls2/LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s wikidb.xml svnwikidb

> chmod 750 dumpBackup.sh
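The one-liner above will happily run the Java import step even if the PHP dump step produced an empty file. A slightly more defensive sketch of dumpBackup.sh -- the real php/java commands from this page are left commented out here, and a stand-in dump line is used so the size check itself can be demonstrated:

```shell
#!/bin/sh
# Sketch: dumpBackup.sh with a sanity check between the dump and import steps.
set -e
DUMP=wikidb.xml

# Real dump step from this page (commented out in this sketch):
# php /var/www/htdocs/wiki-svn06252008/maintenance/dumpBackupInit.php --current --quiet > "$DUMP"
# Stand-in content so the check below can be demonstrated:
printf '%s\n' '<mediawiki><page><title>Main Page</title></page></mediawiki>' > "$DUMP"

# Refuse to rebuild the index from an empty or missing dump
if [ ! -s "$DUMP" ]; then
    echo "dump failed: $DUMP is empty" >&2
    exit 1
fi
echo "dump OK: $DUMP is non-empty"

# Real import step from this page (commented out in this sketch):
# java -cp /usr/local/search/ls2/LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s "$DUMP" svnwikidb
```

With `set -e` and the size check, a failed dump stops the script before the importer can rebuild the index from bad input.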

created file = dumpBackupInit.php

 * created file /var/www/htdocs/wiki-svn06252008/maintenance/dumpBackupInit.php with 755 permissions; the file looks like this:

<?php
# dumpBackupInit - wrapper script to run the MediaWiki XML dump "dumpBackup.php" correctly
# @author: Stefan Furcht
# @version: 1.0
# @require: /srv/www/htdocs/wiki-svn06252008/maintenance/dumpBackup.php
# The following variables must be set for dumpBackup.php to work;
# you'll find these values in the DB section of your MediaWiki config, LocalSettings.php
$wgDBtype     = "mysql";
$wgDBserver   = "localhost";
$wgDBname     = "svnwikidb";
$wgDBuser     = "svnwikiuser";
$wgDBpassword = "xxxxxxxx";
$wgDBprefix   = "";
# $wgDBport   = "5432";
# simply include the original dumpBackup script
require_once("dumpBackup.php");

 * I then ran my "dumpBackup.sh" file from the command line:

/srv/www/htdocs/wiki-svn06252008/dumpBackup.sh
 * This creates an XML dump of my wiki DB in a file called wikidb.xml, which seems to work just fine. The file is 3.6 KB -- pretty small, since there isn't much in my brand-new wiki, just some text I know should be easily found once the search function is working properly.

starting the lucene-search2 daemon
I start the lucene-search2 daemon with this command line: /usr/local/search/lucene-search-2svn05112008/lsearchd
 * The program loads and prints some information to the console I am logged into (some of that text follows):

RMI registry started.
Trying config file at path /root/.lsearch.conf
Trying config file at path /var/www/htdocs/wiki-svn06252008/lsearch.conf
Trying config file at path /etc/lsearch.conf
log4j: Parsing for [root] with value=[INFO, A1].
log4j: Level token is [INFO].
log4j: Category root set to INFO
log4j: Parsing appender named "A1".
log4j: Parsing layout options for "A1".
log4j: Setting property [conversionPattern] to [%-4r [%t] %-5p %c %x - %m%n].
log4j: End of parsing for "A1".
log4j: Parsed "A1" options.
log4j: Finished configuring.
0    [main] INFO  org.wikimedia.lsearch.util.Localization  - Reading localization for En
2804 [main] INFO  org.wikimedia.lsearch.util.UnicodeDecomposer  - Loaded unicode decomposer
3068 [main] INFO  org.wikimedia.lsearch.interoperability.RMIServer  - RMIMessenger bound
3351 [main] INFO  org.wikimedia.lsearch.interoperability.RMIServer  - RemoteSearchable bound
3374 [Thread-2] INFO  org.wikimedia.lsearch.frontend.HTTPIndexServer  - Started server at port 8321
3386 [Thread-3] INFO  org.wikimedia.lsearch.frontend.SearchServer  - Binding server to port 8123
3407 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index svnwikidb ...
4737 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warmed up svnwikidb in 1330 ms
4738 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index svnwikidb ...
5629 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warmed up svnwikidb in 891 ms
5630 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index svnwikidb ...
6203 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warmed up svnwikidb in 573 ms

My MediaWiki Special:Version page

 * My wiki's Special:Version page indicates that PHP, MySQL, and MWSearch are all properly initialized/recognized by MediaWiki:

- INSTALLED SOFTWARE - INSTALLED EXTENSIONS
 * MediaWiki 1.13alpha
 * PHP 5.2.6 (apache2handler)
 * MySQL 5.0.51b
 * MWSearch (Version r36482) - MWSearch plugin - Brion Vibber and Kate Turner

The actual PROBLEM is NO SEARCH RESULTS
Now that I have everything set up and the lucene-search2 daemon running, I tried a search on my website... fingers crossed... I typed in a known word that IS on the front page and is also in my XML dump of the MySQL DB (wikidb.xml) -- and sure enough, I get ZERO search results! MediaWiki's search results page shows this:

Search results
From AgentDcooper's Wiki
You searched for wiki
For more information about searching AgentDcooper's Wiki, see Searching AgentDcooper's Wiki.
Showing below 0 results starting with #1.
No page text matches
Note: Unsuccessful searches are often caused by searching for common words like "have" and "from", which are not indexed, or by specifying more than one search term (only pages containing all of the search terms will appear in the result).

Troubleshooting the ZERO results issue
Since I have a console session open with the lucene-search2 daemon running, I noticed that as soon as I hit the Search button after typing my search phrase (loopback) into the MediaWiki search box, the daemon console scrolls the following:

893553 [pool-2-thread-1] INFO  org.wikimedia.lsearch.frontend.HttpHandler  - query:/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10 what:search dbname:svnwikidb term:loopback
893567 [pool-2-thread-1] INFO  org.wikimedia.lsearch.search.SearchEngine  - Using NamespaceFilterWrapper wrap: {0}
893592 [pool-2-thread-1] INFO  org.wikimedia.lsearch.search.SearchEngine  - search svnwikidb: query=[loopback] parsed=[contents:loopback (title:loopback^6.0 stemtitle:loopback^2.0) (alttitle1:loopback^4.0 alttitle2:loopback^4.0 alttitle3:loopback^4.0) (keyword1:loopback^0.02 keyword2:loopback^0.01 keyword3:loopback^0.0066666664 keyword4:loopback^0.0050 keyword5:loopback^0.0039999997)] hit=[1] in 12ms using IndexSearcherMul:1214736931039


 * I've been troubleshooting this issue for a long time, so I do know how to enable MediaWiki debugging -- here is what my /var/log/mediawiki/debug_svn_log.txt shows:

Start request GET /wiki-svn06252008/index.php/Special:Search?search=loopback&fulltext=Search
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://nen-tftp.techiekb.com/wiki-svn06252008/index.php/Special:Search?search=loopback&fulltext=Search
Cookie: wikidbToken=dd6c9b732dba0c94b04ad72044d46d79; wikidbUserName=Rprior; wikidbUserID=2; wikidb_session=5rph1dsoik5dpdlcitc1canlr0; svnwikidb_session=n8btqun31sn6vnubiek79l5br6; svnwikidbUserID=1; svnwikidbUserName=Rprior; svnwikidbToken=baea562c5be4148475a179c94a6868d4
Authorization: Basic ZGNvb3Blcjp0ZXN0cGFzcw==

Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
session_set_cookie_params: "0", "/", "", "", "1"
Fully initialised
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Language::loadLocalisation: got localisation for en from source
Unstubbing $wgOut on call of $wgOut->setArticleRelated from SpecialPage::setHeaders
Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey
Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
Unstubbing $wgUser on call of $wgUser->getOption from StubUserLang::_newObject
Cache miss for user 1
Connecting to localhost svnwikidb... Connected
Logged in from session
MessageCache::load: Loading en... got from global cache
Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform
Fetching search data from http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10
Http::request: GET http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10
total [0] hits
OutputPage::sendCacheControl: private caching; **
Request ended normally


 * That is the part I am having the most trouble with!
 * Follow me here: everything actually seems to work until the third-to-last line of the debug output. The part that doesn't appear to be working is that line: total [0] hits.
 * Here is why I think so: if I pull up the address http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10 on the web server itself via lynx or any other web browser, I get the following output!

1 1.0 0 Main_Page


 * That output appears to say that there IS one page that matches -- Main_Page! Does that sound right? I suspect something must be going wrong between the backend and MediaWiki. It has been suggested before that my cURL library might be at fault, but that was on an earlier Slackware (12.0; I am now running 12.1). According to my plain-vanilla Slackware 12.1 install (except for the newer PHP), my cURL package is curl-7.16.2-i486-1. I don't think cURL is the problem, but I am completely open to anyone's interpretation of the issue and would love to work with someone toward a solution. It looks like the backend is working, but MWSearch is not passing the data back to MediaWiki properly?
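 * For what it's worth, the raw backend reply shown above can be decoded by hand. Assuming the format suggested by that output (first line = total hit count, then one "score namespace title" line per hit -- an assumption read off this page, not a documented spec), a small sketch:

```shell
# Hypothetical decoder for the raw lucene-search reply shown above.
response='1
1.0 0 Main_Page'

# First line: total number of hits
total=$(printf '%s\n' "$response" | head -n 1)
echo "total hits: $total"

# Remaining lines: one hit per line as "score namespace title"
printf '%s\n' "$response" | tail -n +2 | while read -r score ns title; do
    echo "hit: ns=$ns title=$title score=$score"
done
```

Read that way, the daemon really is returning one hit (Main_Page in namespace 0), so the loss would have to happen on the MediaWiki/MWSearch side of the HTTP fetch.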

Does anyone have any ideas? Please help -- I really don't want to move away from MediaWiki, but I very much need this functionality! Thanks in advance, and sorry to be long-winded; I just wanted to give as much detail as possible. If you have any questions, feel free to ask!

PS :: I tried ExtensionFunctions.php SVN-06-25-2008
BTW, in my reading and troubleshooting I saw a note saying I should download the file ExtensionFunctions.php, so I pulled it down from SVN (06-25-2008):

> wget http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/ExtensionFunctions.php

> mv ExtensionFunctions.php /var/www/htdocs/wiki-svn06252008

This did not resolve my issue at all, still seeing the same problem. ANYONE HAVE ANY IDEAS?
 * If you add wfDebug(print_r($data, true)); in MWSearch_body.php file right after the $data = Http::get( $searchUrl ); line, does that give something useful in your debug log? Or does it give a null? 83.81.5.126 15:35, 29 June 2008 (UTC)
 * Also putting a wfDebug("Raw results [$totalHits]\n"); before $totalHits = intval( $totalHits ); might give useful results 83.81.5.126 15:38, 29 June 2008 (UTC)


 * Thanks for the help! I tried both suggestions. I don't think I fully understand the log, but it certainly looks like your suggestions put more information in there.


 * First I added wfDebug(print_r($data, true)); in the extensions/MWSearch/MWSearch_body.php file, right after the $data = Http::get( $searchUrl ); line.
 * my debug is now outputting the following upon a search that shows no results in my MediaWiki ;;

Start request GET /wiki-svn06252008/index.php/Special:Search?search=loopback&fulltext=Search
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://nen-tftp.techiekb.com/wiki-svn06252008/index.php/Main_Page
Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
Fully initialised
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Language::loadLocalisation: got localisation for en from source
Unstubbing $wgOut on call of $wgOut->setArticleRelated from SpecialPage::setHeaders
Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey
Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
Unstubbing $wgUser on call of $wgUser->getOption from StubUserLang::_newObject
Connecting to localhost svnwikidb... IP: 24.20.24.50
Connected
MessageCache::load: Loading en... got from global cache
Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform
Fetching search data from http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10
Http::request: GET http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10
total [0] hits
OutputPage::sendCacheControl: private caching; **
Request ended normally

Start request GET /wiki-svn06252008/index.php?title=Special:RecentChanges&feed=rss
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
Fully initialised
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Language::loadLocalisation: got localisation for en from source
Unstubbing $wgOut on call of $wgOut->setSquidMaxage from SpecialRecentChanges::execute
Connecting to localhost svnwikidb... Connected
Unstubbing $wgUser on call of $wgUser->getOption from OutputPage::checkLastModified
OutputPage::checkLastModified: client did not send If-Modified-Since header
Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey
Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
IP: 24.20.24.50
MessageCache::load: Loading en... got from global cache
Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform

Start request GET /wiki-svn06252008/index.php?title=Special:RecentChanges&feed=atom
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
Fully initialised
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Language::loadLocalisation: got localisation for en from source
RC: loading feed from cache (svnwikidb:rcfeed:rss:limit:50:minor:; 20080629225416; 20080629103756)...
RC: Outputting cached feed
OutputPage::sendCacheControl: private caching; Sun, 29 Jun 2008 10:37:56 GMT **
Unstubbing $wgOut on call of $wgOut->setSquidMaxage from SpecialRecentChanges::execute
Connecting to localhost svnwikidb... Connected
Unstubbing $wgUser on call of $wgUser->getOption from OutputPage::checkLastModified
OutputPage::checkLastModified: client did not send If-Modified-Since header
Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey
Request ended normally
Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
IP: 24.20.24.50
MessageCache::load: Loading en... got from global cache
Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform
RC: loading feed from cache (svnwikidb:rcfeed:atom:limit:50:minor:; 20080629225418; 20080629103756)...
RC: Outputting cached feed
OutputPage::sendCacheControl: private caching; Sun, 29 Jun 2008 10:37:56 GMT **
Request ended normally
 * I am going to try your 2nd suggestion now. Again, thanks for helping -- it really means a lot to me! Agentdcooper


 * OK, this is what my debug log shows when I put wfDebug("Raw results [$totalHits]\n"); before $totalHits = intval( $totalHits ); in the extensions/MWSearch/MWSearch_body.php file while leaving your 1st suggestion in = wfDebug(print_r($data, true)); ;

Start request GET /wiki-svn06252008/index.php/Special:Search?search=loopback&fulltext=Search
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://nen-tftp.techiekb.com/wiki-svn06252008/index.php/Special:Search?search=loopback&fulltext=Search
Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
Fully initialised
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Language::loadLocalisation: got localisation for en from source
Unstubbing $wgOut on call of $wgOut->setArticleRelated from SpecialPage::setHeaders
Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey
Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
Unstubbing $wgUser on call of $wgUser->getOption from StubUserLang::_newObject
Connecting to localhost svnwikidb... IP: 24.20.24.50
Connected
MessageCache::load: Loading en... got from global cache
Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform
Fetching search data from http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10
Http::request: GET http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10
Raw results []
total [0] hits
OutputPage::sendCacheControl: private caching; **
Request ended normally

Start request GET /wiki-svn06252008/index.php?title=Special:RecentChanges&feed=rss
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
Fully initialised

Start request GET /wiki-svn06252008/index.php?title=Special:RecentChanges&feed=atom
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
Fully initialised
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Language::loadLocalisation: got localisation for en from source
Language::loadLocalisation: got localisation for en from source
Unstubbing $wgOut on call of $wgOut->setSquidMaxage from SpecialRecentChanges::execute
Connecting to localhost svnwikidb...
Unstubbing $wgOut on call of $wgOut->setSquidMaxage from SpecialRecentChanges::execute
Connecting to localhost svnwikidb...
Connected
Unstubbing $wgUser on call of $wgUser->getOption from OutputPage::checkLastModified
Connected
Unstubbing $wgUser on call of $wgUser->getOption from OutputPage::checkLastModified
OutputPage::checkLastModified: client did not send If-Modified-Since header
Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey
OutputPage::checkLastModified: client did not send If-Modified-Since header
Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey
Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
IP: 24.20.24.50
MessageCache::load: Loading en... got from global cache
Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform
IP: 24.20.24.50
RC: loading feed from cache (svnwikidb:rcfeed:rss:limit:50:minor:; 20080629225416; 20080629103756)...
RC: Outputting cached feed
OutputPage::sendCacheControl: private caching; Sun, 29 Jun 2008 10:37:56 GMT **
MessageCache::load: Loading en... got from global cache
Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform
Request ended normally
RC: loading feed from cache (svnwikidb:rcfeed:atom:limit:50:minor:; 20080629225418; 20080629103756)...
RC: Outputting cached feed
OutputPage::sendCacheControl: private caching; Sun, 29 Jun 2008 10:37:56 GMT **
Request ended normally


 * I hope this helps. To me it looks like the "Raw results" are nada/zero -- why would that be? I would love to know how to correct this if possible! Any assistance you can provide is more than helpful! --Agentdcooper 01:02, 30 June 2008 (UTC)
 * I definitely think there is something wrong with HTTP::get. More specifically I think MediaWiki is doing something wrong with proxies... Try to disable CURL by replacing if ( function_exists( 'curl_init' ) ) { in includes/HttpFunctions.php with if ( function_exists( 'curl_init' ) && false) {.
 * The problem is probably due to the fact that you are running Lucene on localhost, while Wikimedia uses Lucene on foreign hosts and there might be a bug with HttpFunctions on localhost, but not on port 80. 83.81.5.126 18:09, 30 June 2008 (UTC)


 * Thank you for sticking around to help (whoever you are!) -- I am thrilled to see you still here! I think you are dead-on. I was thinking it might be something with the MWSearch extension, but it sounds like you think it's an issue with the way MediaWiki uses Http::get? I like where you're going, and I am willing to do anything you suggest to isolate this -- I really, really want this to work, more than anything!


 * As you suggested, in the file /var/www/htdocs/wiki-svn06252008/includes/HttpFunctions.php I changed line 25 FROM:

if ( function_exists( 'curl_init' ) ) {
 * TO ;

if ( function_exists( 'curl_init' ) && false) {
 * I then saved the updated file (and made a backup of the original), and did another update/import of my wiki DB (php maintenance/dumpBackupInit.php --current --quiet > wikidb.xml && java -cp /usr/local/search/ls2/LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s wikidb.xml svnwikidb -- which, BTW, works just fine). From there I went to my wiki main page and ran a search for a word that appears 3 times on Main_Page and 3 times in the XML dump of my wiki DB... The search came up with this error (again, so sad):

You searched for loopback
For more information about searching NOC Information Archive, see Help.
No page text matches


 * This is what the Lucene-Search-2 daemon (lsearchd) console printed after I ran my search:

1384087 [pool-2-thread-6] INFO  org.wikimedia.lsearch.frontend.HttpHandler  - query:/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10 what:search dbname:svnwikidb term:loopback
1384089 [pool-2-thread-6] INFO  org.wikimedia.lsearch.search.SearchEngine  - Using NamespaceFilterWrapper wrap: {0}
1384093 [pool-2-thread-6] INFO  org.wikimedia.lsearch.search.SearchEngine  - search svnwikidb: query=[loopback] parsed=[contents:loopback (title:loopback^6.0 stemtitle:loopback^2.0) (alttitle1:loopback^4.0 alttitle2:loopback^4.0 alttitle3:loopback^4.0) (keyword1:loopback^0.02 keyword2:loopback^0.01 keyword3:loopback^0.0066666664 keyword4:loopback^0.0050 keyword5:loopback^0.0039999997)] hit=[1] in 4ms using IndexSearcherMul:1214853636412


 * I left in place the debug settings you suggested yesterday (or the day before?), so this is what my /var/log/mediawiki/debug_svn_log.txt shows after I ran my search:

Start request GET /wiki-svn06252008/index.php/Special:Search?search=loopback&fulltext=Search
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://nen-tftp.techiekb.com/wiki-svn06252008/index.php?title=Special%3ASearch&search=all%3Aloopback&ns0=1&fulltext=Search
Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
Fully initialised
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Language::loadLocalisation: got localisation for en from source
Unstubbing $wgOut on call of $wgOut->setArticleRelated from SpecialPage::setHeaders
Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey
Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
Unstubbing $wgUser on call of $wgUser->getOption from StubUserLang::_newObject
Connecting to localhost svnwikidb...
IP: 24.20.24.50
Connected
MessageCache::load: Loading en...
got from global cache
Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform
Fetching search data from http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10
Http::request: GET http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10
Raw results [] total [0] hits
OutputPage::sendCacheControl: private caching; **
Request ended normally
Start request GET /wiki-svn06252008/index.php?title=Special:RecentChanges&feed=rss
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
Fully initialised
Start request GET /wiki-svn06252008/index.php?title=Special:RecentChanges&feed=atom
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Main cache: FakeMemCachedClient
Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
Fully initialised
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Language::loadLocalisation: got localisation for en from source
Language::loadLocalisation: got localisation for en from source
Unstubbing $wgOut on call of $wgOut->setSquidMaxage from SpecialRecentChanges::execute
Connecting to localhost svnwikidb...
Unstubbing $wgOut on call of $wgOut->setSquidMaxage from SpecialRecentChanges::execute
Connecting to localhost svnwikidb...
Connected
Unstubbing $wgUser on call of $wgUser->getOption from OutputPage::checkLastModified
Connected
Unstubbing $wgUser on call of $wgUser->getOption from OutputPage::checkLastModified
OutputPage::checkLastModified: client did not send If-Modified-Since header
Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey
OutputPage::checkLastModified: client did not send If-Modified-Since header
Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey
IP: 24.20.24.50
MessageCache::load: Loading en...
got from global cache
Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform
Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
IP: 24.20.24.50
RC: loading feed from cache (svnwikidb:rcfeed:rss:limit:50:minor:; 20080629225416; 20080629103756)...
RC: Outputting cached feed
OutputPage::sendCacheControl: private caching; Sun, 29 Jun 2008 10:37:56 GMT **
MessageCache::load: Loading en...
got from global cache
Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform
Request ended normally
RC: loading feed from cache (svnwikidb:rcfeed:atom:limit:50:minor:; 20080629225418; 20080629103756)...
RC: Outputting cached feed
OutputPage::sendCacheControl: private caching; Sun, 29 Jun 2008 10:37:56 GMT **
Request ended normally


 * So again, I am able to pull up the link listed in the debug file on the line beginning with "Http::request: GET" = http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10; this pulls up the following (which sure seems like at least THAT part of the Lucene search pipeline is working!):

1
1.0 0 Main_Page
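For reference, that raw reply is a simple line-oriented format, and parsing it mechanically can help when eyeballing daemon output during debugging. A minimal sketch of my own (an illustration based only on the reply shown above, not the extension's actual parser, which lives around LuceneSearchSet):

```python
def parse_lsearch_response(body):
    """Parse the plain-text reply of the lucene-search daemon.

    Based on the raw output shown above: the first line is the total
    number of hits; each later line is "<score> <namespace> <title>".
    Illustration only, not MWSearch's real parsing code.
    """
    lines = [ln for ln in body.strip().splitlines() if ln.strip()]
    total = int(lines[0])
    hits = []
    for ln in lines[1:]:
        score, ns, title = ln.split(" ", 2)
        hits.append({"score": float(score), "ns": int(ns), "title": title})
    return total, hits

total, hits = parse_lsearch_response("1\n1.0 0 Main_Page\n")
```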


 * I really think you are definitely onto something here; I'm just not sure where to go next. Any more assistance you can give is much appreciated. I think others may run into this same issue, so who knows, maybe you've helped identify a bug?! Thanks again! --Agentdcooper 19:58, 30 June 2008 (UTC)
 * (the IP was me) OK, this is really strange. At this point I'm very sure that something goes wrong in MediaWiki's HTTP fetching capabilities; I just don't know what. Two more ways to test this (you don't need to reinstall anything):


 * Open a terminal and change the directory to your wiki base: cd /var/www/htdocs/wiki-svn06252008
 * Start a PHP debugging session: php maintenance/eval.php
 * See whether you get something with the following command: $sock = fsockopen('127.0.0.1', 8123); fwrite($sock, "GET /search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10 HTTP/1.0\r\nHost: localhost\r\n\r\n"); print fread($sock, 8192);
 * If that returns proper results, test whether MediaWiki works properly using: print Http::get('http://127.0.0.1:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10');
 * If that still works try: print Http::get('http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10');
 * I hope we get something out of that. Bryan 20:52, 30 June 2008 (UTC)
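The raw-socket check above can also be reproduced outside PHP entirely, which helps separate a network problem from a PHP one. A minimal Python sketch of my own, assuming the daemon listens on 127.0.0.1:8123 as elsewhere in this thread:

```python
import socket

def build_request(path, host):
    """Build the same minimal HTTP/1.0 request the fsockopen test sends."""
    return ("GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host)).encode("ascii")

def raw_query(host, port, path):
    """Send the request and return the daemon's raw response as text."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(build_request(path, host))
        chunks = []
        while True:
            data = sock.recv(8192)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", "replace")

# Example (requires a running lsearchd):
# print(raw_query("127.0.0.1", 8123,
#     "/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10"))
```

If this returns results while MediaWiki's Http::get does not, the problem is inside PHP/MediaWiki rather than the daemon or the network.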


 * Excellent! That first suggestion actually gave me valid results:

<(root@nen-tftp:/var/www/htdocs/wiki-svn06252008)> cd /var/www/htdocs/wiki-svn06252008
<(root@nen-tftp:/var/www/htdocs/wiki-svn06252008)> php maintenance/eval.php
> $sock = fsockopen('127.0.0.1', 8123); fwrite($sock, "GET /search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10 HTTP/1.0\r\nHost: localhost\r\n\r\n"); print fread($sock, 8192);
HTTP/1.1 200 OK
Content-Type: text/plain
1
1.0 0 Main_Page


 * Now I'm not 100% sure what you were asking me to do with the other suggestions involving the print function. I issued them while still in the same php maintenance/eval.php session, right after the above, without exiting, and they did not return anything (I just wasn't sure if that was what you meant for me to do):

> print Http::get('http://127.0.0.1:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10');
> print Http::get('http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10');
>


 * If I did that incorrectly, please tell me =)   I think I did what you requested, though, and it looks like, using Http::get with either 127.0.0.1 or localhost, MediaWiki reports back ZERO results =(   Anything else you want me to try? Again, I am totally here for anything you suggest! Thanks much. peace --Agentdcooper 01:44, 1 July 2008 (UTC)
 * Yeah, that is what I asked. OK, we can conclude from this that: 1) PHP can connect to Lucene properly, and 2) your HTTP fetch capabilities are broken. I'm not sure what we can do about it. The proper way is of course to fix the HTTP functions, but I don't know how we can do that. The other option is to write a new HTTP layer, which will surely work. Bryan 08:45, 1 July 2008 (UTC)


 * Thank you Bryan, you are a tireless helper and a saint!! I thank you for your assistance; it looks like you nailed it: at least PHP is working properly. I wonder if other users out there are running into this issue as well? I've tried multiple computers, yet my problem stayed the same (detailed above); the only thing my different computers have in common is the base Linux distro, Slackware. Yet I have gone through brand-new installs of every Slackware version from 11.x through the recently released stable 12.1. I think what you are saying is that it's not my OS that is the problem, but MediaWiki's HTTP layer? I highly doubt I am sophisticated enough to rewrite MediaWiki's HTTP layer/HTTP functions, so I may just, sadly, have to move away from my beloved MediaWiki... I was afraid of this.


 * Again, Bryan and everyone else who has helped me: you have been extremely selfless and helpful. I just don't know if I will ever run into someone of your caliber in the Drupal community (even though I know Drupal has a thriving FS/OSS community) the next time I run into, let's say, a Drupal problem; your knowledge has been instrumental in helping me understand the problem at hand. I hope the MediaWiki gurus read this, but I kind of doubt it....


 * Where to go from here, as far as hoping this gets fixed? I have no clue, but I hope that between your postings and mine here, someone will identify the root issue and come out with a fix, or at least a workaround. Thanks, MediaWiki team, for the good times and all the wiki pages you served for me over the years; it was a long run, but with a heavy heart I must move on! I will continue to watch this page and hope for a fix. Time will tell; I just hope it's not too far away. Peace --Agentdcooper 09:14, 1 July 2008 (UTC)

Answer
Hello, I have tried to read through the information you gave. My opinion: your Lucene is running right, because when you "manually" query your Lucene listener it returns "Main_Page". So my advice is not to change your Lucene conf. I think the problem is inside your MWSearch extension. Have you tried running a fresh 1.12 installation with only the MWSearch extension, pointed at your existing Lucene daemon? I think you need nothing more than the MWSearch extension, and no "ExtensionFunctions". I will try to get more information about your problem later on; I am focusing on your MWSearch extension.... --Bisato 06:24, 30 June 2008 (UTC)
 * Thanks for your suggestion, but unfortunately I have tried that many times in the past during my initial troubleshooting; I have a lot of info posted HERE. That page doesn't indicate that I was running 1.12.0, but I can assure you I have tried it many times with just Extension:MWSearch, and I get the exact same results I am posting on this page and the previous one. I have always obtained these results with MWSearch, no matter which OS version I run (Slackware 11.x through the current stable 12.1), and I have tried this on multiple Slackware servers, all with the SAME results. I think you're right: Lucene-search2 is working properly, and MWSearch is just not passing the proper parameters to MediaWiki. I'd love to troubleshoot and fix this, but I just don't know where to go next. Do you have any other suggestions, perchance?
 * Thanks to all who've helped! I hope we can get to the bottom of this! peace --Agentdcooper 10:26, 30 June 2008 (UTC)

Same issue?
Has there been any progress on this issue? I'm encountering what seems to be the same situation. Everything seems configured right and the lsearch daemon is starting up fine, but there are no search results. Here is a sample log for the search phrase "Council":

Start request GET /wiki/Special:Search?search=Council&go=Go
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_4; en-us) AppleWebKit/525.18 (KHTML, like Gecko) Version/3.1.2 Safari/525.20.1
Referer: http://www.[mydomain].com/wiki/Main_Page
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us
Accept-Encoding: gzip, deflate
Cookie: [myaccount]_wikidb_wiki_UserID=1; [myaccount]_wikidb_wiki_UserName=Admin; [myaccount]_wikidb_wiki_Token=9290158aeb7238330ad2858a8b0721ab; [myaccount]_wikidb_wiki__session=6ee14b007035d78aa840f18a2cb08f31
Connection: keep-alive
Host: www.[mydomain].com
Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
session_set_cookie_params: "0", "/", "", "", "1"
Zend Optimizer detected; skipping debug_backtrace for safety.
Unstubbing $wgMessageCache on call of $wgMessageCache->addMessages from unknown
Zend Optimizer detected; skipping debug_backtrace for safety.
Unstubbing $wgParser on call of $wgParser->setHook from unknown
MimeMagic::__construct: loading mime types from /home/[myaccount]/public_html/mediawiki/includes/mime.types
MimeMagic::__construct: loading mime info from /home/[myaccount]/public_html/mediawiki/includes/mime.info
Connecting to localhost [myaccount]_wikidb...
Connected
Fully initialised
Zend Optimizer detected; skipping debug_backtrace for safety.
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from unknown
Language::loadLocalisation: got localisation for en from source
Zend Optimizer detected; skipping debug_backtrace for safety.
Unstubbing $wgOut on call of $wgOut->setArticleRelated from unknown
Zend Optimizer detected; skipping debug_backtrace for safety.
Unstubbing $wgLang on call of $wgLang->getCode from unknown
Zend Optimizer detected; skipping debug_backtrace for safety.
Unstubbing $wgUser on call of $wgUser->getOption from unknown
Cache miss for user 1
Logged in from session
MessageCache::load: Loading en...
got from global cache
Zend Optimizer detected; skipping debug_backtrace for safety.
Zend Optimizer detected; skipping debug_backtrace for safety.
Zend Optimizer detected; skipping debug_backtrace for safety.
Zend Optimizer detected; skipping debug_backtrace for safety.
Zend Optimizer detected; skipping debug_backtrace for safety.
Fetching search data from http://192.168.0.1:8123/search/[myaccount]_wikidb/Council?namespaces=0&offset=0&limit=20&version=2&iwlimit=10
Http::request: GET http://192.168.0.1:8123/search/[myaccount]_wikidb/Council?namespaces=0&offset=0&limit=20&version=2&iwlimit=10
Zend Optimizer detected; skipping debug_backtrace for safety.
Zend Optimizer detected; skipping debug_backtrace for safety.
Zend Optimizer detected; skipping debug_backtrace for safety.
OutputPage::sendCacheControl: private caching; **
Request ended normally

Can anyone suggest a solution? Joezoo 20:42, 2 September 2008 (UTC)


 * I wish you the best of luck here, Joezoo. I have been troubleshooting this issue for what feels like a year now, without much luck. The users Bryan and Bisato were monumental in their assistance in helping me troubleshoot it. I think your best bet to find out whether you have the same issue I have had since MediaWiki v1.9.1 all the way through today is to follow Bryan's suggestions, which I can summarize here:


 * Open a terminal and change to the MediaWiki base directory (for me it was): cd /var/www/htdocs/wiki-svn06252008
 * Start a PHP debugging session: php maintenance/eval.php
 * See whether you get something with the following command: $sock = fsockopen('127.0.0.1', 8123); fwrite($sock, "GET /search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10 HTTP/1.0\r\nHost: localhost\r\n\r\n"); print fread($sock, 8192);
 * If that returns proper results, test whether MediaWiki works properly using: print Http::get('http://127.0.0.1:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10');
 * If that still works try: print Http::get('http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10');


 * Take Bryan's suggestions but apply them to your own MediaWiki install; you may need to go so far as enabling MediaWiki debugging to find the specific paths for troubleshooting, as I had to. The moral of my story is that I moved to Drupal, as the Lucene tool was essential for me and my MediaWiki project, but in all the time I've spent posting my problem here and in SEVERAL other places on the internet, I never found a solution. As Bryan pointed out, "The proper way is of course to fix the HTTP functions, but I don't know how we can do that. The other option is to write a new HTTP layer which will surely work." I believe he is right; I am just not the person to do such a thing, due to my inexperience in this area! I continue to watch this page to this day, hoping someone will come out with a solution, as a Lucene-based tool like Extension:Lucene-search+Extension:MWSearch is EXACTLY what I have always been looking for in a MediaWiki-based wiki. good luck + peace --Agentdcooper 00:41, 3 September 2008 (UTC)
 * Thanks Agentdcooper. Here's the thing: I've got LuceneSearch working now. I can see the queries as they occur in my terminal window. When I disable it in my LocalSettings file, searching for terms like "the" return 0 results (as the default MW search can't find "stop words"), and when I enable it again the search works. Since $wgLuceneHost is set to my server's IP address, I can actually use the address in the MW log to see the results over HTTP. I guess really, for me, the thing that's not working is $wgEnableMWSuggest -- the list of dropdown results that appear automatically as you type a search phrase. Anyone know what I'm doing wrong? --Joezoo 14:47, 3 September 2008 (UTC)
 * Never mind! I got it working! Turns out, when I updated from the last version of MediaWiki, I didn't copy the new contents of skins/common, which included the mwsuggest and new ajax files. But now it's working! --Joezoo 15:38, 3 September 2008 (UTC)
 * @Joezoo WOW! Very cool! You are giving me a glimmer of hope here =)
 * A question for you: are you using MediaWiki + the lucene-search 2 daemon + MWSearch in a single-host environment, i.e. are you running all the apps on the same server?
 * If yes, could I convince you to tell me which versions of the following software you are running?
 * The version of the lucene-search 2 daemon, the MWSearch extension, which operating system/distro + version you have, and, last but not least, what version of curl you have? With those answers I will try to duplicate your work and see if I can get it working myself; I've much wanted to get this whole setup running, and maybe by duplicating your setup/config I could do it, if you wouldn't mind answering a few little questions. peace --Agentdcooper 18:45, 3 September 2008 (UTC)
 * Not sure I can help, but I just sent you an email. --Joezoo 21:37, 3 September 2008 (UTC)

Solution
zaera 15:28, 27 July 2010 (UTC): If you run the Lucene indexer on the same host as MediaWiki, simply hardcode the server's IP address, not localhost (192.168.1.10 in the example below), in the MWSearch settings:

$wgSearchType = 'LuceneSearch';
$wgLuceneHost = '192.168.1.10';
$wgLucenePort = 8123;
require_once("extensions/MWSearch/MWSearch.php");

another solution from peter.sun@gmail.com

Although this is an old thread, I think my experience may help other people. I encountered the problem a few days ago. After looking carefully at the error messages lsearchd generated, I found this line:

1217 [Thread-2] FATAL org.wikimedia.lsearch.frontend.SearchServer - Error: bind error: Address already in use

I realized that lsearchd is a shell script which calls java to run a .jar file, so killing lsearchd isn't enough; you have to kill all the relevant java processes. If no other program uses java, a "killall java" is enough.
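Before restarting lsearchd, it can help to confirm whether a leftover process still holds the port, which is exactly the "Address already in use" symptom above. A small sketch of my own, assuming the default port 8123:

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something already listens on host:port, i.e. the
    "Address already in use" situation described above would occur."""
    try:
        with socket.create_connection((host, port), timeout=1):
            return True
    except OSError:
        return False

# Demonstrate against a throwaway listener on an ephemeral port.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))
srv.listen(1)
port = srv.getsockname()[1]
in_use = port_in_use(port)  # True while the listener is alive
srv.close()
```

In practice you would call port_in_use(8123) and, if it returns True with no daemon supposedly running, hunt down the stale java process.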

Possible solution
I had the same problem; however, in my case it was due to the fact that I'm behind a firewall and therefore have $wgHTTPProxy set. Since my Lucene server wasn't being recognized as a local URL, MediaWiki was attempting to use the proxy (which, needless to say, couldn't find it). I solved this by adding: $wgConf->localVHosts[] = 'wikey-local';

Unfortunately, there appears to be a bug in HttpFunctions.php: if the URL is local, it still tries to use localhost as a proxy! Commenting out the following line: curl_setopt( $c, CURLOPT_PROXY, 'localhost:80' ); made everything work like a charm. --Cmreigrut 19:24, 21 November 2008 (UTC)
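The fix described here amounts to proxy selection: a fetch should bypass $wgHTTPProxy when the target host is local or listed in $wgConf->localVHosts. An illustrative sketch of that decision logic (my own Python rendering of the intended behaviour, not MediaWiki's actual HttpFunctions.php code):

```python
def should_use_proxy(host, proxy, local_vhosts):
    """Decide whether an HTTP fetch should go through the proxy.

    Mirrors the behaviour described above: requests to localhost or to
    a host listed in $wgConf->localVHosts must bypass $wgHTTPProxy.
    The reported bug was that the "local" branch still set a
    'localhost:80' proxy instead of using no proxy at all.
    """
    if not proxy:
        return False  # no proxy configured at all
    if host in ("localhost", "127.0.0.1"):
        return False  # local URL, never proxy
    if host in local_vhosts:
        return False  # explicitly marked local virtual host
    return True
```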

I've had many of the same problems mentioned here, where the results from Lucene weren't reflected on the search page. For example, the "Did you mean" suggestions would not come up. I was able to trace the problem to Step 2 of the installation instructions, where I changed the line $wgLuceneHost = '192.168.0.1'; to: $wgLuceneHost = 'localhost'; Basically, my gateway is not 192.168.0.1, so I couldn't connect. I assume that just using 'localhost' is a more robust solution? If so, can we update the main extension page to reflect this? Mr3641 13:56, 10 March 2010 (UTC)

Indexer thread problem
Running MW 1.13 + MWSearch r36482 + Lucene-search 2.1.1 on Ubuntu.

I've been able to set up MWSearch for search and build a complete index, but the indexer thread doesn't update/delete pages from the index. After running MWSearchUpdater::updatePage or MWSearchUpdater::deletePage I get the following errors on the lsearchd debug screen:

232567977 [XML-RPC-0] ERROR org.apache.xmlrpc.server.XmlRpcStreamServer - execute: Error while performing request
org.apache.xmlrpc.server.XmlRpcNoSuchHandlerException: No such handler: searchupdater.updatePage
    at org.apache.xmlrpc.server.AbstractReflectiveHandlerMapping.getHandler(AbstractReflectiveHandlerMapping.java:195)
    at org.apache.xmlrpc.server.XmlRpcServerWorker.execute(XmlRpcServerWorker.java:42)
    at org.apache.xmlrpc.server.XmlRpcServer.execute(XmlRpcServer.java:83)
    at org.apache.xmlrpc.server.XmlRpcStreamServer.execute(XmlRpcStreamServer.java:182)
    at org.apache.xmlrpc.webserver.Connection.run(Connection.java:175)
    at org.apache.xmlrpc.util.ThreadPool$MyThread.runTask(ThreadPool.java:71)
    at org.apache.xmlrpc.util.ThreadPool$MyThread.run(ThreadPool.java:87)

Interestingly, some calls, such as MWSearchUpdater::getStatus, go through fine.

I'm downloading the source code to take a look, but maybe someone already has an idea of what's happening.

--Eugenem 06:05, 15 April 2009 (UTC)

Using fuzzy search by default
Hi there. Is there a way to use fuzzy search for all phrases typed into the search box? The fuzzy search feature is just so useful, I don't want to have to suffix all my searches with a "~". Thanks! Soonshort 06:17, 22 April 2009 (UTC)
 * Not out-of-the-box, but it should be easy to add an option to rewrite all the words in queries with ~. This would presumably be done somewhere in LuceneSearchSet::newFromQuery before the query is passed to the backend. --Rainman 09:24, 22 April 2009 (UTC)
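The rewrite Rainman describes could look roughly like this: suffix each bare query word with ~ before the query reaches the backend. A hypothetical sketch (the function name and details are my own, not actual MWSearch code); it deliberately leaves quoted phrases, boolean keywords, and already-suffixed terms alone:

```python
import re

def fuzzify(query):
    """Append ~ to each plain word so every term is searched fuzzily.

    Hypothetical illustration of the rewrite suggested above; the real
    change would live around LuceneSearchSet::newFromQuery. Quoted
    phrases, AND/OR/NOT keywords, and words already ending in ~ or *
    are left untouched.
    """
    def rewrite(match):
        token = match.group(0)
        if token.startswith('"') or token in ("AND", "OR", "NOT"):
            return token
        if token.endswith(('~', '*')):
            return token
        return token + '~'
    # Tokens are either quoted phrases or runs of non-space characters.
    return re.sub(r'"[^"]*"|\S+', rewrite, query)

print(fuzzify('hello world'))  # hello~ world~
```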

Using Dictionary
Is it possible to use a dictionary additional to the Lucene soundex function?
 * No, not at the moment. Dictionaries also typically do not contain word frequency information, which is needed for our spelling suggestions (to suggest more common words instead of obscure words the user probably didn't want to type). --Rainman 17:53, 15 September 2009 (UTC)

Splitting Lucene Search Functions
Has anyone ever tried to split the lsearchd (index provider) and the database export/indexing functions across multiple boxes?
 * Yes..? --Rainman 17:50, 15 September 2009 (UTC)
 * What were the details of the split? Did it improve speed? --Stringhamdb 11:08, 16 September 2009 (UTC)
 * No, any splitting reduces the speed of a single search, because multiple indexes need to be searched (searching is not done in parallel). However, more separate searches can obviously be handled in parallel this way. --Rainman 17:58, 16 September 2009 (UTC)

What subset of query syntax is supported?
Exactly what subset of Lucene's query syntax is supported within MediaWiki? I see that the * wildcard works, as do fuzzy searches with ~, but boolean AND, OR, and NOT seem to have no effect. Maiden taiwan 20:42, 17 September 2009 (UTC)
 * What exactly doesn't work with boolean queries? They seem fine to me Special:Search/mwsearch AND peanuts, Special:Search/mwsearch OR peanuts, Special:Search/mwsearch -lucene. --Rainman 22:20, 17 September 2009 (UTC)
 * Here is a test. I created three articles containing nonsense words.
 * Test1 contains "Frizzelschnitz, beemp."
 * Test2 contains "Frizzelschnitz."
 * Test3 contains "Beemp."
 * Look what happens when I search:
 * : returns hits for Test1, Test2 (good)
 * : returns hits for Test1, Test3 (good)
 * : returns a hit for Test1 (good)
 * : returns no hits (bad)
 * : returns no hits (bad)
 * : returns no hits (bad)
 * If I add the word "and" to Test1, then the last three searches all return Test1. Clearly, Lucene is matching the "and" literally, not as a boolean keyword. Lucene is definitely running because I see the "Did you mean..." at the top. Maiden taiwan 16:40, 18 September 2009 (UTC)
 * works as expected. I've been using the syntax listed at http://lucene.apache.org/java/2_1_0/queryparsersyntax.html. Maiden taiwan 16:42, 18 September 2009 (UTC)
 * and/or needs to be uppercase to be treated as a keyword. Also, lucene-search uses a custom parser, not the default Lucene one. --Rainman 19:45, 18 September 2009 (UTC)
 * I tried uppercase "AND" and "NOT" but got exactly the same results as above.... FYI,   and   are working so this problem seems limited to the booleans.
 * Where is the lucene-search "custom parser" syntax documented, so I can teach our users? I see bits of documentation in the OVERVIEW.txt file in lucene-search, section "Query Parser." Any other docs? Thanks. Maiden taiwan 20:20, 18 September 2009 (UTC)


 * Not sure what is wrong with your installation, but it works for us... --Rainman 04:28, 19 September 2009 (UTC)

Getting rid of "all:" in search results link?
I'm wondering if there is a fix or workaround for the following case. When Lucene suggests "Did you mean...", it sometimes prepends all: to the message:

Did you mean: all:Fonebone

When you click "all:Fonebone" it does another search, which is fine. But... on that second search page, the system message MediaWiki:Noexactmatch says:

There is no page titled "$1". You can create this page.

which comes out as:

There is no page titled "all:Fonebone". You can create this page.

This is the problem. Users who click the link will create "All:Fonebone" when they themselves never typed the word "All:", and really they want to create "Fonebone". This leads to bogus wiki pages being created with "All:" in their titles.

Is there a way to eliminate the "All:" so MediaWiki:Noexactmatch will work? You can't just strip off the namespace from $1 because the user might intentionally provide one.

--Maiden taiwan 20:29, 18 September 2009 (UTC)


 * Cannot reproduce ... If you manage to isolate the case in which this happens and provide steps to reproduce it on an independent installation, please submit a bug report. --Rainman 04:33, 19 September 2009 (UTC)


 * What is supposed to be the correct behavior of the "Did you mean..." feature when you search for (say) Foo?
 * Did you mean: all:foobar  (linked to
 * Did you mean: foobar  (linked to ...?)
 * something else?
 * Maiden taiwan 15:05, 21 September 2009 (UTC)


 * This is now filed as bug 20766. It happens only if the user has checked all of the namespace checkboxes in My Preferences ("Search" tab). Maiden taiwan 15:14, 22 September 2009 (UTC)

Why are existing pages appearing as red links instead of blue?
I've installed MWSearch with Lucene. Now in the search the pages that exist are shown with red links. Specifically, if I search for "abc" then the results will show e.g. "abcdef" with the "abc" part in red and the "def" part in blue. How can I get it all in blue, like on Wikipedia? --Robinson Weijman 11:30, 28 September 2009 (UTC)

incategory and transclusion?
Is the "incategory" keyword supposed to work if an article is assigned its category by transclusion? For example, if you have an article A that transcludes a template T, and template T contains the category assignment, should article A be found when you search for that category with incategory?

The answer seems to be no. Maiden taiwan 16:17, 14 December 2009 (UTC)

Getting search results
Hi - how can I see what people are searching for? And how can I measure how good the search is, e.g. the percentage of searches that result in a page match? --Robinson Weijman 14:50, 25 January 2010 (UTC)

Setting minimum word length for lucene search
How do I configure MediaWiki with a Lucene index to include 3-character words in the search? It seems to only find words of 4 or more letters; e.g. searching for "php" (http://bemoko.com/wiki/Special:Search?search=php&fulltext=search) gives no results, whereas searching for "live" gives results (http://bemoko.com/wiki/Special:Search?search=live&fulltext=search). Thx --Ian Homer 09:58, 2 February 2010 (UTC)


 * There is no such limit in lucene-search and this is very strange ... Can you see 3-character searches show up in the lucene-search log? --Rainman 11:32, 2 February 2010 (UTC)


 * Yes, the search term is in the logs; however, the 3-character search term seems to have "u800" appended, e.g. query:/search/wiki/cssu800. 4-character search terms come through as-is, e.g. query:/search/wiki/html ... (see logs below) ... strange --Ian Homer 12:13, 2 February 2010 (UTC)

1184845022 [pool-2-thread-43] INFO org.wikimedia.lsearch.frontend.HttpHandler - query:/search/wiki/cssu800?namespaces=0&offset=0&limit=20&version=2.1&iwlimit=10&searchall=0 what:search dbname:wiki term:cssu800
1184855961 [pool-2-thread-44] INFO org.wikimedia.lsearch.frontend.HttpHandler - query:/search/wiki/html?namespaces=0&offset=0&limit=20&version=2.1&iwlimit=10&searchall=0 what:search dbname:wiki term:html
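For what it's worth, the symptom in that log can be modeled simply: terms shorter than four characters reach the daemon with "u800" appended. This sketch only reconstructs the observed behaviour (it is not the actual MediaWiki/MWSearch code; per the reply in this thread, the padding is a workaround for MySQL's short-word limit that leaks into the Lucene query when extension and core versions mismatch):

```python
def rewrite_short_terms(term, min_len=4, pad="u800"):
    """Model the symptom in the log above: search terms shorter than
    min_len arrive at lsearchd with "u800" appended, so "css" becomes
    "cssu800" and finds nothing. Reconstruction of observed behaviour,
    not the actual extension code.
    """
    return term + pad if len(term) < min_len else term
```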


 * It might be that your MWSearch version is somehow incompatible with your MediaWiki. This looks like some hackish query rewriting put in as a workaround for MySQL's 3-character search limitation. When downloading MWSearch, use the "Download snapshot" option on Extension:MWSearch and make sure you pick the right MediaWiki version. --Rainman 13:57, 2 February 2010 (UTC)


 * Thanks - that sorted it. I must have downloaded the wrong version before.  Many thanks --Ian Homer 16:02, 2 February 2010 (UTC)

Fatal error with searching anything with colon
I am using MWSearch based on Lucene-Search 2.1.

Whenever I search for any keyword with a colon (e.g. searching for "ha:"), I get: Fatal error: Call to undefined method Language::getNamespaceAliases in /var/www/html/wiki/extensions/MWSearch/MWSearch_body.php on line 96

It's the same thing with searching anything like "all:something" and "main:something".

Any idea is appreciated. --Ross Xu 20:10, 10 February 2010 (UTC)


 * You need to download the MWSearch version that matches your MediaWiki version. Use the "Download snapshot" link on the extension page. --Rainman 11:43, 11 February 2010 (UTC)


 * That did the trick! I don't know why I installed the wrong version of MWSearch. Thanks a lot. --Ross Xu 16:39, 11 February 2010 (UTC)

Should I expect the search to do the following...
Hi, I'm doing some testing on the search functionality and appear to be having problems. If I do a search for tes (no * after the word), should I expect anything with the word test to appear? Running Lucene-Search 2.1 and MediaWiki 1.15.1 (with the appropriate MWSearch version for that wiki). Thanks a lot. --barramya 12:55, 12 February 2010 (UTC)
 * No. That kind of search is supported only by MySQL LIKE queries, which are terribly slow and give a high false-positive rate for any wiki larger than a couple of pages. --Rainman 13:39, 12 February 2010 (UTC)
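 * For intuition, here is a toy sketch (with a made-up term list) of the difference between the two query types: a plain token query like tes matches only the identical term, while a trailing wildcard like tes* does a prefix scan over the sorted term dictionary.

```python
import bisect

# Toy term dictionary standing in for the index's analyzed tokens.
terms = sorted(["team", "tes", "test", "testing", "text"])

def exact_match(query):
    """Plain token query: hits only the identical term."""
    return [t for t in terms if t == query]

def prefix_match(query):
    """Wildcard query 'query*': scan the sorted dictionary for the prefix range."""
    start = bisect.bisect_left(terms, query)
    hits = []
    for t in terms[start:]:
        if not t.startswith(query):
            break
        hits.append(t)
    return hits

print(exact_match("tes"))   # ['tes']
print(prefix_match("tes"))  # ['tes', 'test', 'testing']
```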

Multilanguage search in a farm?
I am creating a wiki farm for 5 different languages: en.mywiki.com, fr.mywiki.com, de.mywiki.com, etc. It's all on a single webserver and database, but each language uses a different database table prefix. What is the best way to use Lucene and MWSearch in this system?


 * Scenario 1: Each wiki can search only the content in its own tables. So the English wiki returns search results only from its own prefixed tables, the German wiki from its tables, etc., similar to wikipedia.org. How would we configure this? Does Lucene need to run 5 different index operations, or can it all be done with a single index?


 * Scenario 2: A search on any wiki returns results from all languages. So a single search index, shared among all the wikis. How would we configure this?

Thank you. Maiden taiwan 15:50, 26 May 2010 (UTC)


 * DB prefixes are not supported by lucene-search, and neither is searching across multiple wikis (except for limited interwiki title hits). One indexer/searcher can however handle many different wikis, as long as they are within different databases. --Rainman 17:30, 26 May 2010 (UTC)


 * Thank you. So if I understand correctly, we need to put each wiki into a separate database, then have a single Lucene instance index them all together. Correct?  Once we do that:
 * Is there a way for each wiki to see search results only from its own database, or will all database results be mixed together?
 * Search on wikipedia wouldn't work if all of the results would be mixed together. --Rainman 10:15, 27 May 2010 (UTC)
 * If there's no way to do the previous thing out-of-the-box, we could write an extension. In a multi-database setup, does lucene-search return information about which database (or which wiki) each hit came from, which our extension could utilize?
 * You need a database name to query the search index, so it is part of the query, and not returned results. --Rainman 10:15, 27 May 2010 (UTC)
 * Can you point me to an example of configuring a multi-database setup with lucene-search?
 * Thanks. Maiden taiwan 19:59, 26 May 2010 (UTC)
 * See the Wikimedia setup. --Rainman 10:15, 27 May 2010 (UTC)
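 * For reference, a rough sketch of how one lsearch-global.conf might cover several per-language databases with a single daemon. Database names, host name, and exact syntax here are illustrative; check the lsearch-global.conf shipped with lucene-search (and the Wikimedia configuration Rainman points to) for the authoritative format.

```
[Database]
enwiki : (single) (language,en)
frwiki : (single) (language,fr)
dewiki : (single) (language,de)

[Search-Group]
searchhost : enwiki, frwiki, dewiki

[Index]
searchhost : enwiki, frwiki, dewiki
```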

Search through multiple wiki databases
We have an internal and an external wiki. Both are completely independent MediaWiki installations.

What I want to do is make search results in the internal wiki include results from both the internal and the external wiki.

Is there any way to do that? If not, what would be the best way to implement it?

Thanks, Christoph --80.123.158.221 15:06, 11 November 2010 (UTC)


 * You could set up Apache Solr to index both of your MediaWiki instances. I have done that for a set of data sources at our company and it works quite well. The way I solved it was to create Dataimporthandlers for both MW instances, based on nightly XML dumps. I'm sure there are multiple ways of solving this, though! --zaera 20:32, 11 November 2010 (UTC)
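 * To sketch the dump-based approach zaera describes: the nightly XML export can be flattened into per-page field documents for Solr. The snippet below is a minimal illustration only; the field names ('title', 'text') are hypothetical Solr fields, and a real setup would use a DataImportHandler or Solr's update API as described above.

```python
import xml.etree.ElementTree as ET

def dump_to_solr_docs(dump_xml):
    """Flatten <page> entries of a MediaWiki XML export into field dicts.

    Tags are matched by local name, so the export schema's XML
    namespace does not matter.
    """
    def local(tag):
        return tag.rsplit('}', 1)[-1]

    docs = []
    for elem in ET.fromstring(dump_xml).iter():
        if local(elem.tag) != 'page':
            continue
        doc = {}
        for child in elem.iter():
            if local(child.tag) == 'title':
                doc['title'] = child.text
            elif local(child.tag) == 'text':
                doc['text'] = child.text or ''
        docs.append(doc)
    return docs
```

 * Each resulting dict could then be posted to Solr's update handler, once per wiki, giving a single index over both installations.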

Search in Categories
Hi, is it possible to search only in some selected categories? SBachenberg 08:19, 13 April 2011 (UTC)


 * You can, but it doesn't work well. You can add incategory:"Name" in the search box. This limits the search to articles that directly contain the category tag. However, if an article is categorized by transclusion (say, it transcludes a template that contains the category tag), the incategory: feature will not find it. This makes incategory: unreliable, since it returns incomplete results. This is my #1 issue with using Lucene/MWSearch, which are otherwise excellent tools. Maiden taiwan 13:53, 13 April 2011 (UTC)
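 * A toy illustration of that limitation: an indexer that scans only the raw wikitext of a page sees the directly typed category tag but not one added by a transcluded template. The template name and bodies below are made up.

```python
import re

# Hypothetical template store: this template adds a category on transclusion.
TEMPLATES = {"Infobox software": "infobox markup... [[Category:Software]]"}

def direct_categories(wikitext):
    """What an indexer sees if it only scans the raw page source."""
    return re.findall(r"\[\[Category:([^\]|]+)", wikitext)

def expanded_categories(wikitext):
    """Categories visible after (one level of) template expansion."""
    text = wikitext
    for name, body in TEMPLATES.items():
        text = text.replace("{{%s}}" % name, body)
    return re.findall(r"\[\[Category:([^\]|]+)", text)

page = "{{Infobox software}} Some article text. [[Category:Macros]]"
print(direct_categories(page))    # ['Macros'] -- transcluded category missed
print(expanded_categories(page))  # ['Software', 'Macros']
```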

link to page containing document
I am maintaining a wiki on a local intranet. Could you tell me whether MWSearch will show a link to the wiki page that contains an indexed document, or whether it only links to the indexed document itself.

Our current search will find a phrase in 'aaa.pdf' but when you click the link, you can only open 'aaa.pdf', you cannot see where 'aaa.pdf' is linked on the wiki.

Thanks.

Finding phrases in documents
How are you getting search to find a phrase in 'aaa.pdf'?

Problem with MWSearch / Lucene / dumpBackup.php
Hi,

I recently updated my wiki from version 1.17 to 1.19. Since then, I have problems with the generation of the search index for MWSearch: very often, and at random points after the Lucene build script calls dumpBackup.php, the database (MySQL) 'stalls', the script stops with "A database error has occurred." and the wiki no longer works. I have to restart the MySQL server to get the wiki running again.

Code:

 echo "Dumping $dbname..."
 cd $mediawiki && php maintenance/dumpBackup.php \
     $dbname \
     --conf $mediawiki/LocalSettings.php \
     --aconf $mediawiki/AdminSettings.php \
     --current \
     --server=$slave > $dumpfile

All extensions (incl. MWSearch) are up-to-date, the Lucene scripts are updated, I updated the database after the upgrade (update.php) and the database is fine, no errors reported by the appropriate tools.

Running dumpBackup.php (without slave and conf) from the command line works fine.

Thanks for any help, Udo
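 * If the failures turn out to be transient (e.g. the slave connection dropping), a retry wrapper around the dumpBackup.php call can at least keep the nightly build going; it does not fix an underlying MySQL stall. A minimal POSIX sh sketch:

```shell
#!/bin/sh
# retry N CMD ARGS...: run CMD up to N times, stopping at the first success.
# (Add a sleep/backoff between attempts if desired.)
retry() {
    n=$1; shift
    i=1
    while [ "$i" -le "$n" ]; do
        "$@" && return 0
        echo "attempt $i/$n failed" >&2
        i=$((i + 1))
    done
    return 1
}

# Example: wrap the dump call from the script above, e.g.
#   retry 3 php maintenance/dumpBackup.php --current ... > "$dumpfile"
```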

Wildcard is not working
Hello,

I have a problem with the wildcard search.

If I search for a word like "T2FormInfoPos302", the search returns one result. But if I search with a wildcard like "*Pos302", I don't get any results. Does anyone know a good solution?

Thanks a lot.
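 * Leading wildcards like *Pos302 are generally unsupported because expanding them would mean scanning the entire term dictionary. A common workaround in Lucene-based systems is to additionally index each token reversed, so a suffix query becomes a cheap prefix scan over the reversed field. A sketch (token list is made up; note the analyzer lowercases, so hits come back lowercase):

```python
import bisect

tokens = ["T2FormInfoPos302", "T2FormInfoPos303", "OtherToken"]

# Build a sorted "reversed" term dictionary alongside the normal one.
reversed_terms = sorted(t.lower()[::-1] for t in tokens)

def suffix_search(suffix):
    """Find tokens ending in `suffix` via a prefix scan on the reversed field."""
    key = suffix.lower()[::-1]
    start = bisect.bisect_left(reversed_terms, key)
    hits = []
    for rt in reversed_terms[start:]:
        if not rt.startswith(key):
            break
        hits.append(rt[::-1])
    return hits

print(suffix_search("Pos302"))  # ['t2forminfopos302']
```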

MWSearch / GSA7 integration
I work for a large company with 50+ internal websites. We support a number of different content types including MediaWiki, Oracle UCM, Jive and even Sharepoint, and our overall search mechanism is driven by GSA7. The engineering portion of our company is partial to MediaWiki, but its integration with GSA seems kind of weak, or at least we don't seem to find what we want easily. I am told we are upgrading to MediaWiki 1.19.1 with the MWSearch extension and was wondering if anyone could give me any ideas about how to make MWSearch and GSA play better together? Thanks

Support for multiple page sections in results
It appears that, when searching for a term that appears in multiple sections of a page, the section link for that page result points to the highest-matching section but ignores any other high-matching sections on the same page.

As an example, I have an internal wiki setup to act as a knowledgebase and, when searching for "Macro", the expected software page appears as the top result with a section link to the highest matching section, but there are a few other sections relating to other macro items on that page that are not mentioned or linked at all. I realize one could conceivably lump those sections together under a single "macro" heading in this case, but that seems to be an inherently limited solution that would not necessarily be appropriate in other similar instances.

Is there any way to have the top page result list multiple sections above a certain match threshold, capped at some certain count? Say, up to 3 section results with >90% match probability? Or even repeats of the page match, each with another high matching section link?

Seems like it could be a useful feature.

Timothy.russ (talk) 00:08, 19 February 2013 (UTC)
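 * The selection rule proposed above (up to 3 sections scoring over 90%) is straightforward to express; a sketch with made-up section scores:

```python
def top_sections(section_scores, threshold=0.9, cap=3):
    """Return up to `cap` (section, score) pairs above `threshold`, best first."""
    good = [(name, s) for name, s in section_scores.items() if s > threshold]
    good.sort(key=lambda pair: pair[1], reverse=True)
    return good[:cap]

scores = {"Macro basics": 0.97, "Recording macros": 0.95,
          "Macro security": 0.92, "History": 0.91, "See also": 0.40}
print(top_sections(scores))
# [('Macro basics', 0.97), ('Recording macros', 0.95), ('Macro security', 0.92)]
```

 * "History" also clears the threshold here but is dropped by the cap, which matches the proposed behavior of listing only the strongest few section links per page.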