Extension talk:MWSearch

What are the advantages/disadvantages of the MWSearch extension over the Lucene extension? Thank you.

MWSearch is being maintained and developed, while LuceneSearch is not. The differences are minor. MWSearch has a somewhat better and more consistent interface, e.g. it has galleries for Image namespace hits. --Rainman 15:01, 24 April 2008 (UTC)

Does MWSearch support Windows + Apache / Windows + IIS?
 * No, the search backend does not support Windows. --Rainman 13:52, 12 May 2008 (UTC)
 * That's not true; I just managed to compile it under Ubuntu and am now running it on a Windows 2003 server. --193.27.220.82 13:55, 28 May 2008 (UTC)

MWSearch over SSL?
Is it possible to use MWSearch over SSL? If so, is there a special configuration needed? (I ask because I've been unable to get it working and I'm not permitted to open additional ports.)
 * Sorry to answer my own question, but I was able to make this work by editing my hosts file to point the domain to 127.0.0.1 instead of the external IP. I don't really like this solution, so I'm hoping this isn't the preferred method. Does anyone have any suggestions?

MediaWiki_SVN+Lucene-Search2_SVN+MWSearch_SVN = ZERO search results
I've been trying for a very long time now to implement Lucene-search functionality on my MediaWiki site. I've spent hours, days, weeks, and months troubleshooting this very issue, and I'm giving it one last effort. I am hoping SOMEONE will be able to help me; otherwise I am thinking of moving away from MediaWiki entirely (which would make me a very sad panda) and painstakingly importing my MediaWiki into Drupal, then trying out their neat-sounding Drupal::Search_Attachments module. I have put over 1 year's worth of work into my MediaWiki-based wiki. I truly need the solution that Extension:Lucene-search+Extension:MWSearch offers, but it has been next to impossible for me to implement on my Slackware server. Who knows, my trouble could very well be something I am doing wrong! I'm human enough to admit it if that turns out to be the case... This is why I come for assistance. I am thinking of posting to the MediaWiki forums too, and linking here (hope that's OK), as I really, really want to get Extension:Lucene-search+Extension:MWSearch working! I have been trying for over 6 months now, and posted a lot of my previous issues HERE at the MediaWiki Lucene-Search Talk page, but was unable to find any solutions to my problem. I even tried the newer, more up-to-date Extension:Lucene-search+Extension:MWSearch setup, to no avail.
I am BACK, now with a brand new computer (well, it's actually older hardware, but with a newly formatted hard drive), a freshly installed OS, and a very basic website with some basic data input to test search functionality. I've gone over and over and OVER the directions on the Extension:Lucene-search and Extension:MWSearch pages, and I just cannot get this working properly on my box. Now that I have tried on a new install with new everything, I am convinced this is not a problem on my end, but I could be wrong. I have documented EVERYTHING I did from install to now, since I've been over this so many times; maybe by posting my logs and what I did from beginning to end, someone might "see something" I'm missing? Please HELP! =) This is my current overall system setup:
 * Slackware 12.1, on i686 Pentium III (Linux 2.6.24.5-smp = Slackware 12.1's generic-smp-2.6.24.5-smp kernel)
 * MediaWiki: 1.13alpha (SVN 06-25-2008)
 * PHP: 5.2.6 (I used Slackware 12.1's PHP v5.2.6 update package)
 * MySQL: 5.0.51b
 * MediaWiki Extension(s): MWSearch SVN 06-25-2008 and Lucene-search2 SVN 06-25-2008; I also downloaded and installed mwdumper.jar into the Lucene-search2 "lib" dir = /usr/local/search/ls2
 * other tools: jre-6u6-i586-3, jdk-1_5_0_09-i586-1, apache-ant-1.7.0-i486, rsync-3.0.2-i486-1

SVN install of MediaWiki
> svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/phase3
 * Checked out revision 36630
 * I moved everything to my htdocs folder using NEW directory = /var/www/htdocs/wiki-svn060108

> chmod a+x /var/www/htdocs/wiki-svn060108/config
 * I ran first time config via http:// /wiki-svn060108/config (configured it as follows) ;;


 * wikiname: NOC Archive
 * contact email: rprior@newedgenetworks.com
 * language: en - english
 * license:  GNU Free Documentation License 1.2 (Wikipedia-compatible)
 * admin username: rprior
 * admin password: xxxxxxxxx
 * wikiDB name: svnwikidb
 * DB username: svnwikiuser
 * DB password: xxxxxxxxx
 * database character set:  Experimental MySQL 4.1/5.0 UTF-8


 * I created the MySQL DB and gave myself permissions

> mysql -u root -p
mysql> create database svnwikidb character set utf8;
Query OK, 1 row affected (0.00 sec)

mysql> GRANT SELECT,INSERT,UPDATE,DELETE,CREATE,DROP
    -> ON svnwikidb.*
    -> TO 'svnwikiuser'@'localhost'
    -> IDENTIFIED BY 'xxxxxxxx';
Query OK, 0 rows affected (0.03 sec)

mysql> exit


 * moved /var/www/htdocs/wiki-svn060108/config/LocalSettings.php TO /var/www/htdocs/wiki-svn060108/

> chown root:apache LocalSettings.php
> chown 700 LocalSettings.php
> rm -r /var/www/htdocs/wiki-svn060108/config


 * pulled up my new wiki via the page http://nen-tftp.techiekb.com/wiki-svn06252008/index.php/Main_Page = and IT WORKS!


 * I put some basic data that I knew would be searchable on the front/1st page

Installation of LuceneSearch2+MWSearch extensions
> cd /var/www/htdocs/wiki-svn06252008/extensions
> svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/MWSearch

> ln -s /usr/lib/jdk1.5.0_09/lib/tools.jar /usr/lib/java/lib

> ls -al /usr/lib/java/lib/
lrwxrwxrwx 1 root root       34 Jun 25 03:19 tools.jar -> /usr/lib/jdk1.5.0_09/lib/tools.jar

> cd /tmp
> mkdir lucene-search-2
> cd lucene-search-2/
> svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/lucene-search-2/
> mkdir /usr/local/search
> mkdir /usr/local/search/ls2
> cd /usr/local/search/ls2
> mv /tmp/lucene-search-2/lucene-search-2/* ./
> cd /usr/local/search/ls2/lib
> wget http://download.wikimedia.org/tools/mwdumper.jar
> mkdir /usr/local/search/indexes
> cd /usr/local/search/ls2

lsearch.conf - my configuration file build
MWConfig.global=file:///etc/lsearch-global.conf
MWConfig.lib=/usr/local/search/ls2/lib
Indexes.path=/usr/local/search/indexes
Search.updateinterval=1
Search.updatedelay=0
Search.checkinterval=30
Index.snapshotinterval=5
Index.maxqueuecount=5000
Index.maxqueuetimeout=12
Storage.master=localhost
Storage.useSeparateDBs=false
Storage.defaultDB=lsearch
Storage.lib=/usr/local/search/ls2/sql
SearcherPool.size=3
Localization.url=file:///var/www/htdocs/wiki-svn06252008/languages/messages
OAI.username=user
OAI.password=pass
OAI.maxqueue=5000
Logging.logconfig=/etc/lsearch.log4j
Logging.debug=true
 * I created a symlink for /etc/lsearch.conf that points to the actual file = /usr/local/search/ls2/lsearch.conf

ln -s /usr/local/search/ls2/lsearch.conf /etc

/etc/lsearch.log4j - my configuration file build
log4j.rootLogger=INFO, A1
log4j.appender.A1=org.apache.log4j.ConsoleAppender
log4j.appender.A1.layout=org.apache.log4j.PatternLayout
log4j.appender.A1.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n

/etc/lsearch-global.conf - my configuration file build
[Database]
svnwikidb : (single) (language,en) (warmup,10)

[Search-Group]
nen-tftp : svnwikidb

[Index]
Database.suffix=wiki wiktionary svnwikidb
KeywordScoring.suffix=svnwikidb wiki wikilucene wikidev
ExactCase.suffix=svnwikidb wiktionary wikilucene

[Namespace-Prefix]
all :
[0] : 0
[1] : 1
[2] : 2
[3] : 3
[4] : 4
[5] : 5
[6] : 6
[7] : 7
[8] : 8
[9] : 9
[10] : 10
[11] : 11
[12] : 12
[13] : 13
[14] : 14
[15] : 15

built LuceneSearch.jar via ANT
> ln -s /opt/apache-ant/bin/ant /bin
> ant
Buildfile: build.xml

build:
    [mkdir] Created dir: /usr/local/search/ls2/bin
    [javac] Compiling 101 source files to /usr/local/search/ls2/bin
    [javac] Note: Some input files use or override a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] Note: Some input files use unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.

alljar:
    [jar] Building jar: /usr/local/search/ls2/LuceneSearch.jar

BUILD SUCCESSFUL
Total time: 24 seconds

LuceneSearch extension added to LocalSettings.php

 * I added the following to my /var/www/htdocs/wiki-svn060108/LocalSettings.php file ;

$wgSearchType = 'LuceneSearch';
$wgLuceneHost = 'localhost';
$wgLucenePort = 8123;
require_once("extensions/MWSearch/MWSearch.php");

created a dumpBackup.sh script to automate building of my index
php /var/www/htdocs/wiki-svn06252008/maintenance/dumpBackupInit.php --current --quiet > wikidb.xml && java -cp /usr/local/search/ls2/LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s wikidb.xml svnwikidb

> chmod 750 dumpBackup.sh

created file = dumpBackupInit.php

 * created file /var/www/htdocs/wiki-svn06252008/maintenance/dumpBackupInit.php with 755 permissions ;


# dumpBackupInit - Wrapper script to run the MediaWiki XML dump "dumpBackup.php" correctly
# @author: Stefan Furcht
# @version: 1.0
# @require: /srv/www/htdocs/wiki-svn06252008/maintenance/dumpBackup.php

# The following variables must be set to get dumpBackup.php to work;
# you'll find these values in the DB section of your MediaWiki config, LocalSettings.php
$wgDBtype           = "mysql";
$wgDBserver         = "localhost";
$wgDBname           = "svnwikidb";
$wgDBuser           = "svnwikiuser";
$wgDBpassword       = "xxxxxxxx";
$wgDBprefix         = "";
# $wgDBport           = "5432";

# The XML dumper 'dumpBackup.php' requires the variables above to be set in order to run;
# simply include the original dumpBackup script
 * I then, ran my "dumpBackup.sh" file via command-line

/srv/www/htdocs/wiki-svn06252008/dumpBackup.sh
 * This creates an XML dump of my Wiki DB in a file called wikidb.xml, which seems to work JUST FINE, the file is 3.6Kb, which is pretty small, since I don't have much in my BRAND NEW WIKI, just some text I know will be easily found when the search function is working properly.

starting the lucene-search2 daemon
I start the lucene-search2 daemon using this command-line ; /usr/local/search/lucene-search-2svn05112008/lsearchd
 * The program loads, and spits out some information to the console I am logged into <'some' text follows>

RMI registry started.
Trying config file at path /root/.lsearch.conf
Trying config file at path /var/www/htdocs/wiki-svn06252008/lsearch.conf
Trying config file at path /etc/lsearch.conf
log4j: Parsing for [root] with value=[INFO, A1].
log4j: Level token is [INFO].
log4j: Category root set to INFO
log4j: Parsing appender named "A1".
log4j: Parsing layout options for "A1".
log4j: Setting property [conversionPattern] to [%-4r [%t] %-5p %c %x - %m%n].
log4j: End of parsing for "A1".
log4j: Parsed "A1" options.
log4j: Finished configuring.
0   [main] INFO  org.wikimedia.lsearch.util.Localization  - Reading localization for En
2804 [main] INFO  org.wikimedia.lsearch.util.UnicodeDecomposer  - Loaded unicode decomposer
3068 [main] INFO org.wikimedia.lsearch.interoperability.RMIServer  - RMIMessenger bound
3351 [main] INFO org.wikimedia.lsearch.interoperability.RMIServer  - RemoteSearchable bound
3374 [Thread-2] INFO org.wikimedia.lsearch.frontend.HTTPIndexServer  - Started server at port 8321
3386 [Thread-3] INFO org.wikimedia.lsearch.frontend.SearchServer  - Binding server to port 8123
3407 [main] INFO org.wikimedia.lsearch.search.Warmup  - Warming up index svnwikidb ...
4737 [main] INFO org.wikimedia.lsearch.search.Warmup  - Warmed up svnwikidb in 1330 ms
4738 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index svnwikidb ...
5629 [main] INFO org.wikimedia.lsearch.search.Warmup  - Warmed up svnwikidb in 891 ms
5630 [main] INFO  org.wikimedia.lsearch.search.Warmup  - Warming up index svnwikidb ...
6203 [main] INFO org.wikimedia.lsearch.search.Warmup  - Warmed up svnwikidb in 573 ms

My MediaWiki Special:Version page

 * My wiki's Special:Version page indicates that PHP, MySQL, and MWSearch are all being properly initialized/recognized by MediaWiki;;

- INSTALLED SOFTWARE - INSTALLED EXTENSIONS
 * MediaWiki 1.13alpha
 * PHP 5.2.6 (apache2handler)
 * MySQL 5.0.51b
 * MWSearch (Version r36482) - MWSearch plugin - Brion Vibber and Kate Turner

The actual PROBLEM is NO SEARCH RESULTS
Now that I have everything set up and the Lucene-search2 daemon running, I tried to search on my website... Fingers crossed.... I type in a known word that IS on the front page, and is also in my XML dump of the MySQL DB (wikidb.xml) --- sure enough, I get ZERO SEARCH RESULTS!! This is what my MediaWiki search results page shows;
Search results
From AgentDcooper's Wiki
You searched for wiki
For more information about searching AgentDcooper's Wiki, see Searching AgentDcooper's Wiki.
Showing below 0 results starting with #1.
No page text matches
Note: Unsuccessful searches are often caused by searching for common words like "have" and "from", which are not indexed, or by specifying more than one search term (only pages containing all of the search terms will appear in the result).

Troubleshooting the ZERO results issue
Since I have a console session open with the lucene-search2 daemon running, I notice that AS SOON as I hit the SEARCH button after typing in my search phrase (loopback) in my MediaWiki search box, the lucene-search2 daemon console output scrolls the following;
893553 [pool-2-thread-1] INFO org.wikimedia.lsearch.frontend.HttpHandler  - query:/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10 what:search dbname:svnwikidb term:loopback
893567 [pool-2-thread-1] INFO org.wikimedia.lsearch.search.SearchEngine  - Using NamespaceFilterWrapper wrap: {0}
893592 [pool-2-thread-1] INFO org.wikimedia.lsearch.search.SearchEngine  - search svnwikidb: query=[loopback] parsed=[contents:loopback (title:loopback^6.0 stemtitle:loopback^2.0) (alttitle1:loopback^4.0 alttitle2:loopback^4.0 alttitle3:loopback^4.0) (keyword1:loopback^0.02 keyword2:loopback^0.01 keyword3:loopback^0.0066666664 keyword4:loopback^0.0050 keyword5:loopback^0.0039999997)] hit=[1] in 12ms using IndexSearcherMul:1214736931039


 * I've been troubleshooting this issue for a long time, so I do know how to enable MediaWiki debugging --- here is what my /var/log/mediawiki/debug_svn_log.txt shows ;;

Start request
GET /wiki-svn06252008/index.php/Special:Search?search=loopback&fulltext=Search
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://nen-tftp.techiekb.com/wiki-svn06252008/index.php/Special:Search?search=loopback&fulltext=Search
Cookie: wikidbToken=dd6c9b732dba0c94b04ad72044d46d79; wikidbUserName=Rprior; wikidbUserID=2; wikidb_session=5rph1dsoik5dpdlcitc1canlr0; svnwikidb_session=n8btqun31sn6vnubiek79l5br6; svnwikidbUserID=1; svnwikidbUserName=Rprior; svnwikidbToken=baea562c5be4148475a179c94a6868d4
Authorization: Basic ZGNvb3Blcjp0ZXN0cGFzcw==

Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
session_set_cookie_params: "0", "/", "", "", "1"
Fully initialised
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Language::loadLocalisation: got localisation for en from source
Unstubbing $wgOut on call of $wgOut->setArticleRelated from SpecialPage::setHeaders
Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey
Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
Unstubbing $wgUser on call of $wgUser->getOption from StubUserLang::_newObject
Cache miss for user 1
Connecting to localhost svnwikidb...
Connected
Logged in from session
MessageCache::load: Loading en... got from global cache
Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform
Fetching search data from http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10
Http::request: GET http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10
total [0] hits
OutputPage::sendCacheControl: private caching; **
Request ended normally


 * That is the part I am having the most trouble with, IMHO!
 * Follow me here.... Everything actually seems to work, up until the 3rd to last line in the debug! The part that doesn't appear to be working properly is the 3rd from the bottom line =  total [0] hits.
 * This is my reason, WHY I think that is the case, If I actually pull up the web address on my webserver via lynx or any other webbrowser =  http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10  I get the following output!

1 1.0 0 Main_Page


 * That output appears to be saying that THERE IS 1 page that matches!!! The page being Main_Page??!! Does that sound right?? I suspect something MUST be wrong here, it has been pointed out that it may be my CURL library, but that was on an earlier version of Slackware (12.0, I am now running 12.1) - according to my Slackware 12.1 (plain vanilla install, except I upgraded to newer version of PHP) the CURL version/package I am using is curl-7.16.2-i486-1. I don't suspect that CURL is my problem, but I am completely open to anyone's interpretation of my issue at hand, and would love to work with someone on this, and/or come up with a solution... I think it's working, just MWSearch is not passing the data properly to my MediaWiki search???

Does anyone have any ideas here? Please help me, I really don't want to move away from MediaWiki, but I very much need this functionality from MediaWiki! Thanks in advance + sorry to be longwinded, just wanted to ensure I give as much details as possible - if you have any questions, feel free to ask!
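
The manual check described above (fetching the backend URL directly and comparing it with the "total [0] hits" line in the debug log) can be scripted. The following is only a rough sketch: it assumes the backend answers in plain text with the total hit count on the first line and one "score namespace title" line per hit, which is what the "1 1.0 0 Main_Page" output suggests but is not confirmed anywhere in this thread. The function names are made up for illustration; the host, port and query string are copied from the debug log.

```shell
#!/bin/sh
# Sketch only. ASSUMPTION: the response is line-oriented, with the total
# hit count on line 1 and "score namespace title" on each following line.

# Print the total hit count (first line) of a response read from stdin.
lsearch_total_hits() {
    head -n 1 | tr -d '\r'
}

# Print the title (third field) of every hit line read from stdin.
lsearch_titles() {
    tail -n +2 | awk 'NF >= 3 { print $3 }'
}

# Fetch a raw search response from the local daemon.
# $1 = database name, $2 = search term (URL layout taken from the debug log).
lsearch_fetch() {
    curl -s "http://localhost:8123/search/$1/$2?namespaces=0&offset=0&limit=20&version=2&iwlimit=10"
}
```

For example, `lsearch_fetch svnwikidb loopback | lsearch_total_hits` prints whatever count the daemon itself reports. If that number is non-zero while MediaWiki logs "total [0] hits", the data is being lost somewhere between the daemon and the PHP side rather than in the index.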

PS :: I tried ExtensionFunctions.php SVN-06-25-2008
BTW, in my reading and troubleshooting, I saw something that said I should download the file ExtensionFunctions.php, so I pulled this file down from SVN-06-25-2008 ;;
> wget http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/ExtensionFunctions.php

> mv ExtensionFunctions.php /var/www/htdocs/wiki-svn06252008

This did not resolve my issue at all, still seeing the same problem. ANYONE HAVE ANY IDEAS?
 * If you add wfDebug(print_r($data, true)); in MWSearch_body.php file right after the $data = Http::get( $searchUrl ); line, does that give something useful in your debug log? Or does it give a null? 83.81.5.126 15:35, 29 June 2008 (UTC)
 * Also putting a wfDebug("Raw results [$totalHits]\n"); before $totalHits = intval( $totalHits ); might give useful results 83.81.5.126 15:38, 29 June 2008 (UTC)
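
With both wfDebug lines in place, the interesting entries are easy to lose among the other debug output. A small filter along these lines may help; it is a sketch that assumes the log entries look like the ones quoted in this thread ("Fetching search data", "Http::request", "Raw results [...]", "total [N] hits"), and the function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: filters a MediaWiki debug log read from stdin.
# ASSUMPTION: entry wording matches the log excerpts quoted in this thread.

# Keep only the entries that concern the search backend request.
search_debug_lines() {
    grep -E 'Fetching search data|Http::request|Raw results \[|total \[[0-9]+\] hits'
}

# Print what was logged between the brackets of the last "Raw results [...]".
raw_results_value() {
    sed -n 's/.*Raw results \[\([^]]*\)\].*/\1/p' | tail -n 1
}

# Print the number from the last "total [N] hits" entry.
total_hits_value() {
    sed -n 's/.*total \[\([0-9]*\)\] hits.*/\1/p' | tail -n 1
}
```

Typical use would be `search_debug_lines < /var/log/mediawiki/debug_svn_log.txt`. An empty value from `raw_results_value` means that nothing came back from Http::get, which points at the HTTP request rather than at the index.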


 * thanks for the help! so I tried both suggestions, but I obviously don't think I know how to understand the log, but it sure looks like it's put more info in there with your suggestions.


 * First I added wfDebug(print_r($data, true)); in the extensions/MWSearch/MWSearch_body.php file, right after the $data = Http::get( $searchUrl ); line
 * my debug is now outputting the following upon a search that shows no results in my MediaWiki ;;

Start request GET /wiki-svn06252008/index.php/Special:Search?search=loopback&fulltext=Search Host: nen-tftp.techiekb.com User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 Accept-Language: en-us,en;q=0.5 Accept-Encoding: gzip,deflate Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 Keep-Alive: 300 Connection: keep-alive Referer: http://nen-tftp.techiekb.com/wiki-svn06252008/index.php/Main_Page Main cache: FakeMemCachedClient Message cache: MediaWikiBagOStuff Parser cache: MediaWikiBagOStuff Fully initialised Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal Language::loadLocalisation: got localisation for en from source Unstubbing $wgOut on call of $wgOut->setArticleRelated from SpecialPage::setHeaders Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get Unstubbing $wgUser on call of $wgUser->getOption from StubUserLang::_newObject Connecting to localhost svnwikidb... IP: 24.20.24.50 Connected MessageCache::load: Loading en... 
got from global cache Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform Fetching search data from http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10 Http::request: GET http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10 total [0] hits OutputPage::sendCacheControl: private caching; ** Request ended normally Start request GET /wiki-svn06252008/index.php?title=Special:RecentChanges&feed=rss Host: nen-tftp.techiekb.com User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 Accept-Language: en-us,en;q=0.5 Accept-Encoding: gzip,deflate Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 Keep-Alive: 300 Connection: keep-alive Main cache: FakeMemCachedClient Message cache: MediaWikiBagOStuff Parser cache: MediaWikiBagOStuff Fully initialised Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal Language::loadLocalisation: got localisation for en from source Unstubbing $wgOut on call of $wgOut->setSquidMaxage from SpecialRecentChanges::execute Connecting to localhost svnwikidb... Connected Unstubbing $wgUser on call of $wgUser->getOption from OutputPage::checkLastModified OutputPage::checkLastModified: client did not send If-Modified-Since header Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get IP: 24.20.24.50 MessageCache::load: Loading en... 
got from global cache Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform Start request GET /wiki-svn06252008/index.php?title=Special:RecentChanges&feed=atom Host: nen-tftp.techiekb.com User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 Accept-Language: en-us,en;q=0.5 Accept-Encoding: gzip,deflate Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 Keep-Alive: 300 Connection: keep-alive Main cache: FakeMemCachedClient Message cache: MediaWikiBagOStuff Parser cache: MediaWikiBagOStuff Fully initialised Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal Language::loadLocalisation: got localisation for en from source RC: loading feed from cache (svnwikidb:rcfeed:rss:limit:50:minor:; 20080629225416; 20080629103756)... RC: Outputting cached feed OutputPage::sendCacheControl: private caching; Sun, 29 Jun 2008 10:37:56 GMT ** Unstubbing $wgOut on call of $wgOut->setSquidMaxage from SpecialRecentChanges::execute Connecting to localhost svnwikidb... Connected Unstubbing $wgUser on call of $wgUser->getOption from OutputPage::checkLastModified OutputPage::checkLastModified: client did not send If-Modified-Since header Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey Request ended normally Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get IP: 24.20.24.50 MessageCache::load: Loading en... got from global cache Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform RC: loading feed from cache (svnwikidb:rcfeed:atom:limit:50:minor:; 20080629225418; 20080629103756)... RC: Outputting cached feed OutputPage::sendCacheControl: private caching; Sun, 29 Jun 2008 10:37:56 GMT ** Request ended normally
 * I am going to try your 2nd suggestion now. Again, thanks for helping, it really means a lot to me! Agentdcooper


 * OK, this is what my debug log shows when I put wfDebug("Raw results [$totalHits]\n"); before $totalHits = intval( $totalHits ); in the extensions/MWSearch/MWSearch_body.php file while leaving your 1st suggestion in = wfDebug(print_r($data, true)); ;

Start request
GET /wiki-svn06252008/index.php/Special:Search?search=loopback&fulltext=Search
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://nen-tftp.techiekb.com/wiki-svn06252008/index.php/Special:Search?search=loopback&fulltext=Search
Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
Fully initialised
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Language::loadLocalisation: got localisation for en from source
Unstubbing $wgOut on call of $wgOut->setArticleRelated from SpecialPage::setHeaders
Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey
Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
Unstubbing $wgUser on call of $wgUser->getOption from StubUserLang::_newObject
Connecting to localhost svnwikidb...
IP: 24.20.24.50
Connected
MessageCache::load: Loading en... got from global cache
Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform
Fetching search data from http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10
Http::request: GET http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10
Raw results []
total [0] hits
OutputPage::sendCacheControl: private caching; **
Request ended normally

Start request GET /wiki-svn06252008/index.php?title=Special:RecentChanges&feed=rss Host: nen-tftp.techiekb.com User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 Accept-Language: en-us,en;q=0.5 Accept-Encoding: gzip,deflate Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 Keep-Alive: 300 Connection: keep-alive Main cache: FakeMemCachedClient Message cache: MediaWikiBagOStuff Parser cache: MediaWikiBagOStuff Fully initialised Start request GET /wiki-svn06252008/index.php?title=Special:RecentChanges&feed=atom Host: nen-tftp.techiekb.com User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0 Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8 Accept-Language: en-us,en;q=0.5 Accept-Encoding: gzip,deflate Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 Keep-Alive: 300 Connection: keep-alive Main cache: FakeMemCachedClient Message cache: MediaWikiBagOStuff Parser cache: MediaWikiBagOStuff Fully initialised Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal Language::loadLocalisation: got localisation for en from source Language::loadLocalisation: got localisation for en from source Unstubbing $wgOut on call of $wgOut->setSquidMaxage from SpecialRecentChanges::execute Connecting to localhost svnwikidb... Unstubbing $wgOut on call of $wgOut->setSquidMaxage from SpecialRecentChanges::execute Connecting to localhost svnwikidb... 
Connected Unstubbing $wgUser on call of $wgUser->getOption from OutputPage::checkLastModified Connected Unstubbing $wgUser on call of $wgUser->getOption from OutputPage::checkLastModified OutputPage::checkLastModified: client did not send If-Modified-Since header Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey OutputPage::checkLastModified: client did not send If-Modified-Since header Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get IP: 24.20.24.50 MessageCache::load: Loading en... got from global cache Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform IP: 24.20.24.50 RC: loading feed from cache (svnwikidb:rcfeed:rss:limit:50:minor:; 20080629225416; 20080629103756)... RC: Outputting cached feed OutputPage::sendCacheControl: private caching; Sun, 29 Jun 2008 10:37:56 GMT ** MessageCache::load: Loading en... got from global cache Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform Request ended normally RC: loading feed from cache (svnwikidb:rcfeed:atom:limit:50:minor:; 20080629225418; 20080629103756)... RC: Outputting cached feed OutputPage::sendCacheControl: private caching; Sun, 29 Jun 2008 10:37:56 GMT ** Request ended normally


 * I hope that this helps. To me it looks like the "Raw results" are nada/zero; why would that be? I would love to know how to correct this if possible! Any assistance you can provide is more than helpful! --Agentdcooper 01:02, 30 June 2008 (UTC)
 * I definitely think there is something wrong with Http::get. More specifically, I think MediaWiki is doing something wrong with proxies... Try to disable CURL by replacing if ( function_exists( 'curl_init' ) ) { in includes/HttpFunctions.php with if ( function_exists( 'curl_init' ) && false) {.
 * The problem is probably due to the fact that you are running Lucene on localhost, while Wikimedia uses Lucene on foreign hosts and there might be a bug with HttpFunctions on localhost, but not on port 80. 83.81.5.126 18:09, 30 June 2008 (UTC)


 * Thank you for sticking around to help (whoever u are!) I am thrilled to see you still here to help me!!! I think you are dead-on. I was thinking it might be something with the MWSearch extension, but by the sounds of it u are thinking it's an issue with the way MediaWiki uses Http::get? I like where yr going, and I am willing to do anything you suggest to try to isolate this, as I really, really want this to work, more than anything!


 * As u suggested, in file /var/www/htdocs/wiki-svn06252008/includes/HttpFunctions.php I changed line 25 FROM ;

if ( function_exists( 'curl_init' ) ) {
 * TO ;

if ( function_exists( 'curl_init' ) && false) {
 * I then saved the updated file (and made a backup of the original), did another update/import of my wiki DB (php maintenance/dumpBackupInit.php --current --quiet > wikidb.xml && java -cp /usr/local/search/ls2/LuceneSearch.jar org.wikimedia.lsearch.importer.Importer -s wikidb.xml svnwikidb  which, BTW, works just fine). From here I went to my wiki main page, and ran a search for a word that is on the Main_Page 3 times, and also is in my XML dump of my wiki DB too, 3 times.... The search came up with this error (again, soo sad) ;

You searched for loopback For more information about searching NOC Information Archive, see Help. No page text matches


 * This is what the Lucene-Search-2 daemon (lsearchd) console scrolled by after running my search ;

1384087 [pool-2-thread-6] INFO org.wikimedia.lsearch.frontend.HttpHandler  - query:/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10 what:search dbname:svnwikidb term:loopback
1384089 [pool-2-thread-6] INFO org.wikimedia.lsearch.search.SearchEngine  - Using NamespaceFilterWrapper wrap: {0}
1384093 [pool-2-thread-6] INFO org.wikimedia.lsearch.search.SearchEngine  - search svnwikidb: query=[loopback] parsed=[contents:loopback (title:loopback^6.0 stemtitle:loopback^2.0) (alttitle1:loopback^4.0 alttitle2:loopback^4.0 alttitle3:loopback^4.0) (keyword1:loopback^0.02 keyword2:loopback^0.01 keyword3:loopback^0.0066666664 keyword4:loopback^0.0050 keyword5:loopback^0.0039999997)] hit=[1] in 4ms using IndexSearcherMul:1214853636412


 * I left in place the debug settings you suggested yesterday (or the day before?), so this is what my /var/log/mediawiki/debug_svn_log.txt shows after I ran my search ;

Start request
GET /wiki-svn06252008/index.php/Special:Search?search=loopback&fulltext=Search
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://nen-tftp.techiekb.com/wiki-svn06252008/index.php?title=Special%3ASearch&search=all%3Aloopback&ns0=1&fulltext=Search
Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
Fully initialised
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Language::loadLocalisation: got localisation for en from source
Unstubbing $wgOut on call of $wgOut->setArticleRelated from SpecialPage::setHeaders
Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey
Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
Unstubbing $wgUser on call of $wgUser->getOption from StubUserLang::_newObject
Connecting to localhost svnwikidb...
IP: 24.20.24.50
Connected
MessageCache::load: Loading en... got from global cache
Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform
Fetching search data from http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10
Http::request: GET http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10
Raw results [] total [0] hits
OutputPage::sendCacheControl: private caching; **
Request ended normally
Start request
GET /wiki-svn06252008/index.php?title=Special:RecentChanges&feed=rss
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
Fully initialised
Start request
GET /wiki-svn06252008/index.php?title=Special:RecentChanges&feed=atom
Host: nen-tftp.techiekb.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9) Gecko/2008052906 Firefox/3.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Main cache: FakeMemCachedClient
Main cache: FakeMemCachedClient
Message cache: MediaWikiBagOStuff
Parser cache: MediaWikiBagOStuff
Fully initialised
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Unstubbing $wgContLang on call of $wgContLang->checkTitleEncoding from WebRequest::getGPCVal
Language::loadLocalisation: got localisation for en from source
Language::loadLocalisation: got localisation for en from source
Unstubbing $wgOut on call of $wgOut->setSquidMaxage from SpecialRecentChanges::execute
Connecting to localhost svnwikidb...
Unstubbing $wgOut on call of $wgOut->setSquidMaxage from SpecialRecentChanges::execute
Connecting to localhost svnwikidb...
Connected
Unstubbing $wgUser on call of $wgUser->getOption from OutputPage::checkLastModified
Connected
Unstubbing $wgUser on call of $wgUser->getOption from OutputPage::checkLastModified
OutputPage::checkLastModified: client did not send If-Modified-Since header
Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey
OutputPage::checkLastModified: client did not send If-Modified-Since header
Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
Unstubbing $wgMessageCache on call of $wgMessageCache->get from wfMsgGetKey
IP: 24.20.24.50
MessageCache::load: Loading en... got from global cache
Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform
Unstubbing $wgLang on call of $wgLang->getCode from MessageCache::get
IP: 24.20.24.50
RC: loading feed from cache (svnwikidb:rcfeed:rss:limit:50:minor:; 20080629225416; 20080629103756)...
RC: Outputting cached feed
OutputPage::sendCacheControl: private caching; Sun, 29 Jun 2008 10:37:56 GMT **
MessageCache::load: Loading en... got from global cache
Unstubbing $wgParser on call of $wgParser->firstCallInit from MessageCache::transform
Request ended normally
RC: loading feed from cache (svnwikidb:rcfeed:atom:limit:50:minor:; 20080629225418; 20080629103756)...
RC: Outputting cached feed
OutputPage::sendCacheControl: private caching; Sun, 29 Jun 2008 10:37:56 GMT **
Request ended normally


 * So again, I am able to pull up the URL listed in the debug file (on the line beginning with "Http::request: GET"): http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10 . This returns the following, which sure seems to show that at least THAT part of the Lucene search is working ;

1 1.0 0 Main_Page


 * I really think you are definitely onto something here, I am just not sure where to go next... Any more assistance you can give is much appreciated. I think others may run into this same issue, so who knows, maybe you helped identify a bug?! Thanks again! --Agentdcooper 19:58, 30 June 2008 (UTC)
 * (The IP was me.) OK, this is really strange. At this point I'm very sure that something is going wrong with MediaWiki's HTTP fetching capabilities, I just don't know what. Here are two more ways to test this (you don't need to reinstall anything):


 * Open a terminal and change the directory to your wiki base: cd /var/www/htdocs/wiki-svn06252008
 * Start a PHP debugging session: php maintenance/eval.php
 * See whether you get something with the following command: $sock = fsockopen('127.0.0.1', 8123); fwrite($sock, "GET /search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10 HTTP/1.0\r\nHost: localhost\r\n\r\n"); print fread($sock, 8192);
 * If that returns proper results, test whether MediaWiki works properly using: print Http::get('http://127.0.0.1:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10');
 * If that still works try: print Http::get('http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10');
 * I hope we get something out of that. Bryan 20:52, 30 June 2008 (UTC)
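
The first check above (a hand-written request over a plain socket) can be wrapped in a small standalone script so it is easy to rerun. This is only a sketch, assuming PHP 7.1 or later. The function names build_get_request, split_response and fetch_raw are made up for illustration and are not part of MediaWiki or MWSearch.

```php
<?php
// Build a minimal HTTP/1.0 GET request. With HTTP/1.0 the server is expected
// to close the connection after the response, so reading until EOF is enough.
function build_get_request(string $path, string $hostHeader): string {
    return "GET $path HTTP/1.0\r\nHost: $hostHeader\r\n\r\n";
}

// Split a raw response into header block and body at the first blank line.
// Both CRLF and bare LF line endings are accepted.
function split_response(string $raw): array {
    $parts = preg_split('/\r?\n\r?\n/', $raw, 2);
    return ['headers' => $parts[0], 'body' => $parts[1] ?? ''];
}

// Fetch a path over a plain socket, bypassing cURL and MediaWiki's Http class.
// Returns the split response, or null if the connection could not be opened.
function fetch_raw(string $ip, int $port, string $path,
                   string $hostHeader = 'localhost', float $timeout = 5.0): ?array {
    $sock = @fsockopen($ip, $port, $errno, $errstr, $timeout);
    if ($sock === false) {
        return null;
    }
    stream_set_timeout($sock, (int) ceil($timeout));
    fwrite($sock, build_get_request($path, $hostHeader));
    $raw = '';
    while (!feof($sock)) {
        $chunk = fread($sock, 8192);
        if ($chunk === false || $chunk === '') {
            break; // read error or timeout: stop rather than loop forever
        }
        $raw .= $chunk;
    }
    fclose($sock);
    return split_response($raw);
}
```

From php maintenance/eval.php or a plain PHP script, a call such as print_r(fetch_raw('127.0.0.1', 8123, '/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10')); should show the same headers and body as the one-liner, and the body can then be compared with what Http::get returns for the same URL.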


 * Excellent! That first suggestion seemed to actually give me valid results!;

<(root@nen-tftp:/var/www/htdocs/wiki-svn06252008)> cd /var/www/htdocs/wiki-svn06252008
<(root@nen-tftp:/var/www/htdocs/wiki-svn06252008)> php maintenance/eval.php
> $sock = fsockopen('127.0.0.1', 8123); fwrite($sock, "GET /search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10 HTTP/1.0\r\nHost: localhost\r\n\r\n"); print fread($sock, 8192);
HTTP/1.1 200 OK
Content-Type: text/plain
1 1.0 0 Main_Page


 * Now, I'm not 100% sure what you are asking me to do with the other suggestions involving the print function. After issuing the above I stayed in the same php maintenance/eval.php session and continued with the following, which did not return anything (I just wasn't sure if that was what you meant for me to do...) ;

> print Http::get('http://127.0.0.1:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10');
> print Http::get('http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10');
>


 * If I did that incorrectly, please tell me =)   I think I did what you requested, though, and it looks like when using Http::get with either 127.0.0.1 or just localhost, MediaWiki reports back ZERO results =(   If there is anything else you want me to try, I am totally here for anything you suggest! Thanks much. peace --Agentdcooper 01:44, 1 July 2008 (UTC)
 * Yeah that is what I asked. Ok we can conclude from this that: 1) PHP can connect to Lucene properly and 2) Your HTTP fetch capabilities are broken. I'm not sure what we can do about it. The proper way is of course to fix the HTTP functions, but I don't know how we can do that. The other option is to write a new HTTP layer which will surely work. Bryan 08:45, 1 July 2008 (UTC)
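
One way to narrow down which fetch mechanism is failing, before writing a whole new HTTP layer, is to try several of PHP's standard transports against the same URL and see which of them return a body. The following is a rough sketch under stated assumptions: PHP 7.1 or later, and first_working_transport and default_transports are hypothetical helper names, not MediaWiki functions. It only covers cURL and the file_get_contents URL wrapper.

```php
<?php
// Try each transport in order and return the first one that yields a
// non-empty string body. $transports maps a label to a callable that takes
// a URL and returns the body as a string, or false/'' on failure.
function first_working_transport(array $transports, string $url): ?array {
    foreach ($transports as $name => $fetch) {
        $body = $fetch($url);
        if (is_string($body) && $body !== '') {
            return ['transport' => $name, 'body' => $body];
        }
    }
    return null; // nothing returned a usable body
}

// Candidate transports built only from standard PHP functions.
// Which ones are available depends on how PHP was compiled and configured.
function default_transports(): array {
    $transports = [];
    if (function_exists('curl_init')) {
        $transports['curl'] = function (string $url) {
            $ch = curl_init($url);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            curl_setopt($ch, CURLOPT_TIMEOUT, 5);
            return curl_exec($ch); // string on success, false on failure
        };
    }
    if (filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN)) {
        $transports['file_get_contents'] = function (string $url) {
            return @file_get_contents($url); // false on failure
        };
    }
    return $transports;
}
```

Running var_dump(first_working_transport(default_transports(), 'http://localhost:8123/search/svnwikidb/loopback?namespaces=0&offset=0&limit=20&version=2&iwlimit=10')); on the wiki server would show which transport, if any, gets the same body as the raw socket test. A null result would suggest the problem lies in the PHP build or configuration rather than in MWSearch itself.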


 * Thank you Bryan - you are a tireless helper and a saint!! I thank you for your assistance, it looks like you nailed it --- at least PHP is working properly. I wonder if other users out there are running into this issue as well? I've tried multiple computers, yet my problem stayed the same (all detailed above) --- the only thing my different computers had in common was/is my base Linux distro = Slackware. Yet, I have gone through brand new installs of each Slackware version from 11.x through the recently released stable Slackware v12.1 - I think what you are saying is that it's not my OS that is the problem, but MediaWiki's HTTP layer? I highly doubt I am sophisticated enough to write a new HTTP layer/HTTP functions for MediaWiki, so I may just sadly have to move away from my beloved MediaWiki... I was afraid of this.


 * Again, Bryan and everyone else who has helped me, you have been extremely selfless and helpful to me -- I just don't know if I will ever run into someone of your caliber in the Drupal community (even though I know Drupal does have a thriving FS/OSS community) the next time I run into (let's say) a Drupal problem - as your knowledge has been instrumental in helping me understand the problem at hand here. I hope the MediaWiki gurus read this, but I kinda doubt it....


 * Where to go from here, as far as hoping this gets fixed = I have no clue, but I hope between your and my postings here, someone will identify the source issue, and hopefully come out with a fix, or at least a work-around - but I have no clue when that'll be... Thanks MediaWiki team for the good times and all the wiki pages you served for me over the years - it was a long run, but with a heavy heart I must move on! I will continue to watch this page, and hope for a fix, time will tell, I just hope it's not too far away. Peace --Agentdcooper 09:14, 1 July 2008 (UTC)

Answer
Hello, I have tried to read through the bunch of information you gave. My opinion: your Lucene is running right, because when you "manually" query your Lucene listener it returns "MAIN_PAGE". So my advice is that you don't change the configuration of your Lucene. I think the problem is inside your MWSearch extension. Have you tried to run a fresh 1.12 installation with only the MWSearch extension to access your existing Lucene daemon? I think you need nothing more than the MWSearch extension and no "ExtensionFunctions". Later on I will try to get more information about your problem. I am focusing on your MWSearch extension.... --Bisato 06:24, 30 June 2008 (UTC)
 * Thanks for your suggestion, but unfortunately I have tried that many times in the past when initially troubleshooting; I have a lot of info posted HERE. Now that page doesn't indicate that I was running 1.12.0, but I can assure you I have tried it many times with just Extension:MWSearch, and I get the exact same results I am posting on this page and the previous page. I have always obtained these results using MWSearch, no matter what OS version I run (Slackware 11.x through the current stable version 12.1). I have tried this out on multiple Slackware servers, all obtaining the SAME results. I think you're right, I think Lucene-search2 is working properly, just MWSearch is not passing the proper parameters to MediaWiki. I'd love to troubleshoot and fix this, but I just don't know where to go next. Do you have any other suggestions, perchance?
 * Thanks to all who've helped! I hope we can get to the bottom of this! peace --Agentdcooper 10:26, 30 June 2008 (UTC)