Extension talk:Zend Search Lucene for MediaWiki

From MediaWiki.org

Would you please correct the link "In depth description of Zend Search Lucene for MediaWiki (German)"? I hope to find some hints on configuring a Windows server. Thank you!

62.225.157.13012:24, 17 May 2013

Fixed, thank you. My changes will be displayed to readers once an authorized user accepts them.

Steviex2 (talk)12:38, 17 May 2013
 

No search result. <Solved>

Edited by author.
Last edit: 03:20, 29 April 2013

Dear all,

When I search for a topic that I created to test the search function, I get
no search results. I tried changing the installation path and the path to my MediaWiki,
then ran /usr/bin/php -c /etc/php.ini /usr/local/search/ZSearch/PslZendSearchLuceneIndexer.php
and ended up with these two errors:
PHP Notice: Undefined variable: wikisArray in /usr/local/search/ZSearch/PslZendSearchLuceneIndexer.php on line 34
PHP Warning: Invalid argument supplied for foreach() in /usr/local/search/ZSearch/PslZendSearchLuceneIndexer.php on line 34
Line 34 is foreach($wikisArray as $key => $val): which I think is the main cause of this problem.
I tried changing the paths in PslZendSearchLuceneIndexerConfig.php but can't seem to solve it.
Does anyone have a hint on how to fix this,
or has anyone had a similar problem and can point me in the right direction?

Thanks in advance.

Regards Nick

Nick1092 (talk)02:44, 29 April 2013

wikisArray is a config variable where you define your settings/wikis. You should fill it with your own data. Scroll through the file to see where it is set. "ps aux|grep apache" shows your current Apache user on Linux, but I don't think this is the problem.

cheers.

Steviex2 (talk)07:11, 29 April 2013
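For readers hitting the same pair of messages: the notice and the warning have one cause, namely that PHP reaches the foreach before $wikisArray has been assigned. A minimal sketch of a guard follows, assuming a hypothetical helper name (requireWikis is not part of the extension) and a placeholder index name:

```php
<?php
// Hypothetical helper, not part of PslZendSearchLuceneIndexer.php.
// Returns the configured wikis, or throws if the config never set them.
function requireWikis($wikisArray)
{
    if (!is_array($wikisArray) || count($wikisArray) === 0) {
        throw new RuntimeException(
            'No wikis configured: check that $wikisArray[0] is set in the indexer config.'
        );
    }
    return $wikisArray;
}

// Placeholder config entry for illustration only.
$wikisArray = array();
$wikisArray[0]['indexName'] = 'my_wiki';

// The loop now fails loudly instead of emitting a notice and a warning.
foreach (requireWikis($wikisArray) as $key => $val) {
    echo $key . ' => ' . $val['indexName'] . PHP_EOL;
}
```

A failing guard points to the config file (or the condition wrapped around the $wikisArray block) rather than to the indexer itself.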

Dear Steviex,

At the moment I just assume that I filled in the wrong details. Here is what I currently have in the config file:

/* This part is for you! Configure multiple Wiki-instances if needed, see examples below */
 
    $GuiFlag                            = 0;
 
    $wikisArray[0]['xmlSource']         = "/var/www/zend/sources/internal__current.xml";
    $wikisArray[0]['indexName']         = "my_wiki";  
    $wikisArray[0]['maintenanceScript'] = "/var/www/mediawiki/maintenance/dumpBackup.php";
    $wikisArray[0]['mediaDir']          = "/var/www/mediawiki/images/";
 
    #$wikisArray[1]['xmlSource']         = "/var/www/zend/sources/sysdoc_current.xml";
    #$wikisArray[1]['indexName']         = "wikidb_sysdoc";
    #$wikisArray[1]['maintenanceScript'] = "/var/www/mediawiki/maintenance/dumpBackup.php";
    #$wikisArray[1]['mediaDir']          = "/var/www/mediawiki/images/";
    #[...]
 
endif;
 
@preg_match_all("/(Windows)(.*?)/", $_SERVER['OS'], $matched, PREG_SET_ORDER);
/* an index dir above web root */
$indexDirName                   = "psl_search_indexes";
$PhpExecutionStringUnix         = "/usr/bin/php -c /etc/php.ini";
$PhpExecutionStringWindows      = "c:\\xampp\\php\\php.exe ";
$email                          = "";
/* file formats which will be indexed */
$additionalFileFormatsArray     = array('pdf','docx','xlsx','pptx','sql','vnd','txt','xml','xmlx','csv');
 
if(count($matched) > 0 ):
/* modify this to fit your needs, if you are on windows */
    $webServerUser              = "";
     $webServerUserGroup         = "";
    $zendFrameworkLibraryPath   = "C:\\xampp\\htdocs/ZF/library";
    $zendLogPath                = "C:\\xampp\\htdocs\\".$indexDirName."\\";
    $applicationPath            = "C:\\xampp\\htdocs";
else:
/* modify this to fit your needs, if you are on unix */
    $webServerUser              = "apache";
    $webServerUserGroup         = "apache";
    $zendFrameworkLibraryPath   = "/var/www/zend/ZendFramework-1.11.15/library";
    $zendLogPath                = "/var/www/zend/".$indexDirName."/";
    $applicationPath            = "/var/www/zend/";
endif;
?>

I assume the indexname means my database name?

Nick1092 (talk)07:52, 29 April 2013
 


Hello,

Yes, indexName is the Lucene index name as well as the DB name. Keep in mind that the first array index ("0", "1" and so on) identifies one dedicated wiki, so if you need it for a single wiki you only have to fill $wikisArray[0]. If you need ZSL for 10 wikis you would have to fill $wikisArray[0] through $wikisArray[9]. Do not use $wikisArray[1] with wikidb_sysdoc etc.; it is only dummy configuration data to show that the engine can be used with multiple wiki instances.

cheers.

Steviex2 (talk)09:04, 29 April 2013
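The one-entry-per-wiki layout described above can be sanity-checked before running the indexer. This is only a sketch: findMissingKeys is a hypothetical helper, and the list of required keys is taken from the config excerpts in this thread, not from the extension's documentation.

```php
<?php
// Keys that appear in the config excerpts posted in this thread.
$requiredKeys = array('xmlSource', 'indexName', 'maintenanceScript', 'mediaDir');

// Hypothetical helper: returns the missing keys for each wiki index.
function findMissingKeys(array $wikisArray, array $requiredKeys)
{
    $missing = array();
    foreach ($wikisArray as $index => $wiki) {
        $absent = array_values(array_diff($requiredKeys, array_keys($wiki)));
        if (count($absent) > 0) {
            $missing[$index] = $absent;
        }
    }
    return $missing;
}

// One wiki means exactly one entry, index 0 (values from the post above).
$wikisArray = array();
$wikisArray[0]['xmlSource']         = '/var/www/zend/sources/internal__current.xml';
$wikisArray[0]['indexName']         = 'my_wiki';
$wikisArray[0]['maintenanceScript'] = '/var/www/mediawiki/maintenance/dumpBackup.php';
$wikisArray[0]['mediaDir']          = '/var/www/mediawiki/images/';

print_r(findMissingKeys($wikisArray, $requiredKeys));
```

An empty result means every configured wiki has all four keys; it says nothing about whether the paths themselves are correct.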

Dear,

Yes, I already commented the second wikisArray entry out.
I created an empty directory /var/www/zend/sources for the internal_current.xml and
entered my DB name, the correct dumpBackup.php path and the images path.
I can't find what's wrong with it. I doubt it's a directory permission problem.

However, when I do it through PslZslAdmin, a sysdoc_current.xml is created in the sources dir.

Regards

Nick1092 (talk)03:12, 30 April 2013

Hi,

"sysdoc_current" should not have been created, because it belongs to the second set of dummy data shown above. As far as I remember, you also have to configure PslZslAdmin in LocalSettings.php. Since the prefix in the name reflects your wiki, it is typically the name of a directory or, if you prefer, your wgScriptPath. So sysdoc_current would be named mediawiki_current in your environment, or my_wiki_current... I'm a little bit confused. Could you provide the full path to your wiki and the real DB name? Maybe it's an option to give me system access so I can see what happens. You can reach me via the mentioned service pages.

cheers.

Steviex2 (talk)23:42, 1 May 2013
 
 
 
 
 

Blank page when searching

SOLVED: The problem was that the directories were owned by the user apache instead of nobody, which was the actual web server user.

Hello everyone,

I've installed ZF and the Zend Search Lucene for MediaWiki extension.

The indexing process runs smoothly and without an error.

The problem is that I get a blank page as soon as I hit the search button. There is no error message and no hint about what's wrong.

Here´s my config:

Webroot is /opt/lampp/htdocs (+/mediawiki as symlink to mw installation path /opt/lampp/htdocs/mediawiki_1.17.0)

ZendF resides in /opt/lampp/zend/

The relevant parts of the config are:

Index Config

$GuiFlag = 0;


$wikisArray[0]['xmlSource'] = "/opt/lampp/zend/sources/internal_current.xml";
$wikisArray[0]['indexName'] = "wikidb_internal";
$wikisArray[0]['maintenanceScript'] = "/opt/lampp/htdocs/mediawiki/maintenance/dumpBackup.php";
$wikisArray[0]['mediaDir'] = "/opt/lampp/htdocs/mediawiki/images";// maybe httpdocs/images/ if img_auth.php not in use



#$wikisArray[1]['xmlSource'] = "/opt/lampp/zend/sources/sysdoc_current.xml";
#$wikisArray[1]['indexName'] = "wikidb_sysdoc";
#$wikisArray[1]['maintenanceScript'] = "/opt/lampp/htdocs/mediawiki/maintenance/dumpBackup.php";
#$wikisArray[1]['mediaDir'] = "/opt/lampp/htdocs/mediawiki/images/";// maybe httpdocs/images/ if img_auth.php not in use

#[...]

endif;



@preg_match_all("/(Windows)(.*?)/", $_SERVER['OS'], $matched, PREG_SET_ORDER);

/* an index dir above web root */
$indexDirName = "psl_search_indexes";
$PhpExecutionStringUnix = "/opt/lampp/bin/php -c /opt/lampp/etc/php.ini";
$PhpExecutionStringWindows = "c:\\xampp\\php\\php.exe ";
$email = "info@my.reporting-mailadress.ork";

/* file formats which will be indexed */
$additionalFileFormatsArray = array('pdf','docx','xlsx','pptx','sql','vnd','txt','xml','xmlx','csv');

if(count($matched) > 0 ):
/* modify this to fit your needs, if you are on windows */
        $webServerUser = "";
        $webServerUserGroup = "";
        $zendFrameworkLibraryPath = "C:\\xampp\\htdocs/ZF/library";
        $zendLogPath = "C:\\xampp\\htdocs\\".$indexDirName."\\";
        $applicationPath = "C:\\xampp\\htdocs";
else:
/* modify this to fit your needs, if you are on unix */
        $webServerUser = "apache";
        $webServerUserGroup = "apache";
        $zendFrameworkLibraryPath = "/opt/lampp/zend/library";
        $zendLogPath = "/opt/lampp/zend/".$indexDirName."/";
        $applicationPath = "/opt/lampp/zend/";
endif;

LocalSettings.php from MediaWiki

/* Configuration Zend Search Lucene for MediaWiki - Start */
$PslDomainDir = "internal";
$PslPhpExecutionStringUnix = "/opt/lampp/bin/php -c /opt/lampp/etc/php.ini ";
$PslMaintenancePath = "/opt/lampp/htdocs/mediawiki/maintenance/";
$PslXmlPath = "/opt/lampp/zend/sources/".$PslDomainDir."_current.xml";


$wgPslZslAdminUseAutoReIndex = false;
$wgPslZslAdminDefaultEmail = "<your email address>";
$wgPslZslAdminDumpString = $PslPhpExecutionStringUnix.$PslMaintenancePath."dumpBackup.php --current --quiet --uploads > ".$PslXmlPath;
$wgPslZslAdminMediaDir = "/opt/lampp/htdocs/mediawiki/images/";
$wgPslZslAdminReIndexString = $PslPhpExecutionStringUnix."/opt/lampp/zend/PslZendSearchLuceneIndexer.php ".$PslXmlPath." wikidb_".$PslDomainDir." ". $PslMaintenancePath."dumpBackup.php";



require_once( "$IP/extensions/PslZslAdmin/PslZslAdmin.php");



$wgSearchType = 'PslZendSearchLucene';
$wgPslEnableSuggestions = true;//enables suggestions
$wgPslEnableStopWords = false;//enables stop words
$wgPslStopWords = array('aber','als','am','an');
$wgPslImagePath = "http://172.23.101.63/mediawiki/extensions/PslZendSearchLucene/";



$wgPslWikiUrl = "http://172.23.101.63/mediawiki-1.17.0/index.php/";
$wgPslEntriesPerPage = 20;
$wgPslUtf8DecodeResults = false;//utf8-hint for related display issues, (play around with this if needed)
$wgPslIndexDir = "/opt/lampp/zend/psl_search_indexes/wikidb_".$PslDomainDir;
$wgPslZendLibraryDir = "/opt/lampp/zend/library/";
$wgPslEnablePopularSearches = true;//requires table-create rights for MediaWikis db-account
$wgPslPopularSearchesHistory = 365;//data remains 365 days
$wgPslProtectPopularSearches = false;//
$wgPslHighlightColor = "#ff6900";
$wgPslEnabaleDebugMode = false;//debug mode
$wgPslEnableSuggestions = true;//enables suggestions
$wgPslEnableFileSearch = true;//enables file search
$wgPslEnablePsIpTracking = false;//enables ip tracking for geo lacation services etc. (currently not implemented)
$wgPslEnableAnonKey = true;//anonymous key for science
$wgPslHistoryEntries = 30;//history entries per page
$wgPslHistoryMiniStat = true;



$wgHiddenPrefs[] = 'searchlimit';//this entries are disabling no more needed user preferences of the old/default search
$wgHiddenPrefs[] = 'contextlines';
$wgHiddenPrefs[] = 'contextchars';
$wgHiddenPrefs[] = 'disablesuggest';
$wgHiddenPrefs[] = 'searcheverything';
$wgHiddenPrefs[] = 'searchnamespaces';
$wgPslEnableUserInHistory = false;//enhanced knowledge management feature could tackle your country specific law!
require_once( "$IP/extensions/PslZendSearchLucene/PslZendSearchLucene.php");
/* Configuration Zend Search Lucene for MediaWiki - End */

Does anyone have an idea? I have already tried several combinations of paths, but none of them worked.

Greetings, and thanks in advance...

F

141.84.149.1008:03, 20 October 2011
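Since this thread was solved by fixing directory ownership, a quick ownership check may save time. The following is a minimal sketch, assuming a Unix host with PHP's POSIX extension; ownerName is a hypothetical helper, and the path and expected user are taken from the post above.

```php
<?php
// Hypothetical helper: returns the owner name of a path, or null if unknown.
// Relies on the POSIX extension, so this is Unix-only.
function ownerName($path)
{
    if (!function_exists('posix_getpwuid') || !file_exists($path)) {
        return null;
    }
    $info = posix_getpwuid(fileowner($path));
    return $info === false ? null : $info['name'];
}

$indexDir     = '/opt/lampp/zend/psl_search_indexes'; // index dir from the post above
$expectedUser = 'nobody';                             // the actual web server user in this thread

$owner = ownerName($indexDir);
if ($owner === null) {
    echo "Could not determine the owner of $indexDir" . PHP_EOL;
} elseif ($owner !== $expectedUser) {
    echo "$indexDir is owned by $owner, expected $expectedUser" . PHP_EOL;
} else {
    echo "$indexDir ownership looks fine" . PHP_EOL;
}
```

Run it from the command line; the expected user has to be whatever account the web server really runs as, which is not necessarily apache.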

EDIT: Sorry, I think your problem is different. I misread it: mine is no search results, but I think
yours is no access to the folder.

Dear all,

How did you fix your problem? I think I have the same problem as you.
There are no search results even if I search for a test topic inside my MediaWiki.
I've already set webServerUser and webServerUserGroup to apache, which is the directory owner of my wiki at the moment.
Or is the web user my database admin ID?

Thanks in advance.

Regards,
Nick

Nick1092 (talk)02:20, 29 April 2013
 

Fatal Error calling mb_strtolower()

Dear,

I got this fatal error:
Fatal error: Call to undefined function mb_strtolower() in /var/www/mediawiki1.20.4/extensions/PslZendSearchLucene/PslZendSearchLucene_body.php on line 975.
I checked that line; it is mb_strtolower($wgRequest ->getText('category'));
Has anyone had this problem before?
Any help is welcome, thanks in advance.

Mediawiki 1.20.4

Nick1092 (talk)08:21, 26 April 2013

AFAIK, "mb_strtolower()" depends on an extra PHP library:

http://php.net/manual/en/mbstring.installation.php

Alternatively, you could try substituting the function with "strtolower()". I'm afraid there are more lines that use this PHP library, so the safest way is to configure your PHP accordingly.

Keep in mind that this dependency (mbstring) is already mentioned in the requirements.

cheers.

Steviex2 (talk)17:37, 26 April 2013
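Whether mbstring is available can be checked before touching the extension code. This is a minimal sketch; lowerCaseCompat is a hypothetical wrapper for illustration and is not part of PslZendSearchLucene.

```php
<?php
// Report whether mbstring is loaded before relying on mb_* functions.
if (!extension_loaded('mbstring')) {
    echo 'mbstring is not loaded; enable it in your PHP configuration.' . PHP_EOL;
}

// Hypothetical fallback wrapper (not part of the extension).
// strtolower() only handles ASCII reliably, so multibyte text may stay unchanged.
function lowerCaseCompat($text)
{
    if (function_exists('mb_strtolower')) {
        return mb_strtolower($text, 'UTF-8');
    }
    return strtolower($text);
}

echo lowerCaseCompat('Category') . PHP_EOL;
```

The fallback only hides the error on one line; as noted above, enabling mbstring remains the safer fix.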

Thanks for the help, Steviex. I'll try to configure my PHP for this. My mistake for missing the mbstring line under the requirements.

Nick1092 (talk)01:19, 29 April 2013
 
 

Indexer error when reading internal_current.xml

Hi, the indexer is logging an error when attempting to read the contents of internal_current.xml. This results in zero documents indexed. I'm running MW 1.19.2/PHP 5.2.8/MySQL 5.1.37/W2K3. How can I resolve this?

D:\>PHP\php.exe -f d:\Wwwroot\zend\PslZendSearchLuceneIndexer.php
2013-03-27T11:14:52-07:00 INFO (6): Indexer initializing ...
2013-03-27T11:14:52-07:00 INFO (6): Start madxwikidb-1_19_2 at: 27.03.2013 11:14:52
2013-03-27T11:14:52-07:00 INFO (6): incrementUpdate: 0
2013-03-27T11:14:58-07:00 INFO (6): LuceneIndexer Status-Message! SUCCESS dumpXML() D:\PHP\php.exe D:\Wwwroot\madxwiki-1.19.2\maintenance\dumpBackup.php --current --quiet --uploads > D:\Wwwroot\zend\sources\internal_current.xml----Status -> 255-->madxwikidb-1_19_2
2013-03-27T11:14:59-07:00 INFO (6): LuceneIndexer Status-Message! SUCCESS: chmod command: 1:D:\Wwwroot\zend\sources\internal_current.xml-->madxwikidb-1_19_2
2013-03-27T11:14:59-07:00 INFO (6): Created new index in D:\Wwwroot\psl_search_indexes\madxwikidb-1_19_2\data\cache\index
2013-03-27T11:15:01-07:00 INFO (6): LuceneIndexer Error-Message! ERROR XML-Source file_get_contents : D:\Wwwroot\zend\sources\internal_current.xml-->madxwikidb-1_19_2
2013-03-27T11:15:01-07:00 INFO (6): Optimizing index: madxwikidb-1_19_2
2013-03-27T11:15:01-07:00 INFO (6): Iterator over 0 documents (HTML)
2013-03-27T11:15:01-07:00 INFO (6): Done. Index now contains 0 documents
2013-03-27T11:15:01-07:00 INFO (6): Indexing complete
2013-03-27T11:15:01-07:00 INFO (6): Wiki-Start madxwikidb-1_19_2 at: 27.03.2013 11:14:52
2013-03-27T11:15:01-07:00 INFO (6): Wiki-End madxwikidb-1_19_2 at: 27.03.2013 11:15:01
2013-03-27T11:15:01-07:00 INFO (6): Indexer initializing ...
2013-03-27T11:15:01-07:00 INFO (6): Start madxwikidb-1_19_2 at: 27.03.2013 11:15:01
2013-03-27T11:15:01-07:00 INFO (6): incrementUpdate: 0
2013-03-27T11:15:02-07:00 INFO (6): LuceneIndexer Status-Message! SUCCESS dumpXML() D:\PHP\php.exe D:\Wwwroot\madxwiki-1.19.2\maintenance\dumpBackup.php --current --quiet --uploads > D:\Wwwroot\zend\sources\sysdoc_current.xml----Status -> 255-->madxwikidb-1_19_2
2013-03-27T11:15:03-07:00 INFO (6): LuceneIndexer Status-Message! SUCCESS: chmod command: 1:D:\Wwwroot\zend\sources\sysdoc_current.xml-->madxwikidb-1_19_2
2013-03-27T11:15:03-07:00 INFO (6): Open existing index in D:\Wwwroot\psl_search_indexes\madxwikidb-1_19_2\data\cache\index
2013-03-27T11:15:05-07:00 INFO (6): '''LuceneIndexer Error-Message! ERROR XML-Source file_get_contents : D:\Wwwroot\zend\sources\sysdoc_current.xml-->madxwikidb-1_19_2'''
2013-03-27T11:15:05-07:00 INFO (6): Optimizing index: madxwikidb-1_19_2
2013-03-27T11:15:05-07:00 INFO (6): Iterator over 0 documents (HTML)
2013-03-27T11:15:05-07:00 INFO (6): Done. Index now contains 0 documents
2013-03-27T11:15:05-07:00 INFO (6): Indexing complete
2013-03-27T11:15:05-07:00 INFO (6): Wiki-Start madxwikidb-1_19_2 at: 27.03.2013 11:15:01
2013-03-27T11:15:05-07:00 INFO (6): Wiki-End madxwikidb-1_19_2 at: 27.03.2013 11:15:05
MadX (talk)21:01, 27 March 2013

You are dealing with dummy data. The name of the XML files should reflect real world dirnames. You also need only one XML-source per Wiki.

Steviex2 (talk)22:38, 15 April 2013
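When the log shows a file_get_contents error on the XML source, it can help to check the dump file itself before re-running the indexer. A minimal sketch, assuming a hypothetical helper checkXmlSource (the real indexer may test other conditions) and using the path from the log above:

```php
<?php
// Hypothetical pre-flight check for an XML dump file.
// Returns 'missing', 'unreadable', 'empty' or 'ok'.
function checkXmlSource($path)
{
    clearstatcache(true, $path); // avoid stale size/existence information
    if (!is_file($path)) {
        return 'missing';
    }
    if (!is_readable($path)) {
        return 'unreadable';
    }
    if (filesize($path) === 0) {
        return 'empty';
    }
    return 'ok';
}

// Path taken from the log above; adjust to your own xmlSource setting.
$xmlSource = 'D:\\Wwwroot\\zend\\sources\\internal_current.xml';
echo $xmlSource . ': ' . checkXmlSource($xmlSource) . PHP_EOL;
```

An 'empty' or 'missing' result would suggest looking at the dumpBackup.php step rather than at the indexer.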
 

Question about pslpopularsearches

Hi all, I received the following error message: "1146: Table 'wikidb.mw_mw_pslpopularsearches' doesn't exist"

Anyone have an idea?

Steviex218:40, 10 May 2011

It seems you have a problem with the db-table prefix. Could you please post the complete error message and the version of ZSL you are using?

Steviex218:42, 10 May 2011

Hello, I think there is a prefix problem in my wiki too (ZSL 2.0):

A database error has occurred. The cause may be a programming error.
The last database query was: 
INSERT IGNORE INTO `wikiwikipslpopularsearches` (searchcon,results,success,triggertime,
user,ip,rawquery,score,namespace,pids,page,category,rating,pageurl,pagpage,sk) 
VALUES ('body_and_title','55','1','2011-05-16 14:13:44',,,'Buch','4.41565438044',,',
840,864,1014,1541,319,1474,842,905,1156,1211,897,715,1548,366,811,421,349,305,1072,426,
480,307,896,846,514,634,306,1551,311,1221,1349,1443,415,371,927,407,327,1248,1402,340,
1530,889,890,360,554,309,145,1470,775,432,1391,1465,304,985,1469',,,,,,
'465e8da6fb92d8a7b7bb24687b89517b')
from within the function "PslZendSearchLuceneDbActions". The database reported the error "1146:
Table 'wiki.wikiwikipslpopularsearches' doesn't exist (localhost)".

Thank you.

Agoerlt08:19, 18 May 2011

Same here - what's the fix?

Pjtait (talk)01:19, 20 April 2012

Hi,

as mentioned... check the db-prefix setup in your LocalSettings.php.

Steviex2 (talk)02:23, 20 April 2012

The db-prefix is OK in LocalSettings.php. Is there no solution yet?

2001:770:1A4:0:596:6D9D:EF67:5F5912:10, 4 March 2013
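Both error messages in this thread name a table whose prefix appears twice (mw_mw_ and wikiwiki), which hints that the prefix is being applied two times somewhere. The sketch below only illustrates how to spot that pattern; hasDoubledPrefix is a hypothetical helper, and the prefixes 'mw_' and 'wiki' are guesses read off the error messages, not confirmed settings.

```php
<?php
// Hypothetical helper: true when $table starts with $prefix repeated twice.
function hasDoubledPrefix($table, $prefix)
{
    if ($prefix === '') {
        return false;
    }
    return strpos($table, $prefix . $prefix) === 0;
}

// Table names from the two error messages; prefixes are assumptions.
var_dump(hasDoubledPrefix('mw_mw_pslpopularsearches', 'mw_'));
var_dump(hasDoubledPrefix('wikiwikipslpopularsearches', 'wiki'));
var_dump(hasDoubledPrefix('mw_pslpopularsearches', 'mw_'));
```

If the pattern matches your own prefix, comparing the table names that actually exist in the database with the name in the error message is a reasonable next step.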
 
 
 
 
 

require_once(Zend/Search/Lucene.php): failed to open stream

Warning: require_once(Zend/Search/Lucene.php): failed to open stream: No such file or directory in D:\Wwwroot\PslZendSearchLuceneIndexer.php on line 140

The documentation provides example of how to get around this error on Apache. Can we get an example for IIS?

MadX (talk)07:57, 7 February 2013

This is a very low-level PHP include error, which should be handled in nearly the same way on both systems. You could check PHP's include_path setting.

Steviex2 (talk)08:28, 7 February 2013
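The include_path advice works the same under IIS and Apache because it is a PHP setting, not a web server one. A minimal sketch, assuming a hypothetical helper addToIncludePath and a placeholder library path that must be replaced with the directory containing the "Zend" folder:

```php
<?php
// Hypothetical helper: appends a directory to include_path if it is not listed yet.
function addToIncludePath($dir)
{
    $paths = explode(PATH_SEPARATOR, get_include_path());
    if (!in_array($dir, $paths, true)) {
        $paths[] = $dir;
        set_include_path(implode(PATH_SEPARATOR, $paths));
    }
    return get_include_path();
}

// Placeholder: replace with the directory that contains Zend/Search/Lucene.php.
$zendLibrary = '/path/to/ZendFramework/library';
echo addToIncludePath($zendLibrary) . PHP_EOL;
```

The same effect can be had by editing include_path in php.ini; the runtime call is just easier to test.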
 

Enhanced, more admin friendly Instructions (mediawiki 1.17.0)

I finally got Zend Lucene to work on MediaWiki 1.17.0, but with virtually no documentation that could help.

MediaWiki 1.17.0

PHP 5.3.6 (apache2handler)

MySQL 5.2.8-MariaDB-log


Step 1 - Install / Download Zend Framework

Download Zend Framework. Unpack and copy the contents of the download file to a webserver folder (commonly not below web root). Zend Framework install is done!

You're NOT exactly done. Here, I had to create a directory called 'zend' above web root (i.e. /var/www/zend), where I extracted the contents of the tarball.

Step 2 - Configure Zend Search Lucene for MediaWiki

Download and extract the extensions PslZslAdmin and PslZendSearchLucene to your Wiki(s) extension directory. Move the files PslZendSearchLuceneIndexer.php and PslZendSearchLuceneIndexerConfig.php to a server directory above web root. Edit the marked parts of the file PslZendSearchLuceneIndexerConfig.php as described in it.

This is where it gets tricky. The config file (PslZendSearchLuceneIndexerConfig.php) unfortunately lacks the comments an admin needs to configure Zend intuitively in a reasonable amount of time. Here are some suggestions I would recommend to any admin out there wanting to install Lucene:

1.) In my case, I didn't have an XML repository for my DB dumps, so I created a 'sources' directory to house the XML dumps: <full path of Zend installation>/sources

2.) Here's how I would've labeled the parameters in the config file:


Instead of:

$wikisArray[0]['xmlSource']         = "/var/www/vhosts/indi.sexyserver4you.de/subdomains/internal/internal_current.xml";

I would've put:

$wikisArray[0]['xmlSource']         = "<full path of xml dumps>/internal_current.xml";

In my case, again, i created a directory specifically designated for these dumps, which was /var/www/zend/sources


Instead of:

$wikisArray[0]['maintenanceScript'] = "/var/www/vhosts/indi.sexyserver4you.de/subdomains/internal/httpdocs/wiki/maintenance/dumpBackup.php";
$wikisArray[0]['mediaDir']          = "/var/www/vhosts/indi.sexyserver4you.de/subdomains/internal/public/";// maybe httpdocs/images/ if img_auth.php not in use


I would've put:

$wikisArray[0]['maintenanceScript'] = "<full path of mediawiki installation>/maintenance/dumpBackup.php";
$wikisArray[0]['mediaDir']          = "<full path of mediawiki installation>/images/";

In my case, <full path of mediawiki installation> = /var/www/html/wiki/mediawiki-1.17.0/


Instead of:

$wikisArray[1]['xmlSource']         = "/var/www/vhosts/indi.sexyserver4you.de/subdomains/sysdoc/sysdoc_current.xml";

I would've put:

$wikisArray[1]['xmlSource']         = "<full path of xml dumps>/sysdoc_current.xml";

In my case, again, i created a directory specifically designated for these dumps, which was /var/www/zend/sources


Instead of

$wikisArray[1]['maintenanceScript'] = "/var/www/vhosts/indi.sexyserver4you.de/subdomains/sysdoc/httpdocs/maintenance/dumpBackup.php";
$wikisArray[1]['mediaDir']          = "/var/www/vhosts/indi.sexyserver4you.de/subdomains/sysdoc/public/";// maybe httpdocs/images/ if img_auth.php not in use


I would've put:

$wikisArray[1]['maintenanceScript'] = "<full path of mediawiki installation>/maintenance/dumpBackup.php";
$wikisArray[1]['mediaDir']          = "<full path of mediawiki installation>/images/";


Instead of:

$PhpExecutionStringUnix         = "/usr/bin/php -c /etc/php5/cli/php.ini";

I would've put:

$PhpExecutionStringUnix         = "/usr/bin/php -c /<location of php.ini file>/php.ini";

In my case <location of php.ini file> = /etc/


Instead of:

$webServerUser              = "www-data";
$webServerUserGroup         = "psaserv";
$zendFrameworkLibraryPath   = "/PSL_ADD_ONS/ZF/library";
$zendLogPath                = "/PSL_ADD_ONS/".$indexDirName."/";
$applicationPath            = "/PSL_ADD_ONS";


I would've put:

$webServerUser              = "<your web server user>";
$webServerUserGroup         = "<your web server group>";
$zendFrameworkLibraryPath   = "/<installation path of Zend>/ZendFramework-1.11.10/library";
$zendLogPath                = "/<installation path of Zend>/".$indexDirName."/";
$applicationPath            = "/<installation path of Zend>/";


3.) Here's how i would've labeled the config parameters for LocalSettings.php

/* Configuration Zend Search Lucene for MediaWiki - Start */
$PslDomainDir                   = "sysdoc";
$PslPhpExecutionStringUnix      = "/usr/bin/php -c /<full path to php.ini file>/php.ini ";
$PslMaintenancePath             = "/<full path to mediawiki installation>/maintenance/";
$PslXmlPath                     = "/<full path to xml dumps or sources>/".$PslDomainDir."_current.xml";

$wgPslZslAdminUseAutoReIndex    = false;
$wgPslZslAdminDefaultEmail      = "<your email address>";
$wgPslZslAdminDumpString        = $PslPhpExecutionStringUnix.$PslMaintenancePath."dumpBackup.php --current --quiet --uploads > ".$PslXmlPath;
$wgPslZslAdminMediaDir          = "<full path of mediawiki installation or directory where you store uploaded docs>/images/";
$wgPslZslAdminReIndexString     = $PslPhpExecutionStringUnix."/<full path to Zend Installation>/PslZendSearchLuceneIndexer.php ".$PslXmlPath." wikidb_".$PslDomainDir." ".  $PslMaintenancePath."dumpBackup.php"; 

require_once( "$IP/extensions/PslZslAdmin/PslZslAdmin.php");

$wgSearchType                      = 'PslZendSearchLucene';
$wgPslEnableSuggestions            = true;//enables suggestions
$wgPslEnableStopWords              = false;//enables stop words
$wgPslStopWords                    = array('aber','als','am','an');
$wgPslImagePath                    = "http://<wiki domain or ip address of wiki server>/extensions/PslZendSearchLucene/";

$wgPslWikiUrl                      = "http://<wiki url>/mediawiki-1.17.0/index.php/";
$wgPslEntriesPerPage               = 20;
$wgPslUtf8DecodeResults            = false;//utf8-hint for related display issues, (play around with this if needed)
$wgPslIndexDir                     = "/<full path to Zend Installation>/psl_search_indexes/wikidb_".$PslDomainDir;
$wgPslZendLibraryDir               = "/<full path to Zend Installation>/ZendFramework-1.11.10/library/";
$wgPslEnablePopularSearches        = true;//requires table-create rights for MediaWikis db-account
$wgPslPopularSearchesHistory       = 365;//data remains 365 days
$wgPslProtectPopularSearches       = false;//
$wgPslHighlightColor               = "#ff6900";
$wgPslEnabaleDebugMode             = false;//debug mode
$wgPslEnableSuggestions            = true;//enables suggestions
$wgPslEnableFileSearch             = true;//enables file search
$wgPslEnablePsIpTracking           = false;//enables ip tracking for geo lacation services etc. (currently not implemented)
$wgPslEnableAnonKey                = true;//anonymous key for science
$wgPslHistoryEntries               = 30;//history entries per page
$wgPslHistoryMiniStat              = true;

$wgHiddenPrefs[] = 'searchlimit';//this entries are disabling no more needed user preferences of the old/default search
$wgHiddenPrefs[] = 'contextlines';
$wgHiddenPrefs[] = 'contextchars';
$wgHiddenPrefs[] = 'disablesuggest';
$wgHiddenPrefs[] = 'searcheverything';
$wgHiddenPrefs[] = 'searchnamespaces';
$wgPslEnableUserInHistory          = false;//enhanced knowledge management feature could tackle your country specific law!
require_once( "$IP/extensions/PslZendSearchLucene/PslZendSearchLucene.php");
/* Configuration Zend Search Lucene for MediaWiki - End */


4.) Finally, make sure you check your permissions! Your XML dump/source directory must be owned by the web server user and belong to the web server group. In my case, that is apache:apache. A simple chown -R apache:apache . in the Zend installation directory should do the trick.

Hope this helps!

Ucananduwill19:12, 22 September 2011
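To see what the LocalSettings.php concatenation in point 3 actually produces, the dump string can be printed on its own. This is a sketch using the paths mentioned in the instructions above; the domain directory "internal" is an assumption chosen for illustration.

```php
<?php
// Values based on the paths given in the instructions above.
$PslDomainDir              = 'internal'; // assumption: your own wiki's name goes here
$PslPhpExecutionStringUnix = '/usr/bin/php -c /etc/php.ini '; // note the trailing space
$PslMaintenancePath        = '/var/www/html/wiki/mediawiki-1.17.0/maintenance/';
$PslXmlPath                = '/var/www/zend/sources/' . $PslDomainDir . '_current.xml';

// Same concatenation as $wgPslZslAdminDumpString in the config.
$dumpString = $PslPhpExecutionStringUnix . $PslMaintenancePath
            . 'dumpBackup.php --current --quiet --uploads > ' . $PslXmlPath;

echo $dumpString . PHP_EOL;
```

Because the pieces are joined without separators, the trailing space after php.ini and the trailing slash on the maintenance path both matter; printing the result makes a missing one easy to spot.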

Thank you for your contributions. Nice to hear it's running on 1.17. We have had several hundred downloads and fewer than 20 questions about setup so far, so this wasn't on my to-do list :-).

Steviex220:31, 22 September 2011

Thank you for taking the time for these instructions, they really helped me! It's an awesome extension but so confusing to set up at first glance. Now it breaks my template, but that's a minor thing to fix! :)

21:23, 4 October 2011

1+

92.200.30.11314:02, 12 October 2012
 
 

The above is very good for admins installing on *nix. I've been struggling to get this sorted under Windows; is there a Windows version of the above?

203.48.50.18003:11, 20 December 2012

The original documentation in the package should have Windows examples too.

86.56.60.13613:07, 20 December 2012
 
 

It works in MediaWiki 1.20.2

It works in MediaWiki 1.20.2, but I get an error message when I run PslZendSearchLuceneIndexer.php on bash:

Warning: chgrp(): Unable to find gid for somewiki in /home/content/30/9164430/html/zend/PslZendSearchLuceneIndexer.php on line 296
2012-12-18T16:14:47-07:00 INFO (6): LuceneIndexer Error-Message! ERROR: chgrp command: :/home/content/30/9164430/html/zend/psl_search_indexes/wikidb_sysdoc:somewiki-->wikidb_sysdoc

What does it mean? How can I fix this?

| Jaider Msg00:20, 19 December 2012

Hi,

If "wikidb_sysdoc" isn't one of your wiki DBs, you are playing with dummy data. Refer to your configuration and replace it with your real-world data.

86.56.60.13609:32, 19 December 2012

Thanks

| Jaider Msg13:25, 19 December 2012
 
 

Link to page containing document

I am maintaining a wiki on a local intranet. Could you tell me if Zend search will show the link to the wiki page which contains an indexed document or if it only links to the indexed document.

Our current search will find a phrase in 'aaa.pdf' but when you click the link, you can only open 'aaa.pdf', you cannot see where 'aaa.pdf' is linked on the wiki.

Thanks.

203.5.217.304:18, 26 April 2012

It should list the file link and the pages containing "aaa.pdf". Furthermore, it should allow you to find phrases inside a PDF document, as long as you extend the code a little.

c u stevie

Steviex2 (talk)15:50, 30 April 2012

Can you elaborate or provide links to further information on how to extend the code to search inside pdf documents? This is the main reason I would like to use Zend Search Lucene for MediaWiki.

Thanks

Rpsteiner (talk)22:09, 24 May 2012

Hi Rpsteiner,

Open the file PslZendSearchLuceneIndexer.php and go to line 530, "...We will do this...". This should be the place to index PDF content. Unfortunately there was no further sponsor to make this happen, but you could do it yourself if you are a coder. The keyword here is XPDF (the name of an external Linux library). This task should be easy but would need some hours. I wish you good luck. Maybe you can contribute the necessary code here; I would integrate it in a future release (providing an exception for Windows users etc.).

c u stevie

Steviex2 (talk)22:35, 24 May 2012
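As a starting point for the XPDF idea, text can be pulled out of a PDF with the pdftotext command-line tool that ships with Xpdf. This is only a sketch under assumptions: both helper names are hypothetical, pdftotext must be installed and on the PATH, the code is Unix-only, and how the text would be fed into the index at line 530 is not shown.

```php
<?php
// Hypothetical helper: builds a pdftotext command that writes the text to stdout ("-").
function buildPdfToTextCommand($pdfPath)
{
    return 'pdftotext ' . escapeshellarg($pdfPath) . ' -';
}

// Hypothetical helper: returns the extracted text, or null if the tool failed or is missing.
function extractPdfText($pdfPath)
{
    $output = array();
    $status = 1;
    exec(buildPdfToTextCommand($pdfPath) . ' 2>/dev/null', $output, $status);
    return $status === 0 ? implode("\n", $output) : null;
}

// Show the command that would be run for a sample file name.
echo buildPdfToTextCommand('/tmp/aaa.pdf') . PHP_EOL;
```

Returning null on failure lets the indexer skip a file instead of aborting, which also gives Windows installations without the tool a graceful path.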
 
 
 

Suggestions not working

Hello,

Suggestions are not working. Do you have any idea? In LocalSettings.php it is enabled.

212.185.65.9107:56, 31 August 2011

It should be an issue of your environment. Usually it works fine.

Steviex208:06, 31 August 2011

Do you have any idea what could be missing? Will I need a special PHP library or something like that?

212.185.65.9109:12, 31 August 2011
Edited by author.
Last edit: 06:40, 1 September 2011

Do you know that suggestions can be configured by each user? First you should play with these settings and test every mode. As far as I know, no additional PHP modules are necessary. Could you please tell me something about your MW version?

Steviex217:36, 31 August 2011

Ah, I didn't configure it for my user, sorry. Then my problem is 'fixed'. Thanks for your help.

212.185.65.9106:16, 1 September 2011

Hi, could you please clarify this for me? By "suggestions", do you mean the drop-down list that appears below the search box as we type, or a list of suggestions presented after we click the search button (something like "Did you mean?")? Thanks

Capmo (talk)03:08, 24 March 2012
 
 
 
 
 

Search multiple wikis

In the documentation you list unlimited wiki instances as a feature, can I run a search across all index and return results for all the wikis?

Ashex05:33, 10 February 2012

This could be done by modifying the source code. "Unlimited instances" primarily refers to the indexer, meaning you have a server with several MediaWiki installations and wish to index them all with a single nightly cron job.

80.187.107.301:00, 11 February 2012

Could you possibly elaborate on that? I have a Wiki family (single server running multiple wikis, each has a separate database) and would like to be able to search across them.

Ashex (talk)21:50, 24 February 2012

This is not a standard feature of ZSL and would require approximately several days of investigation and development. Unfortunately I can't provide this on a non-profit basis. It could also turn out to be a performance-related issue and may fail. We currently have no experience with this. If none of this matters, you could request a quote on www.wiki-service.biz. You could also check an alternative enterprise search engine like Solr.

c u stevie

Steviex2 (talk)22:16, 24 February 2012

If it's not a standard feature and would require additional development of the extension, then why is it listed as a feature?

Ashex (talk)01:36, 26 February 2012

Unlimited instances is related to the indexer!

Steviex2 (talk)17:15, 26 February 2012
 
 
 
 
 

XmlDump won't work on Windows

Hello,

I installed the extension yesterday and tried to have it working but I always get the same error message when running the indexer:

2011-11-09T18:29:52+01:00 INFO (6): LuceneIndexer Error-Message! ERROR dumpXML() 
"C:\Program Files\EasyPHP 3.0\php\php.exe" -c "C:\Program Files\EasyPHP 3.0\conf_files\php.ini" 
D:/Wikis/mediawiki-1.17.0/maintenance/dumpBackup.php --current --quiet --uploads 
> D:/Wikis/search-engine/psl_sources/af_current.xml---Status -> 1

However, when I copy and paste the command above and run it myself in the same cmd.exe where I ran the indexer (so normally under the same conditions), it works fine and my af_current.xml is correctly filled with the dump.

Any idea?

(I'm on MediaWiki 1.17, but I don't think this is related to the issues mentioned above.)

193.56.136.20007:14, 10 November 2011

Hi,

Strange, I have never heard of this before.

dumpXML should be a function in the code. You could go there, see what the function is trying to do, and print out any information available at that point (paths etc.). Also make sure that your WAMP system has the rights needed for this task.

Steviex212:58, 10 November 2011
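
Following up on the suggestion to print out what is available at that point: a minimal sketch, not taken from the extension's code, of how a shell command can be run from PHP so that its output and exit status become visible. runAndReport is a hypothetical helper name; exec() is the standard PHP function. Whether the quoting of paths containing spaces is the culprit in this case is only a guess.

```php
<?php
// Runs a shell command, prints what it wrote (stdout and stderr) and its
// exit status, and returns that status. 0 normally means success.
function runAndReport(string $cmd): int
{
    $output = array();
    $status = 0;
    // "2>&1" redirects error messages into the captured output.
    exec($cmd . ' 2>&1', $output, $status);
    echo "Command: " . $cmd . "\n";
    echo "Status:  " . $status . "\n";
    echo implode("\n", $output) . "\n";
    return $status;
}

// Example: run the current PHP binary with -v just to see the helper work.
runAndReport(escapeshellarg(PHP_BINARY) . ' -v');
```

Pasting the exact command line from the log into runAndReport() should show the message behind "Status -> 1", which can then be compared with what happens when the command is run by hand.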
 

Undefined Method User::getOptions()

When I try to open "Special Pages" I get an error:

Fatal error: Call to undefined method User::getOptions() in /users/lenjo/www/mediawiki-1.15.1/extensions/PslZendSearchLucene/PslZendSearchLucene_body.php on line 116

Where is the method defined? How can I fix that?

Edit: I found the class where getOptions is defined. But that does not solve the problem...

213.70.5.5709:31, 29 August 2011

You may have the wrong MW-version.

Steviex202:48, 30 August 2011

Hmm, does it not work with 1.15.1? I don't have the possibility to upgrade, since the MW is not mine...

213.70.5.5706:41, 30 August 2011

It was never tested against 1.15 and is declared as an extension for 1.16. I will test it against 1.17 soon.

Steviex207:34, 30 August 2011

oh shoot, I didn't see that. Well, then I have a problem... :-S

213.70.5.5708:33, 30 August 2011
 
 
 
 

PslZendSearchLuceneIndexerConfig.php problem

The PslZendSearchLuceneIndexerConfig.php file comes with "<?" at the start of the file, should probably be "<?php" as not all of us have short_open_tag enabled.

204.137.29.24320:34, 14 July 2011
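
For anyone unsure whether this affects them: a small sketch that reports whether short open tags are enabled for the PHP binary running the indexer. ini_get() and the short_open_tag setting are standard PHP; the function name shortTagsEnabled is made up for illustration.

```php
<?php
// Returns true when the short_open_tag ini setting is switched on.
function shortTagsEnabled(): bool
{
    return filter_var(ini_get('short_open_tag'), FILTER_VALIDATE_BOOLEAN);
}

echo shortTagsEnabled()
    ? "Short open tags are enabled.\n"
    : "Short open tags are disabled; the file needs the full opening tag.\n";
```

If it reports that short tags are disabled, changing the first line of the config file to the full opening tag should be enough.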

Invalid argument supplied for foreach() in C:\xampp\htdocs\PslZendSearchLuceneIndexer.php on line 421

What do I need to fix when indexing produces this result? Thanks...

PHP Warning: Invalid argument supplied for foreach() in C:\xampp\htdocs\PslZendSearchLuceneIndexer.php on line 421

Warning: Invalid argument supplied for foreach() in C:\xampp\htdocs\PslZendSearchLuceneIndexer.php on line 421

PHP Warning: Invalid argument supplied for foreach() in C:\xampp\htdocs\PslZendSearchLuceneIndexer.php on line 430

Warning: Invalid argument supplied for foreach() in C:\xampp\htdocs\PslZendSearchLuceneIndexer.php on line 430

Rien Satori09:27, 21 June 2011

Hi, it seems $domArr['mediawiki']['page'] is empty. This could mean that there is no XML data to parse: either your MediaWiki data extraction fails, or your wiki has no pages.

Steviex212:22, 21 June 2011
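
A minimal sketch of a guard for this situation, assuming the parsed dump is held in an array shaped like $domArr['mediawiki']['page'] as mentioned above. The helper name pagesFromDump and the sample data are hypothetical and not part of the extension.

```php
<?php
// Returns the list of pages from a parsed dump, or an empty array when the
// expected structure is missing (the case that triggers the foreach warning).
function pagesFromDump($domArr): array
{
    if (isset($domArr['mediawiki']['page']) && is_array($domArr['mediawiki']['page'])) {
        return $domArr['mediawiki']['page'];
    }
    return array();
}

// Sample input without any pages, as it might look after a failed dump.
$pages = pagesFromDump(array('mediawiki' => array()));
if (count($pages) === 0) {
    echo "No pages found: check that the XML dump exists and is not empty.\n";
}
foreach ($pages as $page) {
    // A page would be indexed here.
}
```

This does not fix the underlying problem, but it turns the warning into a message that points at the dump file.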

Thank you so much for your reply. But the database dump was successful, and the internal_current.xml file was created successfully too. All other settings follow the installation guide almost exactly.

As I understand it, the indexer is independent of LocalSettings.php, so it shouldn't matter if I have some misconfiguration there.

Rien Satori23:32, 21 June 2011

Yes, the indexer is independent of LocalSettings.php.

Steviex217:13, 22 June 2011

Thanks for the reply. I found that my wiki was dumping the XML file with a disallowed character at the beginning; after trimming it, indexing was OK.

I have a multilingual SMW wiki in English and Japanese in a single database, and when searching in Japanese I get this error in the search results:

Warning: preg_replace() [function.preg-replace]: Compilation failed: unrecognized character after (? or (?- at offset 2 in /my/path/extensions/PslZendSearchLucene/PslZendSearchLucene_body.php on line 1525

Anyway, the Zend extension finds the correct page, but:

  • "Text" is not displayed under the search results for Japanese
  • foreign UTF8 characters (like other languages in English text) are displayed as ?
  • words inside Japanese sentence are not indexed (as there are no spaces between words in Japanese)

$wgPslUtf8DecodeResults just turns Japanese page names to ????

Regarding the default MediaWiki search:

  • same result as mentioned above
  • foreign UTF-8 characters are displayed correctly in the normal MW search...
  • words inside a Japanese sentence are not indexed (the default MW search probably cannot deal with this)

I don't know which of these are my misconfiguration and which are real problems; I just wanted to share the overall results of testing by a normal MediaWiki user (not a PHP expert).

Rien Satori06:24, 24 June 2011
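
Regarding the stray character at the start of the dump: a sketch of how such a file could be cleaned before indexing, assuming the unwanted bytes (for example a UTF-8 byte order mark or whitespace) all sit before the first "<". Which character it actually was is not stated above, and stripLeadingJunk is a hypothetical helper.

```php
<?php
// Drops everything before the first "<" so that the XML parser sees the
// start of the document first. Returns the input unchanged if no "<" exists.
function stripLeadingJunk(string $xml): string
{
    $pos = strpos($xml, '<');
    return $pos === false ? $xml : substr($xml, $pos);
}

// Sample data: a byte order mark and two spaces in front of the XML.
$dump = "\xEF\xBB\xBF  <mediawiki><page/></mediawiki>";
echo stripLeadingJunk($dump), "\n";
```

Applied to a real dump, the cleaned string would be read with file_get_contents() and written back with file_put_contents() before the indexer parses the file.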

Hi Satori,


I will try to reply from a developer's point of view, for you and for later visitors. As far as I know, Semantic MediaWiki (SMW) is a completely different piece of software, or a drastically modified MediaWiki. We never tested ZSL for MediaWiki against this branch. There is another comment describing problems with the Japanese language, so we might say ZSL is currently not ready for Japan ;-). But we have seen many downloads from countries all over the world (without any bug reports), and we use it with UTF-8 in German. So we could say it's a stable ZSL release with respect to the requirements and test scenarios mentioned on the main page.

Steviex216:28, 24 June 2011
 
 
 
 
 

$wikisArray xmlSource xml file

In the following configuration of PslZendSearchLuceneIndexerConfig.php, how do you establish or figure out the path to the xmlSource file? Do you need to create the XML file yourself, or should it be found somewhere? I'm trying to get this installed on Windows and I'm not having the best of luck.

  $wikisArray[0]['xmlSource']         = "D:\xampp\htdocs\mediawiki\internal_current.xml";
  $wikisArray[0]['indexName']         = "TESTWiki";
  $wikisArray[0]['maintenanceScript'] = "D:\xampp\htdocs\mediawiki\maintenance\dumpBackup.php";
  $wikisArray[0]['mediaDir']          = "D:\xampp\htdocs\mediawiki\images";// maybe httpdocs/images/
Daddyd20520:46, 9 June 2011

Hi there,

On Windows you should use double backslashes in the paths; see the post from Philipp.

80.187.106.19514:32, 10 June 2011
 

How to configure PslZendSearchLuceneIndexerConfig.php

Hi,

could you explain to me how the paths in PslZendSearchLuceneIndexerConfig.php must be configured to get the whole thing working? I made the following configuration:

$GuiFlag = 0;

   $wikisArray[0]['xmlSource']         = "D:\xampp\htdocs\mediawiki\internal_current.xml";
   $wikisArray[0]['indexName']         = "TESTWiki";
   $wikisArray[0]['maintenanceScript'] = "D:\xampp\htdocs\mediawiki\maintenance\dumpBackup.php";
   $wikisArray[0]['mediaDir']          = "D:\xampp\htdocs\mediawiki\images";// maybe httpdocs/images/ if img_auth.php not in use

I get the following error when I try to run the indexer:

C:\>D:\xampp\php\php.exe -f D:\xampp\htdocs\ZendFramework\PslZendSearchLuceneIndexer.php

PHP Warning: require_once(Zend/Search/Lucene.php): failed to open stream: No such file or directory in D:\xampp\htdocs\ZendFramework\PslZendSearchLuceneIndexer.php on line 140

Warning: require_once(Zend/Search/Lucene.php): failed to open stream: No such file or directory in D:\xampp\htdocs\ZendFramework\PslZendSearchLuceneIndexer.php on line 140

PHP Fatal error: require_once(): Failed opening required 'Zend/Search/Lucene.php' (include_path='d: mpp\htdocs\ZendFramework\library') in D:\xampp\htdocs\ZendFramework\PslZendSearchLuceneIndexer.php on line 140

Fatal error: require_once(): Failed opening required 'Zend/Search/Lucene.php' (include_path='d: mpp\htdocs\ZendFramework\library') in D:\xampp\htdocs\ZendFramework\PslZendSearchLuceneIndexer.php on line 140

C:\>

Where is my mistake?

Thank you

Philipp

194.172.26.13509:58, 23 May 2011

Found my mistake:

   $webServerUser              = "";
   $webServerUserGroup         = "";
   $zendFrameworkLibraryPath   = "d:\\xampp\\htdocs\\ZendFramework\\library";
   $zendLogPath                = "d:\xampp\htdocs\\".$indexDirName."\\";
   $applicationPath            = "d:\xampp\htdocs";

The path in $zendFrameworkLibraryPath was wrong; I had to use double backslashes: "d:\\xampp\\htdocs\\ZendFramework\\library".

Fixed it, but I still get error messages in my command-line window.

194.172.26.13510:08, 23 May 2011

Found another missing double backslash in my config -> fixed it -> everything OK.

Case closed

So long

Philipp

194.172.26.13510:20, 23 May 2011
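
For later readers, a short illustration of why the single backslashes broke the path. In a double-quoted PHP string, \x followed by hex digits is an escape sequence, so the \xa in "d:\xampp" is read as a line feed, which matches the 'd: mpp' in the include_path of the error message above. The variable names are for illustration only.

```php
<?php
$broken  = "d:\xampp\htdocs";    // "\xa" becomes a line feed: "d:" + LF + "mpp\htdocs"
$escaped = "d:\\xampp\\htdocs";  // double backslashes give the intended path
$single  = 'd:\xampp\htdocs';    // single quotes do not interpret \x
$forward = "d:/xampp/htdocs";    // forward slashes are generally accepted by PHP on Windows

var_dump($broken === $escaped);  // bool(false)
var_dump($escaped === $single);  // bool(true)
```

Any of the last three spellings avoids the problem; the config files on this page use the double-backslash form.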

You're welcome :-).

Steviex213:48, 23 May 2011
 
 
 

Special:PslZendSearchLucene, problems with display of Japanese/Chinese!

We are using English, Chinese and Japanese page content. While the standard MW search and display work correctly, Zend Search produces an error message and does not display the results in the correct character encoding.

Warning: Cannot modify header information - headers already sent by (output started at
 ...extensions\PslZendSearchLucene\PslZendSearchLucene_body.php:452) 
MWJames05:46, 24 February 2011

Hi MWJames, it currently supports UTF-8 with English and German (tested successfully, see main page).

Steviex206:18, 24 February 2011
Edited by author.
Last edit: 15:40, 17 May 2011

@Steviex2: I am testing the PslZendSearchLucene extension (version 2.0) with MediaWiki 1.16.4, PHP 5.2.6 and MySQL 5.0.51. The wiki's content is written in German, but unfortunately I also receive the following error message (presumably caused by encoding issues):

Notice: iconv_strlen() [function.iconv-strlen]: Detected an illegal character in input string in {Server-Path}/ZendFramework-1.11.6/library/Zend/Search/Lucene/Search/QueryLexer.php on line 342

Is there any known solution for this problem yet? Regards!

Agoerlt15:18, 17 May 2011
Edited by 0 users.
Last edit: 15:33, 17 May 2011

Hi,

could you please mention the extension version you are using and the full error message?

Steviex215:33, 17 May 2011

It's version 2.0 of the extension, and apart from "{Server-Path}" that is the full error message (see my post).

Agoerlt16:01, 17 May 2011
Edited by 2 users.
Last edit: 02:40, 18 May 2011

I'm not really sure, but I believe I read something about this while developing, maybe in connection with the Zend Framework version used or the iconv configuration in php.ini. Several German wikis use this extension in production, and I have never heard about the problem again (sorry ;-)). You could also google this issue, as I did while developing; I'm sure there is an answer for it. It would be nice if you left a note after fixing it.

c u

Steviex216:24, 17 May 2011
 
 
 
 
 

Search in Microsoft Office documents ( doc, docx, xls, xlsx etc. ) and pdf.

When will it be possible to search inside documents (all Office formats and PDF)? Any idea of the release date? An additional question: when we upload a new document or create a new article, is the index updated automatically, or do we need to launch a complete reindexing with a job?

84.37.20.4209:54, 8 March 2011

There is currently no timeline for implementing the new features listed as to-dos on the extension main page, as long as I don't receive further urgent commercial development assignments. I will certainly do it, but please consider that every single file type requires serious programming work in search engine land. Every search engine needs reindexing. It can be done incrementally or in full, and the implementation can vary. As mentioned, a common way is to trigger the indexer script from a cron job. Theoretically it could also happen after every edit (which may be a little crazy).

UPDATE: All these features will be implemented in the next upcoming release (see the announcements on the main page).

Steviex214:38, 8 March 2011
 