Extension talk:Zend Search Lucene for MediaWiki
I am maintaining a wiki on a local intranet. Could you tell me if Zend search will show the link to the wiki page which contains an indexed document or if it only links to the indexed document.
Our current search will find a phrase in 'aaa.pdf' but when you click the link, you can only open 'aaa.pdf', you cannot see where 'aaa.pdf' is linked on the wiki.
Thanks.
It should list the file link and the pages containing "aaa.pdf". Furthermore, it can find phrases inside a PDF document, provided you extend the code a little.
c u stevie
Can you elaborate or provide links to further information on how to extend the code to search inside pdf documents? This is the main reason I would like to use Zend Search Lucene for MediaWiki.
Thanks
Hi Rpsteiner,
Open the file PslZendSearchLuceneIndexer.php and go to line 530 ("...We will do this..."). This should be the place to index PDF content. Unfortunately there was no further sponsor to make this happen, but you could do it yourself if you are a coder. The keyword here is XPDF (the name of an external Linux library). The task should be easy, but would need some hours. I wish you good luck. Perhaps you can contribute the necessary code here; I would integrate it in a future release (with an exception for Windows users etc.).
c u stevie
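For anyone attempting this, here is a minimal sketch of the extraction step only, assuming the pdftotext command-line tool from Xpdf is installed and on the PATH of a Unix-like server. The function names are made up for illustration and are not part of the extension; where exactly the text would be hooked into the indexer near line 530 is not shown.

```php
<?php
/**
 * Build the shell command that asks pdftotext to write the text of a PDF
 * to standard output ("-" as the output file means stdout).
 * Hypothetical helper, not part of PslZendSearchLuceneIndexer.php.
 */
function buildPdfToTextCommand($pdfPath, $binary = 'pdftotext')
{
    return $binary . ' -enc UTF-8 ' . escapeshellarg($pdfPath) . ' -';
}

/**
 * Return the extracted text, or null when the file is missing or the
 * external tool produced no output (for example because it is not installed).
 */
function extractPdfText($pdfPath)
{
    if (!is_file($pdfPath)) {
        return null;
    }
    // Unix only: error output of the tool is discarded.
    $output = shell_exec(buildPdfToTextCommand($pdfPath) . ' 2>/dev/null');
    if (!is_string($output) || trim($output) === '') {
        return null;
    }
    return $output;
}
```

The returned string could then be added to the Lucene document as an additional field by the indexer. Windows would need a different command line, which matches the exception mentioned above.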
Hi all, I received the following error message: "1146: Table 'wikidb.mw_mw_pslpopularsearches' doesn't exist"
Does anyone have an idea?
It seems you have a problem with the DB table prefix. Could you please post the complete error message and the version of ZSL you are using?
Hello, I think there is a prefix problem in my wiki too (ZSL 2.0):
A database error has occurred. The cause may be a programming error.
The last database query was:
INSERT IGNORE INTO `wikiwikipslpopularsearches` (searchcon,results,success,triggertime,
user,ip,rawquery,score,namespace,pids,page,category,rating,pageurl,pagpage,sk)
VALUES ('body_and_title','55','1','2011-05-16 14:13:44',,,'Buch','4.41565438044',,',
840,864,1014,1541,319,1474,842,905,1156,1211,897,715,1548,366,811,421,349,305,1072,426,
480,307,896,846,514,634,306,1551,311,1221,1349,1443,415,371,927,407,327,1248,1402,340,
1530,889,890,360,554,309,145,1470,775,432,1391,1465,304,985,1469',,,,,,
'465e8da6fb92d8a7b7bb24687b89517b')
from within the function "PslZendSearchLuceneDbActions". The database returned the error "1146:
Table 'wiki.wikiwikipslpopularsearches' doesn't exist (localhost)".
Thank you.
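In both reports the table name carries the prefix twice (mw_mw_... and wikiwiki...), which suggests the prefix is applied once more than it should be somewhere between table creation and the query. The following is only a sketch of the general idea of applying a prefix exactly once; the function name is hypothetical and this is not the extension's actual code.

```php
<?php
/**
 * Return $table with $prefix in front of it, without prefixing twice.
 * Hypothetical helper for illustration. Caveat: a table whose real,
 * unprefixed name happens to start with the prefix would be misjudged.
 */
function prefixedTableName($prefix, $table)
{
    if ($prefix !== '' && strpos($table, $prefix) === 0) {
        return $table; // already carries the prefix
    }
    return $prefix . $table;
}

echo prefixedTableName('mw_', 'pslpopularsearches'), "\n";
echo prefixedTableName('mw_', 'mw_pslpopularsearches'), "\n";
```

Both calls print mw_pslpopularsearches, the name one would expect to find in a database that uses the prefix "mw_".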
Hello,
Suggestions are not working. Do you have any idea? In LocalSettings.php it is enabled.
It is probably an issue with your environment. Usually it works fine.
Do you have any idea what could be missing? Will I need a special PHP library or something like that?
Did you know that suggestions can be configured per user? First, you should play with these settings and test every mode. As far as I know, no additional PHP modules are necessary. Could you please tell me your MW version?
Ah, I didn't configure it for my user, sorry. Then my problem is 'fixed'. Thanks for your help.
In the documentation you list unlimited wiki instances as a feature. Can I run a search across all indexes and return results for all the wikis?
This could be done by modifying the source code. "Unlimited instances" primarily relates to the indexer, meaning you have a server with several MediaWiki installations and wish to index them all with a single nightly cron job.
Could you possibly elaborate on that? I have a Wiki family (single server running multiple wikis, each has a separate database) and would like to be able to search across them.
This is not a standard feature of ZSL and would require several days of investigation and development. Unfortunately I can't provide this on a non-profit basis. It could also turn out to be a performance problem and may fail; we currently have no experience with it. If none of this matters, you could request a quote on www.wiki-service.biz. You could also check an alternative enterprise search engine like Solr.
c u stevie
Hello,
I installed the extension yesterday and tried to get it working, but I always get the same error message when running the indexer:
2011-11-09T18:29:52+01:00 INFO (6): LuceneIndexer Error-Message! ERROR dumpXML() "C:\Program Files\EasyPHP 3.0\php\php.exe" -c "C:\Program Files\EasyPHP 3.0\conf_files\php.ini" D:/Wikis/mediawiki-1.17.0/maintenance/dumpBackup.php --current --quiet --uploads > D:/Wikis/search-engine/psl_sources/af_current.xml---Status -> 1
However, when I copy and paste the command above and run it myself in the same cmd.exe where I ran the indexer (so normally under the same conditions), it works fine and my af_current.xml is correctly filled with the dump.
Any idea?
(I'm on MediaWiki 1.17, but I don't think this is related to the issues mentioned above.)
SOLVED: The problem was that the directories were owned by the user apache instead of nobody, which was actually the web server user.
Hello everyone,
I've installed ZF and the Zend Search Lucene for MediaWiki extension.
The indexing process runs smoothly and without errors.
The problem is that I get a blank page as soon as I hit the search button: no error message and no hint about what's wrong.
Here's my config:
Webroot is /opt/lampp/htdocs (+/mediawiki as symlink to mw installation path /opt/lampp/htdocs/mediawiki_1.17.0)
ZendF resides in /opt/lampp/zend/
The relevant parts of the config are:
Index Config
$GuiFlag = 0;
$wikisArray[0]['xmlSource'] = "/opt/lampp/zend/sources/internal_current.xml";
$wikisArray[0]['indexName'] = "wikidb_internal";
$wikisArray[0]['maintenanceScript'] = "/opt/lampp/htdocs/mediawiki/maintenance/dumpBackup.php";
$wikisArray[0]['mediaDir'] = "/opt/lampp/htdocs/mediawiki/images";// maybe httpdocs/images/ if img_auth.php not in use
#$wikisArray[1]['xmlSource'] = "/opt/lampp/zend/sources/sysdoc_current.xml";
#$wikisArray[1]['indexName'] = "wikidb_sysdoc";
#$wikisArray[1]['maintenanceScript'] = "/opt/lampp/htdocs/mediawiki/maintenance/dumpBackup.php";
#$wikisArray[1]['mediaDir'] = "/opt/lampp/htdocs/mediawiki/images/";// maybe httpdocs/images/ if img_auth.php not in use
#[...]
endif;
@preg_match_all("/(Windows)(.*?)/", $_SERVER['OS'], $matched, PREG_SET_ORDER);
/* an index dir above web root */
$indexDirName = "psl_search_indexes";
$PhpExecutionStringUnix = "/opt/lampp/bin/php -c /opt/lampp/etc/php.ini";
$PhpExecutionStringWindows = "c:\\xampp\\php\\php.exe ";
$email = "info@my.reporting-mailadress.ork";
/* file formats which will be indexed */
$additionalFileFormatsArray = array('pdf','docx','xlsx','pptx','sql','vnd','txt','xml','xmlx','csv');
if(count($matched) > 0 ):
/* modify this to fit your needs, if you are on windows */
$webServerUser = "";
$webServerUserGroup = "";
$zendFrameworkLibraryPath = "C:\\xampp\\htdocs/ZF/library";
$zendLogPath = "C:\\xampp\\htdocs\\".$indexDirName."\\";
$applicationPath = "C:\\xampp\\htdocs";
else:
/* modify this to fit your needs, if you are on unix */
$webServerUser = "apache";
$webServerUserGroup = "apache";
$zendFrameworkLibraryPath = "/opt/lampp/zend/library";
$zendLogPath = "/opt/lampp/zend/".$indexDirName."/";
$applicationPath = "/opt/lampp/zend/";
endif;
Local-settings from Mediawiki
/* Configuration Zend Search Lucene for MediaWiki - Start */
$PslDomainDir = "internal";
$PslPhpExecutionStringUnix = "/opt/lampp/bin/php -c /opt/lampp/etc/php.ini ";
$PslMaintenancePath = "/opt/lampp/htdocs/mediawiki/maintenance/";
$PslXmlPath = "/opt/lampp/zend/sources/".$PslDomainDir."_current.xml";
$wgPslZslAdminUseAutoReIndex = false;
$wgPslZslAdminDefaultEmail = "<your email address>";
$wgPslZslAdminDumpString = $PslPhpExecutionStringUnix.$PslMaintenancePath."dumpBackup.php --current --quiet --uploads > ".$PslXmlPath;
$wgPslZslAdminMediaDir = "/opt/lampp/htdocs/mediawiki/images/";
$wgPslZslAdminReIndexString = $PslPhpExecutionStringUnix."/opt/lampp/zend/PslZendSearchLuceneIndexer.php ".$PslXmlPath." wikidb_".$PslDomainDir." ". $PslMaintenancePath."dumpBackup.php";
require_once( "$IP/extensions/PslZslAdmin/PslZslAdmin.php");
$wgSearchType = 'PslZendSearchLucene';
$wgPslEnableSuggestions = true;//enables suggestions
$wgPslEnableStopWords = false;//enables stop words
$wgPslStopWords = array('aber','als','am','an');
$wgPslImagePath = "http://172.23.101.63/mediawiki/extensions/PslZendSearchLucene/";
$wgPslWikiUrl = "http://172.23.101.63/mediawiki-1.17.0/index.php/";
$wgPslEntriesPerPage = 20;
$wgPslUtf8DecodeResults = false;//utf8-hint for related display issues, (play around with this if needed)
$wgPslIndexDir = "/opt/lampp/zend/psl_search_indexes/wikidb_".$PslDomainDir;
$wgPslZendLibraryDir = "/opt/lampp/zend/library/";
$wgPslEnablePopularSearches = true;//requires table-create rights for MediaWikis db-account
$wgPslPopularSearchesHistory = 365;//data remains 365 days
$wgPslProtectPopularSearches = false;//
$wgPslHighlightColor = "#ff6900";
$wgPslEnabaleDebugMode = false;//debug mode
$wgPslEnableSuggestions = true;//enables suggestions
$wgPslEnableFileSearch = true;//enables file search
$wgPslEnablePsIpTracking = false;//enables ip tracking for geo lacation services etc. (currently not implemented)
$wgPslEnableAnonKey = true;//anonymous key for science
$wgPslHistoryEntries = 30;//history entries per page
$wgPslHistoryMiniStat = true;
$wgHiddenPrefs[] = 'searchlimit';//this entries are disabling no more needed user preferences of the old/default search
$wgHiddenPrefs[] = 'contextlines';
$wgHiddenPrefs[] = 'contextchars';
$wgHiddenPrefs[] = 'disablesuggest';
$wgHiddenPrefs[] = 'searcheverything';
$wgHiddenPrefs[] = 'searchnamespaces';
$wgPslEnableUserInHistory = false;//enhanced knowledge management feature could tackle your country specific law!
require_once( "$IP/extensions/PslZendSearchLucene/PslZendSearchLucene.php");
/* Configuration Zend Search Lucene for MediaWiki - End */
Does anyone have an idea? I have already tried several combinations of paths, but none of them worked.
Greetings, and thanks in advance...
F
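A blank page in MediaWiki usually means a PHP fatal error is being hidden. As a first debugging step (not a fix), something like the following could be added temporarily near the top of LocalSettings.php, assuming a test system where showing errors to visitors is acceptable:

```php
<?php
// Temporary debugging aid for LocalSettings.php; remove it again afterwards.
error_reporting(E_ALL);          // report every kind of PHP error
ini_set('display_errors', '1');  // print errors instead of a blank page

// MediaWiki setting that adds details to uncaught exceptions.
$wgShowExceptionDetails = true;
```

With this in place the search page should show the actual error (for example a wrong include path for the Zend library) instead of nothing.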
I finally got Zend Lucene to work on MediaWiki 1.17.0, but with virtually no documentation that could help.
MediaWiki 1.17.0
PHP 5.3.6 (apache2handler)
MySQL 5.2.8-MariaDB-log
Step 1 - Install / Download Zend Framework: "Download Zend Framework. Unpack and copy the contents of the download file to a webserver folder (commonly not below web root). Zend Framework install is done!" You're NOT exactly done. Here, I had to create a directory called 'zend' above the web root (i.e. /var/www/zend), where I extracted the contents of the tarball.
Step 2 - Configure Zend Search Lucene for MediaWiki: "Download and extract the extensions PslZslAdmin and PslZendSearchLucene to your Wiki(s) extension directory. Move the files PslZendSearchLuceneIndexer.php and PslZendSearchLuceneIndexerConfig.php to a server directory above web root. Edit the marked parts of the file PslZendSearchLuceneIndexerConfig.php as described in it."
This is where it gets tricky. The config file (PslZendSearchLuceneIndexerConfig.php) unfortunately lacks the comments an admin needs to configure Zend intuitively in a reasonable amount of time. Here are some suggestions I would recommend to any admin out there wanting to install Lucene:
1.) In my case, I didn't have an XML repository for my DB dumps, so I created a 'sources' folder to house the XML dumps, at <full path of Zend installation>/sources
2.) Here's how I would've labeled the parameters in the config file:
Instead of:
$wikisArray[0]['xmlSource'] = "/var/www/vhosts/indi.sexyserver4you.de/subdomains/internal/internal_current.xml";
I would've put:
$wikisArray[0]['xmlSource'] = "<full path of xml dumps>/internal_current.xml";
In my case, again, I created a directory specifically designated for these dumps, which was /var/www/zend/sources
Instead of:
$wikisArray[0]['maintenanceScript'] = "/var/www/vhosts/indi.sexyserver4you.de/subdomains/internal/httpdocs/wiki/maintenance/dumpBackup.php";
$wikisArray[0]['mediaDir'] = "/var/www/vhosts/indi.sexyserver4you.de/subdomains/internal/public/";// maybe httpdocs/images/ if img_auth.php not in use
I would've put:
$wikisArray[0]['maintenanceScript'] = "<full path of mediawiki installation>/maintenance/dumpBackup.php";
$wikisArray[0]['mediaDir'] = "<full path of mediawiki installation>/images/";
In my case, <full path of mediawiki installation> = /var/www/html/wiki/mediawiki-1.17.0/
Instead of:
$wikisArray[1]['xmlSource'] = "/var/www/vhosts/indi.sexyserver4you.de/subdomains/sysdoc/sysdoc_current.xml";
I would've put:
$wikisArray[1]['xmlSource'] = "<full path of xml dumps>/sysdoc_current.xml";
In my case, again, I created a directory specifically designated for these dumps, which was /var/www/zend/sources
Instead of:
$wikisArray[1]['maintenanceScript'] = "/var/www/vhosts/indi.sexyserver4you.de/subdomains/sysdoc/httpdocs/maintenance/dumpBackup.php";
$wikisArray[1]['mediaDir'] = "/var/www/vhosts/indi.sexyserver4you.de/subdomains/sysdoc/public/";// maybe httpdocs/images/ if img_auth.php not in use
I would've put:
$wikisArray[1]['maintenanceScript'] = "<full path of mediawiki installation>/maintenance/dumpBackup.php";
$wikisArray[1]['mediaDir'] = "<full path of mediawiki installation>/images/";
Instead of:
$PhpExecutionStringUnix = "/usr/bin/php -c /etc/php5/cli/php.ini";
I would've put:
$PhpExecutionStringUnix = "/usr/bin/php -c /<location of php.ini file>/php.ini";
In my case <location of php.ini file> = /etc/
Instead of:
$webServerUser = "www-data";
$webServerUserGroup = "psaserv";
$zendFrameworkLibraryPath = "/PSL_ADD_ONS/ZF/library";
$zendLogPath = "/PSL_ADD_ONS/".$indexDirName."/";
$applicationPath = "/PSL_ADD_ONS";
I would've put:
$webServerUser = "<your web server user>";
$webServerUserGroup = "<your web server group>";
$zendFrameworkLibraryPath = "/<installation path of Zend>/ZendFramework-1.11.10/library";
$zendLogPath = "/<installation path of Zend>/".$indexDirName."/";
$applicationPath = "/<installation path of Zend>/";
3.) Here's how I would've labeled the config parameters for LocalSettings.php
/* Configuration Zend Search Lucene for MediaWiki - Start */
$PslDomainDir = "sysdoc";
$PslPhpExecutionStringUnix = "/usr/bin/php -c /<full path to php.ini file>/php.ini ";
$PslMaintenancePath = "/<full path to mediawiki installation>/maintenance/";
$PslXmlPath = "/<full path to xml dumps or sources>/".$PslDomainDir."_current.xml";
$wgPslZslAdminUseAutoReIndex = false;
$wgPslZslAdminDefaultEmail = "<your email address>";
$wgPslZslAdminDumpString = $PslPhpExecutionStringUnix.$PslMaintenancePath."dumpBackup.php --current --quiet --uploads > ".$PslXmlPath;
$wgPslZslAdminMediaDir = "<full path of mediawiki installation or directory where you store uploaded docs>/images/";
$wgPslZslAdminReIndexString = $PslPhpExecutionStringUnix."/<full path to Zend Installation>/PslZendSearchLuceneIndexer.php ".$PslXmlPath." wikidb_".$PslDomainDir." ". $PslMaintenancePath."dumpBackup.php";
require_once( "$IP/extensions/PslZslAdmin/PslZslAdmin.php");
$wgSearchType = 'PslZendSearchLucene';
$wgPslEnableSuggestions = true;//enables suggestions
$wgPslEnableStopWords = false;//enables stop words
$wgPslStopWords = array('aber','als','am','an');
$wgPslImagePath = "http://<wiki domain or ip address of wiki server>/extensions/PslZendSearchLucene/";
$wgPslWikiUrl = "http://<wiki url>/mediawiki-1.17.0/index.php/";
$wgPslEntriesPerPage = 20;
$wgPslUtf8DecodeResults = false;//utf8-hint for related display issues, (play around with this if needed)
$wgPslIndexDir = "/<full path to Zend Installation>/psl_search_indexes/wikidb_".$PslDomainDir;
$wgPslZendLibraryDir = "/<full path to Zend Installation>/ZendFramework-1.11.10/library/";
$wgPslEnablePopularSearches = true;//requires table-create rights for MediaWikis db-account
$wgPslPopularSearchesHistory = 365;//data remains 365 days
$wgPslProtectPopularSearches = false;//
$wgPslHighlightColor = "#ff6900";
$wgPslEnabaleDebugMode = false;//debug mode
$wgPslEnableSuggestions = true;//enables suggestions
$wgPslEnableFileSearch = true;//enables file search
$wgPslEnablePsIpTracking = false;//enables ip tracking for geo lacation services etc. (currently not implemented)
$wgPslEnableAnonKey = true;//anonymous key for science
$wgPslHistoryEntries = 30;//history entries per page
$wgPslHistoryMiniStat = true;
$wgHiddenPrefs[] = 'searchlimit';//this entries are disabling no more needed user preferences of the old/default search
$wgHiddenPrefs[] = 'contextlines';
$wgHiddenPrefs[] = 'contextchars';
$wgHiddenPrefs[] = 'disablesuggest';
$wgHiddenPrefs[] = 'searcheverything';
$wgHiddenPrefs[] = 'searchnamespaces';
$wgPslEnableUserInHistory = false;//enhanced knowledge management feature could tackle your country specific law!
require_once( "$IP/extensions/PslZendSearchLucene/PslZendSearchLucene.php");
/* Configuration Zend Search Lucene for MediaWiki - End */
4.) Finally, make sure you check your permissions! Your XML dump/source directory must be owned by the web server user and belong to the web server group. In my case, apache:apache. A simple chown -R apache:apache . in the Zend installation directory should do the trick.
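To double-check point 4 before running the indexer, a small script along these lines could be run as the web server user. It is only a sketch: the function name is made up, and the two paths are examples based on my layout (the index directory name is the $indexDirName default from the config).

```php
<?php
/**
 * Return those entries of $dirs that are not existing, writable directories
 * for the user running this script. Hypothetical helper for illustration.
 */
function findUnwritableDirs(array $dirs)
{
    $problems = array();
    foreach ($dirs as $dir) {
        if (!is_dir($dir) || !is_writable($dir)) {
            $problems[] = $dir;
        }
    }
    return $problems;
}

// Example paths; adjust them to your own installation.
$dirsToCheck = array('/var/www/zend/sources', '/var/www/zend/psl_search_indexes');
foreach (findUnwritableDirs($dirsToCheck) as $dir) {
    echo "Missing or not writable: $dir\n";
}
```

The result depends on which user runs the script, so running it as root or as your own login will not tell you what the web server user is allowed to do.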
Hope this helps!
When I try to open "Special Pages" I get an error:
Fatal error: Call to undefined method User::getOptions() in /users/lenjo/www/mediawiki-1.15.1/extensions/PslZendSearchLucene/PslZendSearchLucene_body.php on line 116
Where is the method defined? How can I fix that?
Edit: I found the class where getOptions is defined. But that does not solve the problem...
You may have the wrong MW-version.
The PslZendSearchLuceneIndexerConfig.php file comes with "<?" at the start of the file, should probably be "<?php" as not all of us have short_open_tag enabled.
Invalid argument supplied for foreach() in C:\xampp\htdocs\PslZendSearchLuceneIndexer.php on line 421
What do I need to fix when indexing produces this result? Thanks...
PHP Warning: Invalid argument supplied for foreach() in C:\xampp\htdocs\PslZendSearchLuceneIndexer.php on line 421
Warning: Invalid argument supplied for foreach() in C:\xampp\htdocs\PslZendSearchLuceneIndexer.php on line 421
PHP Warning: Invalid argument supplied for foreach() in C:\xampp\htdocs\PslZendSearchLuceneIndexer.php on line 430
Warning: Invalid argument supplied for foreach() in C:\xampp\htdocs\PslZendSearchLuceneIndexer.php on line 430
Hi, it seems $domArr['mediawiki']['page'] is empty. This could mean you have no XML data to parse: either your MediaWiki data extraction fails, or your wiki has no pages.
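The warning itself only says that something other than an array reached foreach(). A defensive guard of the following kind would turn it into a clear "nothing to index" case. This is a sketch with a hypothetical helper name, assuming the array layout named above; it is not the code at lines 421 and 430 of the indexer.

```php
<?php
/**
 * Return the list of pages from the parsed dump, or an empty array when
 * the expected structure is missing. Hypothetical helper for illustration.
 */
function pagesFromDump($domArr)
{
    if (!is_array($domArr)
        || !isset($domArr['mediawiki']['page'])
        || !is_array($domArr['mediawiki']['page'])
    ) {
        return array();
    }
    return $domArr['mediawiki']['page'];
}

$domArr = null; // what the indexer would hold after a failed parse
$pages = pagesFromDump($domArr);
if (count($pages) === 0) {
    echo "No pages found in the XML dump, check the dump file.\n";
}
foreach ($pages as $page) {
    // index $page here
}
```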
Thank you so much for your reply. But the database dump was successful and the internal_current.xml file was created successfully too. All other settings follow the installation guide almost exactly.
As I understand it, the indexer is independent of LocalSettings.php, so it shouldn't matter if I have some misconfiguration there.
Yes, the indexer is independent of LocalSettings.php.
Thanks for the reply. I found that my wiki was dumping an XML file with a disallowed character at the beginning; after trimming it, indexing was OK.
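For others who hit the same thing, the trimming can be scripted. The sketch below assumes that the unwanted characters come before the first "<" of the dump; the function name is made up, and the whole file is read into memory, which may be too much for very large dumps.

```php
<?php
/**
 * Drop everything before the first "<" so the XML starts cleanly.
 * Returns an empty string if the input contains no "<" at all.
 * Hypothetical helper for illustration.
 */
function stripLeadingJunk($xml)
{
    $start = strpos($xml, '<');
    if ($start === false) {
        return '';
    }
    return substr($xml, $start);
}

// Example: a byte order mark and a blank line in front of the root element.
$dirty = "\xEF\xBB\xBF\n<mediawiki></mediawiki>";
echo stripLeadingJunk($dirty), "\n";
```

On a real dump this could be applied with file_get_contents() and file_put_contents() on the configured xmlSource file before the indexer runs.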
I have a multilingual SMW wiki in English and Japanese in a single database, and when searching in Japanese I get this error in the search results:
Warning: preg_replace() [function.preg-replace]: Compilation failed: unrecognized character after (? or (?- at offset 2 in /my/path/extensions/PslZendSearchLucene/PslZendSearchLucene_body.php on line 1525
Anyway, the Zend extension finds the correct page, but:
- "Text" is not displayed under the search results for Japanese
- foreign UTF8 characters (like other languages in English text) are displayed as ?
- words inside a Japanese sentence are not indexed (as there are no spaces between words in Japanese)
$wgPslUtf8DecodeResults just turns Japanese page names into ????
Regarding the standard MediaWiki search:
- same result as mentioned above
- foreign UTF8 characters are displayed correctly in the normal MW search...
- words inside a Japanese sentence are not indexed (the default MW search probably cannot deal with this)
I don't know which of these are caused by my misconfiguration and which are real problems; I just wanted to share the overall results of testing by a normal MediaWiki user (not a PHP expert).
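The preg_replace() warning would be consistent with the search term being placed into a regular expression unescaped: a term that has been decoded to "????" and wrapped in parentheses gives "(????)", and "(?" followed by "?" fails at exactly offset 2. That is a guess, since line 1525 is not shown here. A sketch of escaping the term before highlighting, with a hypothetical function name, could look like this:

```php
<?php
/**
 * Wrap every occurrence of $term in $text in a highlighting span.
 * The term is escaped with preg_quote(), so characters such as "?" or "("
 * are matched literally. Hypothetical helper for illustration.
 */
function highlightTerm($text, $term, $color = '#ff6900')
{
    if ($term === '') {
        return $text;
    }
    // "u" treats pattern and subject as UTF-8, "i" ignores case.
    $pattern = '/(' . preg_quote($term, '/') . ')/iu';
    $replacement = '<span style="background:' . $color . '">$1</span>';
    $result = preg_replace($pattern, $replacement, $text);
    // preg_replace() returns null on errors such as invalid UTF-8 input.
    return $result === null ? $text : $result;
}

echo highlightTerm('日本語のテキスト', '日本語'), "\n";
echo highlightTerm('what ???? now', '????'), "\n";
```

This only avoids the warning. It does not address the missing word segmentation for Japanese or the "?" display problem.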
Hi Satori,
I will try to reply from a developer's point of view, for you and for later visitors. As far as I know, Semantic MediaWiki (SMW) is a completely different piece of software, or a drastically modified MediaWiki. We never tested ZSL for MediaWiki against this branch. There is another comment describing problems with the Japanese language, so we might say ZSL is currently not ready for Japan ;-). But we have seen many downloads from countries all over the world (without any bug reports) and use it with UTF-8 in German. So we could say it's a stable ZSL release according to the requirements and test scenarios mentioned on the main page.
In the following configuration of PslZendSearchLuceneIndexerConfig.php, how do you establish or figure out the path to the xmlSource file? Do you need to create the XML file yourself? Should it be found somewhere? I'm trying to get this installed on Windows and I'm not having the best of luck.
$wikisArray[0]['xmlSource'] = "D:\xampp\htdocs\mediawiki\internal_current.xml";
$wikisArray[0]['indexName'] = "TESTWiki";
$wikisArray[0]['maintenanceScript'] = "D:\xampp\htdocs\mediawiki\maintenance\dumpBackup.php";
$wikisArray[0]['mediaDir'] = "D:\xampp\htdocs\mediawiki\images";// maybe httpdocs/images/
Hi,
could you explain to me how the paths within PslZendSearchLuceneIndexerConfig.php must be configured to get the whole thing working? I made the following configuration:
$GuiFlag = 0;
$wikisArray[0]['xmlSource'] = "D:\xampp\htdocs\mediawiki\internal_current.xml";
$wikisArray[0]['indexName'] = "TESTWiki";
$wikisArray[0]['maintenanceScript'] = "D:\xampp\htdocs\mediawiki\maintenance\dumpBackup.php";
$wikisArray[0]['mediaDir'] = "D:\xampp\htdocs\mediawiki\images";// maybe httpdocs/images/ if img_auth.php not in use
I get the following error when I try to run the indexer:
C:\>D:\xampp\php\php.exe -f D:\xampp\htdocs\ZendFramework\PslZendSearchLuceneIndexer.php
PHP Warning: require_once(Zend/Search/Lucene.php): failed to open stream: No such file or directory in D:\xampp\htdocs\ZendFramework\PslZendSearchLuceneIndexer.php on line 140
Warning: require_once(Zend/Search/Lucene.php): failed to open stream: No such file or directory in D:\xampp\htdocs\ZendFramework\PslZendSearchLuceneIndexer.php on line 140
PHP Fatal error: require_once(): Failed opening required 'Zend/Search/Lucene.php' (include_path='d: mpp\htdocs\ZendFramework\library') in D:\xampp\htdocs\ZendFramework\PslZendSearchLuceneIndexer.php on line 140
Fatal error: require_once(): Failed opening required 'Zend/Search/Lucene.php' (include_path='d: mpp\htdocs\ZendFramework\library') in D:\xampp\htdocs\ZendFramework\PslZendSearchLuceneIndexer.php on line 140
C:\>
Where is my mistake?
Thank you
Philipp
Found my mistake:
$webServerUser = "";
$webServerUserGroup = "";
$zendFrameworkLibraryPath = "d:\\xampp\\htdocs\\ZendFramework\\library";
$zendLogPath = "d:\xampp\htdocs\\".$indexDirName."\\";
$applicationPath = "d:\xampp\htdocs";
The path in $zendFrameworkLibraryPath = "d:\\xampp\\htdocs\\ZendFramework\\library"; was wrong. I had to use double backslashes.
I fixed it, but I still get error messages in my command-line window.
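The reason behind this: inside double quotes PHP treats "\x" followed by hex digits as an escape sequence, so "\xa" in "d:\xampp" becomes a single line feed character. That matches the mangled include_path 'd: mpp\htdocs...' in the error output. The $zendLogPath and $applicationPath lines posted above still contain a single backslash before "xampp", which may be why errors remained. A small demonstration:

```php
<?php
// Double quotes: "\xa" is read as the byte 0x0A (line feed), so the path breaks.
$broken = "d:\xampp\htdocs";

// Three ways to get the intended path.
$escaped = "d:\\xampp\\htdocs";  // double quotes with doubled backslashes
$single  = 'd:\xampp\htdocs';    // single quotes: no \x escapes
$forward = 'd:/xampp/htdocs';    // forward slashes, which PHP accepts on Windows

var_dump($broken === "d:\nmpp\\htdocs"); // bool(true)
var_dump($escaped === $single);          // bool(true)
```

In single quotes a backslash at the very end of the string still has to be doubled, for example 'd:\xampp\htdocs\\'.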
We are using English, Chinese and Japanese page content. While the standard MW search and display work correctly, Zend Search produces an error message and does not display the results in the correct character encoding.
Warning: Cannot modify header information - headers already sent by (output started at ...extensions\PslZendSearchLucene\PslZendSearchLucene_body.php:452)
Hi MWJames, it currently supports UTF-8 with English and German (successfully tested, see main page).
@Steviex2: I am testing the PslZendSearchLucene extension (version 2.0) with MediaWiki 1.16.4, PHP 5.2.6 and MySQL 5.0.51. The wiki's content is written in German, but unfortunately I also receive the following error message (presumably caused by encoding issues):
Notice: iconv_strlen() [function.iconv-strlen]: Detected an illegal character in input string in {Server-Path}/ZendFramework-1.11.6/library/Zend/Search/Lucene/Search/QueryLexer.php on line 342
Is there any known solution for this problem yet? Grüße!
Hi,
could you please mention the extension version you are using and the full error message?
It's version 2.0 of the extension, and except for "{Server-Path}" that is the full error message (compare my post).
I'm not really sure, but I believe I read something about it while developing, maybe in conjunction with the Zend Framework version used or the iconv configuration in php.ini. There are several German wikis using this extension in production, and I never heard about it again (sorry ;-)). You could also google this issue, as I did while developing. I'm sure there is an answer for this. It would be nice if you left a note after fixing it.
c u
When will it be possible to search inside documents (all Office formats and PDF)? Any idea of the release date? Additional question: when we upload a new document or create a new article, is the index updated automatically, or do we need to launch a complete reindexing with a job?
There is currently no timeline for implementing the new features listed as to-dos on the extension main page, as long as I don't receive further urgent commercial development assignments. I will do this for sure, but please consider that every single file extension requires serious programming work in search engine land. Reindexing is needed by every search engine. You can do it incrementally or in full, and the implementation can vary. As mentioned, a common way is to trigger the indexer script with a cron job. Theoretically it could also happen after every edit (which may be a little bit crazy).
UPDATE: All of these features are implemented in the next upcoming release (see main page announcements).
As of now, PslZendSearchLuceneIndexer.php always initiates a full index update, which costs immense system resources and takes a long time to finish. Is there a way to give PslZendSearchLuceneIndexer.php an incremental update mode, so that incremental updates can be scheduled on a regular basis and full updates run only on special occasions?
I noticed in my environment that an incremental update takes more time than a full update; however, you can set "private $incrementUpdate" to true.
An update is coming with an easy-to-edit config file for the Lucene indexer. It also provides an admin UI for manual reindexing in full and incremental mode, and a config variable called $wgPslZslAdminUseAutoReIndex, which will trigger reindexing on article save events. Count on it, coming within the next few days :-).
I am new to MediaWiki. I use MediaWiki on my computer (Windows operating system; PHP is in the directory C:\xampp\, the wiki is in D:\www\htdocs\, and SQL is in D:\www\mysql\). It works very well, and I want to find a search engine. I think this extension is very good. During installation, I had the following questions:
1 In the first step, I downloaded the Zend Framework. I put the contents in \htdocs\psl-suche\. Is that right? Are there any paths to be added or changed?
2 In the second step, I moved the file PslZendSearchLuceneIndexer.php to \htdocs\psl-suche\. How can I "Edit the marked parts of this file as described in it."?
Thanks a lot!
Hi Simonlsw,
1. You can put the Zend Framework anywhere you want, as long as you point to it in LocalSettings.php and the directory is accessible by PHP. For a first success "\htdocs\psl-suche\" is a good idea, but keep in mind you can also have it above the web root or under a directory name other than "psl-suche".
2. Open the file "PslZendSearchLuceneIndexer.php" in your preferred IDE or a simple text editor and follow the instructions. Remember to run this file to produce a searchable Lucene index.
We noticed that using $wgSearchType = 'PslZendSearchLucene' as the standard search comes at the cost of a large performance drop compared with the standard MW search (same search term: under 2 sec. with MW search, over 1 min. with Zend Search), but we would consider Zend Search as an additional search option. We found that the special page can be used with a URL API option:
{{fullurl:Special:PslZendSearchLucene|query= " search term" &PslSearchMode=2}}
Is there a possibility to have an option so that redirects are not shown in the result display, similar to the search options (PslSearchMode=1 or 2 or 3)?
I wrote this plugin mainly for a customer. There we have a server with many different MediaWiki instances. Every wiki has ~4000 Lucene documents. So far we have noticed no performance issues. And yes, there is always a possibility to add more options; it's OOP and open source :-). I have some other to-dos first (see main page).
Under Windows we had to write directory paths with a double backslash "\\"; otherwise an error message would appear.
$wikisArray[0]['maintenanceScript'] = "...\\maintenance\\dumpBackup.php"; ...
private $PhpExecutionStringWindows = "...\\php\\php.exe";