Extension:Zend Search Lucene for MediaWiki

Description
Zend Search Lucene for MediaWiki adds better search capabilities to MediaWiki, supports a long list of external file formats and aims to bring enhanced knowledge management features to your wiki application. The engine is mainly designed for science and business collaborations. Unlike many other extended search solutions, it depends on nothing other than PHP and an up-to-date, properly configured web server. This means:

- NO Java, Python etc.
- NO additional compilation
- NO trouble with never-ending dependency issues
- NO daemons

The plugin is based on the native Lucene port of the Zend Framework. It therefore provides the same enterprise-level search features as the well-known Java-based original, Apache Lucene. Zend Search Lucene for MediaWiki can be tested and compared with the built-in search on a special page before you make it the complete replacement search engine.

Outstanding Features
- word search
- phrase search
- wildcard search
- fuzzy search
- suggest
- handles an unlimited number of wiki instances/indexes
- runs on Unix-like and Windows-based systems
- supports restricted wikis
- includes a long list of searchable file formats
- enhanced knowledge management features
- ready for customizing, extendable OOP
- commercial support if needed
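
The word, phrase, wildcard and fuzzy searches listed above correspond to the query syntax of Zend_Search_Lucene, the library the extension builds on. The following is a minimal sketch, assuming Zend Framework 1.11 is on the PHP include path and an index already exists. The search terms are made up for illustration, the index path is the default from the Options section, and the extension's own search page may preprocess queries differently.

```php
<?php
// Example query strings in Zend_Search_Lucene query syntax.
// The search terms are invented for illustration only.
$queries = array(
    'word'     => 'lucene',
    'phrase'   => '"knowledge management"', // exact phrase in double quotes
    'wildcard' => 'medi*',                  // terms starting with "medi"
    'fuzzy'    => 'mediawiki~',             // terms similar to "mediawiki"
);

// Default index directory from the Options section; adjust to your setup.
$indexDir = '/PSL_ADD_ONS/psl_search_indexes/wikidb_internal';

// Query the index only if both the library and the index are available.
if (is_dir($indexDir) && stream_resolve_include_path('Zend/Search/Lucene.php')) {
    require_once 'Zend/Search/Lucene.php';
    Zend_Search_Lucene_Search_QueryParser::setDefaultEncoding('utf-8');

    $index = Zend_Search_Lucene::open($indexDir);
    foreach ($queries as $type => $query) {
        $hits = $index->find($query);
        echo $type, ': ', count($hits), " hit(s)\n";
    }
}
```

Running the same four queries against the built-in search and against this index is a quick way to see which search types each engine handles.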

Requirements
- PHP >= 5.2.3 with mbstring enabled (default)
- PHP Zend Framework 1.11
- Cron (recommended)

Compatibility
This extension has been successfully checked against the following MediaWiki versions. Others may work too.
 * 1.16 - works - (steviex2)

The extension has been shown to work with the following languages and supports UTF-8.
 * English - works (help page currently German only) - all versions - (steviex2)
 * German - works - all versions - (steviex2)

Step 1 - Install / Download Zend Framework
Download Zend Framework, unpack it and copy the contents of the archive to a folder on your web server. This completes the Zend Framework installation.

Step 2 - Configure Zend Search Lucene for MediaWiki
Download the extension and extract it to the extensions directory of your wiki(s). Move the file PslZendSearchLuceneIndexer.php to a server directory above the web root. Edit the marked parts of this file as described in the file itself.

Step 3 - Run PslZendSearchLuceneIndexer.php
Run the indexer from the command line to build the search index before the first search. The script can be called on Unix and on Windows systems. Make sure to replace the paths in the call to match your installation! Note: You can automate the PslZendSearchLuceneIndexer.php call with a cron job (Unix) or a scheduled task (Windows).
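
To give an idea of what building a Lucene index involves, here is a minimal sketch using the Zend_Search_Lucene API from Zend Framework 1.11. It is not the code of PslZendSearchLuceneIndexer.php: the helper function, the page array, the field names and the temporary index directory are all assumptions made for this example.

```php
<?php
// Hypothetical helper: maps a wiki page (assumed array shape) to index fields.
function pslExamplePageToFields(array $page)
{
    return array(
        'title' => trim($page['title']),
        'url'   => $page['url'],
        'body'  => strip_tags($page['text']), // index plain text only
    );
}

// Invented sample page for illustration.
$page = array(
    'title' => ' Main Page ',
    'url'   => '/index.php?title=Main_Page',
    'text'  => '<p>Welcome to the wiki.</p>',
);
$fields = pslExamplePageToFields($page);

// Scratch location for the example index (an assumption, not the real index).
$indexDir = sys_get_temp_dir() . '/psl_example_index';

// Build the index only if Zend Framework is on the include path.
if (stream_resolve_include_path('Zend/Search/Lucene.php')) {
    require_once 'Zend/Search/Lucene.php';
    $index = Zend_Search_Lucene::create($indexDir);

    $doc = new Zend_Search_Lucene_Document();
    // text(): tokenized, indexed and stored, so it can be shown in results
    $doc->addField(Zend_Search_Lucene_Field::text('title', $fields['title'], 'utf-8'));
    // unIndexed(): stored for display, but not searchable
    $doc->addField(Zend_Search_Lucene_Field::unIndexed('url', $fields['url'], 'utf-8'));
    // unStored(): tokenized and indexed, but not stored in the index
    $doc->addField(Zend_Search_Lucene_Field::unStored('body', $fields['body'], 'utf-8'));

    $index->addDocument($doc);
    $index->commit();   // write pending changes to disk
    $index->optimize(); // merge index segments for faster searching
}
```

Storing only the title and URL keeps the index small, while the unstored body remains fully searchable.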

Step 4 - Extension Installation/Configuration - LocalSettings.php
Add the extension's directives to your LocalSettings.php; the available options and the require_once line are listed under Options. Make sure to replace the paths to match your installation!

Options
For the most part, the extension's default options do not need any modification. However, the following configuration options are available. Once again: make sure to replace the paths to match your installation :-)
 * $wgPslEnableStopWords             = false;//enables stop words
 * $wgPslStopWords                   = array('a', 'an', 'at', 'the', 'and', 'or', 'is', 'am');//define your own stopwords here
 * $wgPslImagePath                   = "http://".$_SERVER['HTTP_HOST'].$wgScriptPath."/extensions/PslZendSearchLucene/";//for some icons
 * $wgPslWikiUrl                     = "";//url of your wiki installation
 * $wgPslEntriesPerPage              = 30;
 * $wgPslUtf8DecodeResults           = true;//utf8-hint for related display issues
 * $wgPslIndexDir                    = "/PSL_ADD_ONS/psl_search_indexes/wikidb_internal";//index directory
 * $wgPslZendLibraryDir              = "/PSL_ADD_ONS/ZF/library";//path to zend framework library
 * $wgPslEnablePopularSearches       = true;//requires table-create rights for MediaWikis db-account
 * $wgPslPopularSearchesHistory      = 30;//data remains 30 days
 * $wgPslProtectPopularSearches      = true;//
 * $wgPslEnabaleDebugMode            = true;//debug mode
 * $wgPslHighlightColor              = "#ff6900";
 * require_once( "$IP/extensions/PslZendSearchLucene/PslZendSearchLucene.php");
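
Put together, a LocalSettings.php fragment could look like the following sketch. It uses only option names from the list above; the wiki URL is a hypothetical placeholder, the paths are the defaults shown above, and placing the options before the require_once line is an assumption based on the order of that list.

```php
// LocalSettings.php (fragment). Replace all paths with those of your installation.
$wgPslIndexDir        = "/PSL_ADD_ONS/psl_search_indexes/wikidb_internal"; // index directory
$wgPslZendLibraryDir  = "/PSL_ADD_ONS/ZF/library";   // path to the Zend Framework library
$wgPslWikiUrl         = "http://wiki.example.org";   // hypothetical URL of your wiki
$wgPslEntriesPerPage  = 30;

// Optional: enable stop words and define your own list.
$wgPslEnableStopWords = true;
$wgPslStopWords       = array('a', 'an', 'the', 'and', 'or');

// Load the extension (assumed to come after the options, as in the list above).
require_once( "$IP/extensions/PslZendSearchLucene/PslZendSearchLucene.php" );
```

Options that are not set here keep the extension's default values.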

Mode Of Operation
By default, this extension does not replace the built-in search engine. Instead, it provides a new special page called Extend Search: Zend Search Lucene for MediaWiki, or Special:PslZendSearchLucene. This allows users to evaluate search results by comparing the built-in search with the PslZendSearchLucene search.

If the performance is good enough to replace the built-in search engine, Zend Search Lucene for MediaWiki can easily be configured as the default search engine. To do so, modify LocalSettings.php and add the corresponding setting before the require_once line that includes the extension. The standard search will then use PslZendSearchLucene by default. Note: when used in this way, the extension preserves the functionality of the Go and Search buttons.

Troubleshooting
If PHP reports that it is not allowed to open files of the Zend Framework, you may have to modify your PHP open_basedir directive so that it includes your Zend Framework library directory. A common way to do this is to edit the Apache vhost.conf of the site and then reload the web server configuration from the shell.
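
To check whether open_basedir is what blocks access to the library, a small diagnostic script can help. The following is a simplified sketch: the function name is hypothetical, the library path is the default from the Options section, and the check treats every open_basedir entry as a directory and does not resolve symbolic links, so PHP's real check can differ in edge cases.

```php
<?php
/**
 * Returns true if $path lies inside one of the directories of an
 * open_basedir-style list, or if the list is empty (no restriction).
 */
function pslExampleIsPathAllowed($path, $openBasedir, $separator = PATH_SEPARATOR)
{
    if ($openBasedir === '' || $openBasedir === false) {
        return true; // open_basedir is not set
    }
    // Normalize to forward slashes with exactly one trailing slash.
    $path = rtrim(str_replace('\\', '/', $path), '/') . '/';
    foreach (explode($separator, $openBasedir) as $allowed) {
        if ($allowed === '') {
            continue;
        }
        $allowed = rtrim(str_replace('\\', '/', $allowed), '/') . '/';
        if (strpos($path, $allowed) === 0) {
            return true; // $path is inside an allowed directory
        }
    }
    return false;
}

// Default library path from the Options section; adjust to your setup.
$zendLibraryDir = '/PSL_ADD_ONS/ZF/library';

$allowed = pslExampleIsPathAllowed($zendLibraryDir, ini_get('open_basedir'));
echo $allowed ? "allowed\n" : "blocked by open_basedir\n";
```

Run the script through the web server rather than on the command line, because the two can use different open_basedir settings.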

ToDo

 * Did you mean
 * DocX / PDF-Search
 * Search within other exotic formats

Revisions

 * v0.9.3 - 09.03.2011: stop words implemented, help system (German)
 * v0.9.2 - 07.03.2011: ranking options, refactoring (performance), UTF-8 bug fix
 * v0.9.1 - 04.03.2011: redirect bug fix (default search engine)
 * v0.9.0 - 22.02.2011: some enhanced features, default search switch
 * v0.8 - 21.02.2011: some enhanced features, debug mode
 * v0.7 - 20.02.2011: some enhanced features, search field select
 * v0.6 - 18.02.2011: redirect bug fix
 * v0.5 - 14.01.2011: bug fix for mismatch of uppercase strings
 * v0.4 - 19.01.2011: some enhanced features, phrase search
 * v0.3 - 23.01.2011: index engine for multiple wikis
 * v0.2 - 16.01.2011: some enhanced features, highlighting etc.
 * v0.1 - 20.01.2011: initial release (RFC)

Announcements
02.04.2011 There will be a major update within the next 4 weeks, containing the following changes and features:
 * separate config file for LuceneIndexer
 * manual reindexing via admin extension
 * optional, automatic/on-the-fly reindexing
 * individual user preferences
 * suggest with 4 modes
 * extended help page (currently German only)
 * OS detection to preserve Windows/Unix configurations
 * additional searchable file formats (a huge list, see below)
 * surprising features :-)
 * minor bug fixes



For further support see also
webserver-management.de