Extension:OracleTextSearch

What can this extension do?
This extension extends the standard SearchOracle class by adding Oracle Text indexing of files that are stored outside the database. Internally, Oracle limits the search index on external data to 2 GB per key.

The following kinds of files can be indexed (see Supported Types for links to the detailed lists):
 * word processing and desktop publishing formats
 * spreadsheet and presentation formats
 * database formats (e.g. Access, dBase)
 * archive formats (archived documents are indexed and combined into a single index key)
 * graphic formats (image metadata, EXIF, ...)
 * other formats (executable or library metadata, text in Macromedia Flash, ID3 tags of MP3 files, vCards, ...)

Installation

 * 1) Download the files from SVN and place them in $IP/extensions/OracleTextSearch/
 * 2) Add the extension's include line to your wiki's LocalSettings.php
 * 3) Enable file uploads
 * 4) Set SearchOracleText as your default search engine
 * 5) Add MIME types you want to index to $wgExIndexMIMETypes (see Supported Types)
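
Steps 2 to 4 above might look roughly like the following LocalSettings.php fragment. This is only a sketch: the entry-point file name in the require_once line is an assumption (use whatever file the extension actually ships), while $wgEnableUploads and $wgSearchType are standard MediaWiki settings.

```php
// LocalSettings.php (fragment): a sketch of installation steps 2-4.

// Step 2: load the extension. The file name below is an assumption;
// check the extension directory for the real entry-point file.
require_once "$IP/extensions/OracleTextSearch/OracleTextSearch.php";

// Step 3: enable file uploads (standard MediaWiki setting).
$wgEnableUploads = true;

// Step 4: use the extension's search class as the default search engine.
$wgSearchType = 'SearchOracleText';
```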

Installation DB
Because this is a very specific extension, for now you have to manually execute the patch script provided with it. The script adds a field to the searchindex table and creates a context index over it using URL_DATASTORE and INSO_FILTER (for 9iR2 compatibility).

Creating such an index requires the FILE_ACCESS_ROLE CTX role in newer versions of Oracle.

If you hit an ORA-03113 error when creating the index, check the Troubleshooting section.

Configuration
The extension has two global parameters:

Rewrites local https URLs to http. This can be set to false if the database and web server are connected through a public network, or if you want to be extra cautious. Setting it to false also requires appropriate ACL and Oracle Wallet settings (depending on the version of Oracle DB).

$wgExIndexMIMETypes: the list of MIME types the search engine will consider for indexing.
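
As an illustration, the MIME type list could be set in LocalSettings.php along these lines. The sketch assumes $wgExIndexMIMETypes is a plain array of MIME type strings, and the two entries are examples only; which types can actually be indexed depends on your Oracle version (see Supported Types).

```php
// LocalSettings.php (fragment). Assumption: $wgExIndexMIMETypes is an
// array of MIME type strings. The entries below are illustrative only.
$wgExIndexMIMETypes = array(
    'application/pdf',
    'application/msword',
);
```

It is probably safest to place this after the line that loads the extension, so that any defaults the extension defines do not overwrite it.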

Supported Types
The list of supported document types varies with the version of Oracle DB. Below are links to the lists for the most commonly used supported versions.
 * 11g (11.1)
 * 10gR2 (10.2)
 * 9iR2 (9.2)

Troubleshooting
Oracle Text can sometimes be a bit difficult to set up, especially if you run the database on a platform that is not completely standard. Here are some common failures and how to solve them.


 * ORA-03113 (actually ORA-07445) when creating the index
 ** Usually caused by the ctxhx helper on the OS when the database runs on a 64-bit OS, because in certain versions the helper is compiled for 32-bit
 ** Can sometimes be resolved by providing 32-bit compatibility libraries and relinking ctx, but in most cases there is no simple solution and you will have to update the database to the latest patchset or even upgrade it
 ** Can be checked by running the ctxhx helper manually on the OS
 * Search fails to index or return results, with DRG-11222 in the logs
 ** Same cause as the error above, but in most cases it can be resolved by providing 32-bit compatibility libraries and relinking ctx

Changelog
1.1 - Initial release - 14.06.11