Extension:OAIRepository

From the README
This is an extension to MediaWiki to provide an OAI-PMH repository interface by which page updates can be snarfed in a relatively sane fashion to a mirror site.

OAI-PMH protocol specs: http://www.openarchives.org/OAI/openarchivesprotocol.html

A harvester script forms the client half. Apply oaiharvest_table.sql to clients to allow saving a checkpointing record; this ensures consistent update ordering.
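As a rough illustration of what the client half asks for, the sketch below builds an OAI-PMH ListRecords request URL, optionally resuming from a saved checkpoint timestamp. It is not the extension's harvester script. The function name and the base URL in the usage line are hypothetical, and the base URL is assumed to carry no query string of its own. `verb`, `metadataPrefix`, `from` and the `oai_dc` format come from the OAI-PMH spec linked above.

```shell
# Build an OAI-PMH ListRecords request URL (illustrative sketch only).
# $1 = repository base URL (hypothetical; assumed to have no "?" in it)
# $2 = metadataPrefix, e.g. oai_dc
# $3 = optional checkpoint: UTC timestamp to harvest from
oai_list_records_url() {
    url="${1}?verb=ListRecords&metadataPrefix=${2}"
    if [ -n "${3:-}" ]; then
        # Percent-encode the colons in the timestamp for use in a query string.
        from=$(printf '%s' "$3" | sed 's/:/%3A/g')
        url="${url}&from=${from}"
    fi
    printf '%s\n' "$url"
}
```

For example, `oai_list_records_url http://example.org/oai oai_dc 2008-05-19T00:00:00Z` prints a URL that can be handed to a tool such as curl. Passing the last saved checkpoint as the third argument is what lets a client request only the changes made since its previous run.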

At the moment this script is quite experimental; it may not implement the whole spec yet, and hooks for actually updating may not be complete.

The extension adds an 'updates' table which associates last-edit timestamps with cur_id values. A separate table is used so it can also hold entries for cur rows which have been deleted, allowing this to be explicitly mentioned to a harvester even if it comes back after quite a while.

Clients will get only the latest update for each page; by design this does not include complete old page entries, as basic mirrors generally don't need to maintain that extra stuff.

As of May 19, 2008, the updater will attempt to update the links tables on edits, and can fetch uploaded image files automatically.

(Uploads must be enabled locally or no files will be fetched. Records in the image table will be updated either way.)

Settings
''From the talk page... This comes from the CommonSettings.php (similar to LocalSettings.php in most MediaWiki installations) on actual Wikimedia servers.''

Add to LocalSettings.php:

MySQL part
I did this from the command line, so bear with me and/or adapt to the graphical version. It's assumed here that you know the MySQL root password.

mysql wikidb -uroot -p < update_table.sql
 * Replace /*$wgDBprefix*/ in update_table.sql with the actual value of the table prefix (which is set in LocalSettings.php).
 * update_table.sql is applied to the wiki DB (replace wikidb with your wiki database name if necessary). NOTE: This will take a significant amount of time on large wikis.
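The prefix replacement described above can be done without hand-editing the file. This is a minimal sketch, assuming the placeholder appears literally as `/*$wgDBprefix*/` (the form used in the INSERT statement further down) and that the prefix contains only letters, digits and underscores. The function name `apply_db_prefix` is made up for this example.

```shell
# Print an SQL file with every /*$wgDBprefix*/ placeholder replaced.
# $1 = table prefix (assumed: letters, digits and underscores only)
# $2 = path to the .sql file
apply_db_prefix() {
    # \* and \$ make sed match the asterisks and the dollar sign literally.
    sed 's|/\*\$wgDBprefix\*/|'"$1"'|g' "$2"
}
```

With a hypothetical prefix of `wiki_`, the result can be piped straight into the client: `apply_db_prefix wiki_ update_table.sql | mysql wikidb -uroot -p`. The original file is left untouched, so it can be reused if the prefix changes.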

mysql oai -uroot -p < oaiaudit_table.sql
mysql oai -uroot -p < oaiharvest_table.sql
mysql oai -uroot -p < oaiuser_table.sql
 * oaiuser_table.sql, oaiharvest_table.sql and oaiaudit_table.sql go into an OAI DB, to which the wiki DB user must have access.
 * If you want everything in the same database, follow the EITHER step; otherwise follow the OR steps.
 * EITHER change the following in LocalSettings.php:
 * OR create a separate database for the OAI info:
 * Log in to MySQL: mysql -uroot -p
 * Once inside, create the oai database and give your "wiki" user (the login used in your LocalSettings.php for MySQL connections) all rights on it: CREATE DATABASE oai; GRANT ALL PRIVILEGES ON oai.* TO 'wikiuser'@'localhost'; FLUSH PRIVILEGES; exit
 * Go into the remaining .sql files and replace each instance of /*$wgDBprefix*/ with the table prefix set in your LocalSettings.php.
 * Create the tables by feeding the commands to mysql (where "oai" is the database you are putting the data into and "root" is your MySQL user):

echo "INSERT INTO /*$wgDBprefix*/oaiuser(ou_name, ou_password_hash) VALUES ('thename', md5('thepassword') );" | mysql oai -uroot -p
 * To be able to log in to the OAIRepository, you'll have to add a login to the oaiuser table. These should be the same values in   and   (again, remember to replace /*$wgDBprefix*/ with the table prefix for your wiki).
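To double-check what ended up in ou_password_hash, the same digest can be computed outside MySQL. This is a small sketch, assuming GNU coreutils' `md5sum` is installed and the password is plain ASCII. The helper name `md5_hex` is made up for this example.

```shell
# Print the lowercase hex MD5 digest of a string, the same 32-character
# form that MySQL's md5() returns. Requires md5sum (GNU coreutils).
md5_hex() {
    # printf '%s' avoids the trailing newline that echo would add to the input.
    printf '%s' "$1" | md5sum | cut -d' ' -f1
}
```

Running `md5_hex thepassword` should print the same string that the ou_password_hash column holds for the row inserted above.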

Install on Lucene-search server
Follow the instructions on the Lucene-search page for incremental updates.

NOTE
The current version of OAI won't work with MW 1.12 or lower, since wfGetLB (LBFactory abstract class) was added in 32578. To use it with 1.12, download this version of the files. Make sure you use the ExtensionDistributor by going to the "download snapshot" link in the infobox to help you get the right version.