Extension:OAIRepository

OAIRepository

Release status: experimental

Implementation: Data extraction
Description: Provides an OAI-PMH repository interface.
Author(s): Brion Vibber
License: No license specified

From the README:

This is an extension to MediaWiki to provide an OAI-PMH repository interface by which page updates can be snarfed in a relatively sane fashion to a mirror site.

OAI-PMH protocol specs: http://www.openarchives.org/OAI/openarchivesprotocol.html

A harvester script forms the client half. Apply oaiharvest_table.sql to clients to allow saving a checkpointing record; this ensures consistent update ordering.
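
As a rough illustration of what the client half sends, here is a minimal sketch of building OAI-PMH ListRecords request URLs. It is not the extension's harvester script. The base URL is a made-up placeholder, `list_records_url` is a hypothetical helper, and `oai_dc` is used only because the OAI-PMH spec requires every repository to support it. The `from` argument would typically come from the saved checkpoint.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_ts=None, resumption_token=None):
    """Build a ListRecords request URL (hypothetical helper, not part of the extension)."""
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it may only be combined with the verb.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_ts is not None:
            # UTC datestamp of the last checkpoint, e.g. 2008-05-19T00:00:00Z
            params["from"] = from_ts
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder base URL; substitute the repository's real entry point.
    print(list_records_url("http://example.org/oai",
                           from_ts="2008-05-19T00:00:00Z"))
```

A harvester would request the first URL, then keep requesting with the returned resumption token until the repository stops sending one, and only then advance its checkpoint.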


At the moment this script is quite experimental; it may not implement the whole spec yet, and hooks for actually updating may not be complete.

The extension adds an 'updates' table which associates last-edit timestamps with cur_id values. A separate table is used so it can also hold entries for cur rows which have been deleted, allowing this to be explicitly mentioned to a harvester even if it comes back after quite a while.
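
The idea of a separate table that outlives deleted pages can be sketched as follows. This is an illustration only: it uses an in-memory SQLite database, and the column names (`page_id`, `last_change`, `deleted`) are assumptions rather than the schema in update_table.sql.

```python
import sqlite3


def make_updates_table():
    """Create an in-memory stand-in for the 'updates' table (assumed columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE updates ("
        " page_id INTEGER PRIMARY KEY,"
        " last_change TEXT NOT NULL,"
        " deleted INTEGER NOT NULL DEFAULT 0)"
    )
    return conn


def record_change(conn, page_id, timestamp, deleted=False):
    """Store the latest change for a page; a deletion keeps the row as a tombstone."""
    conn.execute(
        "INSERT OR REPLACE INTO updates (page_id, last_change, deleted)"
        " VALUES (?, ?, ?)",
        (page_id, timestamp, int(deleted)),
    )


def changes_since(conn, timestamp):
    """Return (page_id, last_change, deleted) rows at or after the timestamp, oldest first."""
    return conn.execute(
        "SELECT page_id, last_change, deleted FROM updates"
        " WHERE last_change >= ? ORDER BY last_change, page_id",
        (timestamp,),
    ).fetchall()
```

Because the row survives the deletion of the page itself, a harvester that checks in much later still sees the page listed with its deleted flag set.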

Clients will get only the latest current update; this does not include complete old page entries by design, as basic mirrors generally don't need to maintain that extra stuff.
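
On the client side, a ListRecords response carries one header per record, and the OAI-PMH spec marks deletions with status="deleted" on that header. The following minimal sketch reads those headers and the resumption token using the standard OAI-PMH 2.0 namespace. `parse_list_records` is a hypothetical helper, not code from the extension's harvester.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response body."""
    root = ET.fromstring(xml_text)
    records = []
    for header in root.findall(".//oai:record/oai:header", OAI_NS):
        records.append({
            "identifier": header.findtext("oai:identifier", default="",
                                          namespaces=OAI_NS),
            "datestamp": header.findtext("oai:datestamp", default="",
                                         namespaces=OAI_NS),
            # Deleted records have a header with status="deleted" and no metadata.
            "deleted": header.get("status") == "deleted",
        })
    token = root.findtext(".//oai:resumptionToken", default="",
                          namespaces=OAI_NS)
    # A missing or empty token means the list is complete.
    return records, (token.strip() or None)
```

A mirror would apply the latest version for each non-deleted record and remove the local copy for each deleted one.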


As of May 19, the updater will attempt to update the links tables on edits, and can fetch uploaded image files automatically.

(Uploads must be enabled locally with $wgEnableUploads = true; otherwise no files will be fetched. Records in the image table will be updated either way.)

Install

Settings

(from the talk page)

Add to LocalSettings.php:

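# OAI repository for update server
# Note: the leading @ suppresses the error if the file is missing.
@include( $IP.'/extensions/OAI/OAIRepo.php' );
$oaiAgentRegex = '/experimental/'; # presumably restricts access to clients whose User-Agent matches
$oaiAuth = true; # require a login from the oaiuser table; reported broken here (Squid? PHP config?)
$oaiAudit = true; # presumably logs requests to the oaiaudit table
$oaiAuditDatabase = 'oai'; # database holding the OAI tables
$wgDebugLogGroups['oai'] = '/home/wikipedia/logs/oai.log'; # debug log file for the 'oai' log group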


MySQL part

I did this from the command line (Linux, but it should work in the Windows cmd too), so bear with me and/or adapt the steps to a graphical tool. It is assumed here that you know the MySQL root password.

  • update_table.sql goes into the wiki DB (replace wikidb with your wiki database name if necessary)
mysql wikidb -uroot -p < update_table.sql
  • oaiuser_table.sql, oaiharvest_table.sql and oaiaudit_table.sql go into an OAI DB, to which the wiki DB user must have access

If you want everything in the same database, change the following in LocalSettings.php. You can then skip the log-in and database creation steps, and feed the three table files to your wiki database instead of oai.

$oaiAuditDatabase = 'wikidb'; //your wiki database name
  • Log in to MySQL
mysql -uroot -p
  • Once inside, create the oai database and give your "wiki" user (the login used in your LocalSettings.php for MySQL connections) all rights on it
CREATE DATABASE oai;
GRANT ALL PRIVILEGES ON oai.* TO 'wikiuser'@'localhost';
FLUSH PRIVILEGES;
exit
  • Create the tables by feeding the command to mysql
mysql oai -uroot -p < oaiaudit_table.sql
mysql oai -uroot -p < oaiharvest_table.sql
mysql oai -uroot -p < oaiuser_table.sql
  • To be able to log in to the OAI repository, you'll have to add a login to the oaiuser table
echo  "INSERT INTO oaiuser(ou_name, ou_password_hash) VALUES ('thename', md5('thepassword') );" | mysql oai -uroot -p
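
To check what ends up in ou_password_hash, the same value can be computed outside MySQL. MySQL's md5() returns the digest as a 32-character lowercase hex string, which is what the sketch below produces. `oai_password_hash` is a hypothetical helper, and UTF-8 encoding is an assumption (it makes no difference for plain ASCII passwords).

```python
import hashlib


def oai_password_hash(password):
    """Return the hex MD5 digest of the password, as MySQL's md5() would store it."""
    # Assumes the password reaches MySQL as UTF-8 bytes.
    return hashlib.md5(password.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(oai_password_hash("thepassword"))
```

Comparing this output with the stored column value is a quick way to rule out a mistyped password when a login is rejected.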

The rest

I don't know the rest; it still doesn't work for me. You will have to search for it yourself.

Notes

  • The current version of OAI won't work with MediaWiki 1.12 or lower, because it relies on wfGetLB() (LBFactory abstract class), which was added in rev:32578. To use it with 1.12, download this version of the files.

This extension is being used on one or more of Wikimedia's wikis. It means that the extension is stable and works well enough to be used by such high traffic websites. A full list of the extensions installed on a particular wiki is produced by Special:Version on that wiki.
