Extension:Memento

What can this extension do?
The Memento extension implements support for the X-Accept-Datetime HTTP header to perform content negotiation in the date-time dimension, built on the principles of transparent content negotiation in RFC 2295. This enables MediaWiki to be used as a web archive.

The extension works as follows:
 * It checks for the existence of an X-Accept-Datetime header in the client's request.
 * If the X-Accept-Datetime header exists, it redirects the client to a TimeGate.
 * The TimeGate redirects the client to the version of the requested resource that was the live version at the date-time expressed as the value of the X-Accept-Datetime header.
 * If the X-Accept-Datetime header does not exist, the client's request is handled as usual. Nothing out of the ordinary will happen.
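
The decision described in the steps above can be sketched in a few lines. This is only an illustration in Python, not the extension's PHP code; the function name and the returned labels are hypothetical.

```python
def memento_client_action(request_headers):
    """Decide how a request should be handled, per the steps above.

    request_headers is a plain dict of header name -> value.
    The returned labels are hypothetical names used only in this sketch.
    """
    # HTTP header names are case-insensitive, so normalise before the lookup.
    normalised = {name.lower(): value for name, value in request_headers.items()}
    if "x-accept-datetime" in normalised:
        return "redirect-to-timegate"
    return "serve-normally"
```

A request that carries the header is sent on to the TimeGate; any other request takes the ordinary MediaWiki path.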

This plug-in uses the same handlers that MediaWiki does to connect to the database, so all existing database permissions and page access permissions are honored. The plug-in only uses a 'DB_SLAVE' database connection, which can only read from the tables. Hence, this plug-in makes no changes to the database.

Download instructions
Please cut and paste the code in. Copy paste the code with the same file name as above in
 * The code for the Memento Client can be found at memento.php.
 * There are four files for the TimeGate:
 * timegate.php
 * timegate_body.php
 * timegate.alias.php
 * timegate.i18n.php

Note: $IP stands for the root directory of your MediaWiki installation, the same directory that holds LocalSettings.php.

Installation
To install this extension, add the following to LocalSettings.php:

TimeGate and Its Workings
Earlier implementations of this plugin combined the functionality of a TimeGate and the Memento Client into a single extension. This implementation, however, provides a separate TimeGate. A TimeGate is a resource that enables transparent datetime content negotiation.

The current implementation has two separate plugins:
 * 1) Memento Client: The client plugin makes MediaWiki "Memento aware". That is, if it detects the X-Accept-Datetime header, it redirects the client to the TimeGate.
 * 2) TimeGate: The TimeGate parses the datetime in the X-Accept-Datetime header, performs the content negotiation, and redirects the client to the version that was live at the datetime expressed in the header.
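
To make the parsing step concrete, here is a minimal Python sketch of turning an X-Accept-Datetime value into the 14-digit UTC timestamp format (YYYYMMDDHHMMSS) that MediaWiki uses for revision timestamps. It is not the TimeGate's PHP code; the function name is hypothetical, and treating a value without zone information as UTC is an assumption of the sketch.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def accept_datetime_to_mw_timestamp(header_value):
    """Convert an HTTP-date string to a 14-digit UTC timestamp, or None."""
    try:
        moment = parsedate_to_datetime(header_value)
    except (TypeError, ValueError):
        # Unparseable input: older Python versions raise TypeError,
        # newer ones raise ValueError.
        return None
    if moment.tzinfo is None:
        # No usable zone information: assume UTC (assumption of this sketch).
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
```

Timestamps in this format sort lexicographically in chronological order, which makes comparisons against revision timestamps straightforward.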

The TimeGate can be accessed directly at http://your.wikiserver.here/index.php/Special:TimeGate. The Memento client redirects a request for the original resource to the TimeGate using the following URI format: http://your.wikiserver.here/index.php/Special:TimeGate/http://your.wikiserver.here/index.php/Title
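
The URI format above is the TimeGate URI followed by a slash and the original URI. A minimal Python sketch of building such a URI and recovering the original URI from it follows; the function names are hypothetical and the placeholder host is the one used in this page.

```python
TIMEGATE_BASE = "http://your.wikiserver.here/index.php/Special:TimeGate"


def timegate_uri_for(original_uri, timegate_base=TIMEGATE_BASE):
    """Append the original URI to the TimeGate URI, as in the format above."""
    return timegate_base.rstrip("/") + "/" + original_uri


def original_uri_from(timegate_uri, timegate_base=TIMEGATE_BASE):
    """Recover the original URI from a TimeGate URI built as above."""
    prefix = timegate_base.rstrip("/") + "/"
    if not timegate_uri.startswith(prefix):
        raise ValueError("not a TimeGate URI for this wiki")
    return timegate_uri[len(prefix):]
```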

Usage
Once installed, the extension can be tested and used in two ways:

 * 1) Using a Firefox browser: Install the Modify Headers Firefox extension. Then set the <tt>X-Accept-Datetime</tt> header from the Tools/Modify Headers menu option. The syntax to use is <tt>X-Accept-Datetime: Sat, 03 Oct 2009 10:00:00 GMT</tt>. Set it to a date-time at which your wiki was already generating history pages. Then enter the URL of a page from your wiki that has associated history pages around the date-time you chose. If all is well, you should immediately retrieve the history page that was the active version at the date-time you picked. To see the complete HTTP header interactions, use the Live HTTP Header Firefox Extension. The headers returned will be similar to the ones explained below.
 * 2) Using the UNIX command line tool <tt>curl</tt>: To achieve the equivalent of (1) using curl, the command would be:

Then look in <tt>headers.txt</tt> to make sure the <tt>Location</tt> header looks similar to:

The URI in the <tt>Location</tt> header is the URI of your TimeGate, with the original URI passed to it as a parameter. If you repeat the <tt>curl</tt> command with the URI from the <tt>Location</tt> header, the <tt>headers.txt</tt> file will look similar to:

The TimeGate performs datetime content negotiation and returns the URI of the memento in the <tt>Location</tt> header. Notice that the URIs of the oldest and the most recent versions of this resource are also returned, along with the original URI, in the <tt>Alternates</tt> header. Following the <tt>Location</tt> URI once more with the <tt>curl</tt> command returns the following result:

The <tt>X-Content-Datetime</tt> header returns the datetime of the content that was returned.

In essence, the Memento client redirects the datetime request to the TimeGate, and the TimeGate performs the content negotiation and redirects us to a memento. The <tt>Alternates</tt> and <tt>X-Content-Datetime</tt> headers are returned to inform us that the returned resource is indeed a memento.
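
The redirect chain described in this section (original URI, then TimeGate, then memento) can be walked programmatically. The Python sketch below is a hedged illustration: <tt>fetch</tt> is a hypothetical callable supplied by the caller that performs one request and returns the response headers as a dict, and the hop limit is an assumption of the sketch.

```python
def follow_to_memento(fetch, uri, accept_datetime, max_hops=5):
    """Follow Location headers until a response without one is reached.

    fetch(uri, request_headers) is a hypothetical callable that performs a
    single HTTP request and returns the response headers as a dict.
    Returns the list of URIs visited and the headers of the final response.
    """
    request_headers = {"X-Accept-Datetime": accept_datetime}
    hops = [uri]
    for _ in range(max_hops):
        response_headers = fetch(uri, request_headers)
        location = response_headers.get("Location")
        if location is None:
            # No further redirect: this response should be the memento,
            # identifiable by its X-Content-Datetime header.
            return hops, response_headers
        uri = location
        hops.append(uri)
    raise RuntimeError("too many redirects")
```

Passing the request function in as a parameter keeps the sketch independent of any particular HTTP library.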

Namespaces
The extension renders the requested page the same way MediaWiki does. It queries the wiki database table <tt>page</tt> with the requested title. Both MediaWiki reserved namespaces and custom namespaces are accounted for by retrieving the <tt>namespace_id</tt> from the object <tt>$wgTitle</tt>. If the namespace does not exist, the plug-in treats the namespace prefix as part of the title. For example, if the requested title is 'Memento:Main_Page', the plug-in first checks whether a namespace exists for "Memento". If it does, the plug-in retrieves its corresponding <tt>namespace_id</tt> and queries the <tt>page</tt> table for the title 'Main_Page' with that <tt>namespace_id</tt>. Otherwise, it treats "Memento" as part of the title and searches the <tt>page</tt> table for the title 'Memento:Main_Page'. If the title cannot be found in the database, an <tt>HTTP/1.1 404 Not Found</tt> is returned.
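
The namespace fallback described above can be illustrated with a short Python sketch. The function name and the namespace mapping are hypothetical, and looking up the full title in the main namespace (ID 0) when the prefix is unknown is an assumption of the sketch rather than something this page states.

```python
NS_MAIN = 0  # MediaWiki's ID for the main (article) namespace


def resolve_title(requested_title, namespace_ids):
    """Split a requested title into (namespace_id, title).

    namespace_ids maps namespace names to IDs, e.g. {"Talk": 1}.
    If the prefix before the first colon is not a known namespace,
    the whole string is kept as the title.
    """
    prefix, separator, rest = requested_title.partition(":")
    if separator and prefix in namespace_ids:
        return namespace_ids[prefix], rest
    # Unknown prefix or no prefix: assume the main namespace.
    return NS_MAIN, requested_title
```

With a hypothetical custom namespace <tt>{"Memento": 100}</tt>, the title 'Memento:Main_Page' resolves to <tt>(100, "Main_Page")</tt>; without it, the full string 'Memento:Main_Page' is used as the title.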

Templates
By default, MediaWiki retrieves the most recent version of a template when it is transcluded in an article. This extension cannot perform datetime content negotiation on transcluded templates. However, we have written a quick fix that performs this operation by adding the following code to the file <tt>Parser.php</tt> in the directory <tt>path/to/wiki/includes/parser/</tt>. Paste the code into the function <tt>statelessFetchTemplate(...)</tt>, immediately after the variable  is declared. The code fetches the revision ID of the template for the requested datetime and directs MediaWiki to fetch that <tt>rev_id</tt> instead of fetching the latest version of the template by title. The code was written for MediaWiki version 1.8+.

Caching
By default, MediaWiki searches its cache for templates by title and retrieves the most recent version. For best results, it is recommended that caching be disabled for templates so that MediaWiki always queries the database for the revision. This can be done either by commenting out the respective lines in the function <tt>getTemplateDom</tt> in Parser.php or by writing a small piece of code that skips the caching part if the <tt>X-Accept-Datetime</tt> header is detected.
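
The second option, skipping the cache only when a datetime was requested, follows a simple pattern. The Python sketch below shows the idea only; it is not MediaWiki's parser code, and the function name, the dict-based cache, and the two loader callables are all hypothetical.

```python
def template_text(title, accept_datetime, cache, load_latest, load_as_of):
    """Return template text, bypassing the cache for datetime requests.

    accept_datetime is None when no X-Accept-Datetime header was sent.
    load_latest(title) and load_as_of(title, accept_datetime) are
    hypothetical callables that read from the database.
    """
    if accept_datetime is not None:
        # Datetime negotiation requested: never read from or write to the
        # cache, because it only holds the most recent version.
        return load_as_of(title, accept_datetime)
    if title not in cache:
        cache[title] = load_latest(title)
    return cache[title]
```

Ordinary requests keep the benefit of the cache, while negotiated requests always go to the database.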

Special Pages
Special pages under the URL http://your.wikiserver.here/index.php/Special:SpecialPages do not have a history, i.e. there are no revisions to these pages. Hence, the Memento extension will return an <tt>HTTP/1.1 406 Not Acceptable</tt>.

Deleted Contributions
On most installations, date-time negotiation for deleted revisions in MediaWiki requires "Administrator" privileges. Even with administrative access, MediaWiki can only show these revisions in "Edit" mode.

To enable this feature, set the configuration variable <tt>$wgMementoConfigDeleted</tt> to <tt>true</tt>.

Timestamps
This extension searches for and retrieves the revisions of an article using the timestamp at which each revision was created. Timestamps are not unique identifiers, and an article may have more than one revision with the same timestamp. The extension handles this situation by returning an <tt>HTTP/1.1 300 Multiple Choices</tt> with the list of URIs of the revisions that were created at the same time. MediaWiki does not identify deleted revisions by revision ID in its URIs; it uses timestamps instead. Hence, we could not come up with a way to resolve the situation in which more than one deleted revision has the same timestamp.
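
The timestamp-based selection and the multiple-choices case can be sketched as follows. This is a Python illustration, not the extension's database query; the function name, the "redirect" and "no-revision" labels, and the behaviour when the requested datetime predates every revision are assumptions of the sketch.

```python
def negotiate_revision(revisions, requested_ts):
    """Pick the revision(s) that were live at requested_ts.

    revisions is a list of (rev_id, timestamp) pairs, with timestamps as
    14-digit strings (YYYYMMDDHHMMSS), which compare correctly as text.
    Returns a (label, rev_ids) pair.
    """
    eligible = [ts for _rev_id, ts in revisions if ts <= requested_ts]
    if not eligible:
        # Requested datetime is older than every revision (assumed outcome).
        return "no-revision", []
    live_ts = max(eligible)
    matches = sorted(rev_id for rev_id, ts in revisions if ts == live_ts)
    if len(matches) > 1:
        # Several revisions share the timestamp: the client must choose.
        return "300 Multiple Choices", matches
    return "redirect", matches
```

Because timestamps have one-second resolution, two edits saved within the same second are indistinguishable here, which is why the list of candidates is returned instead of a single revision.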