Extension:Memento

Memento

Release status: stable

Implementation: Data extraction, User interface
Description: Performs content negotiation in the DateTime dimension, allowing one to see past versions of wiki pages using the Memento protocol.
Author(s): Shawn Jones, Harihar Shankar
Latest version: 2.1.0 (2014-09-20)
MediaWiki: 1.21.1+
PHP: 5.3.8+
Database changes: No
License: GPL
Example: Demo Wiki
Parameters: See the Configuration section
Hooks used: BeforePageDisplay, ArticleViewHeader, BeforeParserFetchTemplateAndtitle, ImageBeforeProduceHTML


The Memento extension allows one to browse an entire MediaWiki site as it existed at a date in the past. To do this, it adds support for datetime negotiation, as specified by the Memento protocol, to a MediaWiki system. It provides the server-side support that lets a Memento client navigate the wiki as it was at a past time chosen by the user. For an introduction to the functionality provided by Memento's time travel, see this brief overview, which includes temporal navigation across Wikipedia and other web sites.

The Memento "Time Travel for the Web" effort started in 2009 with the overall goal of making it as easy to navigate the past of the web as it is to navigate the current web. The basic idea underlying the Memento protocol is that an old version of a web page, such as a version of a Wikipedia article - http://en.wikipedia.org/w/index.php?title=Web_archiving&oldid=526371727 - can be retrieved by accessing its original URI - http://en.wikipedia.org/wiki/Web_archiving - and by applying datetime negotiation to it. Datetime negotiation is similar to content negotiation, which is used frequently by browsers, for example, to ask a server for a version of a page in a specific format e.g. HTML or PDF. Datetime negotiation asks the server for a version with a specific date, and uses the special purpose Accept-Datetime HTTP header to do so.

The Memento protocol is now natively supported by several web archives. In addition, all versions of DBpedia are natively accessible via the Memento protocol, and proxy support has been implemented for all language versions of Wikipedia.

Two MediaWiki extensions provide support for Memento. This page shows information for the full Memento Extension. If you do not need the power of the full Memento Extension, the Memento Headers Extension also offers support for Memento, but without some of the additional functionality that the full extension provides. The table below highlights the differences between the two extensions.

Extension:Memento Extension:MementoHeaders
Provides Memento headers for articles
Provides Memento headers for oldid pages
Can exclude namespaces from datetime negotiation
Requires external service for datetime negotiation
Performs datetime negotiation within MediaWiki (TimeGate)
Provides list of revisions in machine-readable format (TimeMap)
Ensures template revisions match the revision of the page in which they are embedded
Ensures image revisions match the revision of the page in which they are embedded (experimental)

Memento's time travel is not yet natively supported in browsers and hence requires installing a browser extension. A Memento extension for Chrome, fully compliant with the most recent protocol specification, was released at the end of September 2013. A screencam illustrates the time travel functionality it provides.

Use Cases

  • Users often wish to see versions of resources as they were prior to certain events, for example, the Wikipedia page about Michael Jackson before his death. Once they are on an old version of that page, they may also want to see what the other Wikipedia pages linked from it looked like at that time.
  • MediaWikis are used for scientific purposes, for example, as platforms to provide and maintain terminology definitions. Such definitions may change over time and may be interrelated. From the perspective of scholarly discourse, it may be important to be able to see exactly what the interrelated definitions looked like, for example, when they were used in a scholarly publication.
  • MediaWikis are used as fan platforms, for example, providing all details about Game of Thrones. Because the TV episodes don't air at the same time across the world, many current pages in such fan platforms contain spoilers. In this case, a user may want to set a time travel date for the fan site that aligns with the episodes that have aired in her region.
  • Pages in MediaWikis commonly contain links to the web at large. Sometimes it is helpful, and sometimes necessary (e.g. when links are dead), to see the state of such linked resources at some time in the past, for example, at the time when a linked resource was accessed by the editor. This functionality is supported out of the box by a Memento client and thus does not require a MediaWiki to install this Memento extension. However, with the extension installed, temporal navigation is possible both within and outside of the MediaWiki, providing for seamless web time travel.
  • MediaWiki editors can benefit from Memento time travel by being able to easily visit the state of interrelated pages at some time in the past, for example, to assess the differences before and after edit wars.
  • By supporting datetime negotiation to access page versions, a MediaWiki allows software agents to easily access the state of the entire system as it was at a certain point in the past, or to collect all versions of a page that were published during a past time range. This capability may be helpful in support of text mining and data extraction activities and applies both to page-oriented MediaWikis and to data wikis such as Wikidata. It potentially allows for republication of structured information at a frequency that remains in lock step with the evolution of a MediaWiki, rather than in batch mode as is the case with DBpedia. See also the paper [1] in this regard.

How Memento Works

This extension allows access to versions of MediaWiki pages by implementing support for the Accept-Datetime HTTP request header to perform datetime negotiation, a variation on content negotiation specified in RFC 2295 [2]. The datetime for negotiation is expressed as the value of the Accept-Datetime HTTP header. The Memento extension for Chrome or the command-line tool mcurl can be used to set this datetime value.
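As an illustration of how a client might produce the Accept-Datetime value, the following is a minimal Python sketch using only the standard library. The function names accept_datetime_value and build_request are hypothetical, the URL is a placeholder, and treating naive datetimes as UTC is an assumption of this sketch, not something the extension specifies. The request is only constructed, not sent.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request


def accept_datetime_value(when: datetime) -> str:
    """Format a datetime as an HTTP date in GMT, the form used by Accept-Datetime."""
    if when.tzinfo is None:
        # Assumption of this sketch: naive datetimes are already in UTC.
        when = when.replace(tzinfo=timezone.utc)
    return format_datetime(when.astimezone(timezone.utc), usegmt=True)


def build_request(page_url: str, when: datetime) -> Request:
    """Build (but do not send) a GET request carrying the Accept-Datetime header."""
    return Request(page_url, headers={"Accept-Datetime": accept_datetime_value(when)})


request = build_request(
    "http://your.wikiserver.here/index.php/Your_Title",  # placeholder URL
    datetime(2009, 10, 3, 10, 0, 0, tzinfo=timezone.utc),
)
# urllib stores header names capitalized as "Accept-datetime".
print(request.get_header("Accept-datetime"))
```

Passing the resulting request to urllib.request.urlopen would perform the actual negotiation against a wiki that has the extension installed.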

Datetime negotiation works in two simple steps:

  • When a client requests a page, this extension will provide the URI of a TimeGate for the page in the HTTP Link header. The TimeGate is capable of datetime negotiation to access versions of the page.
  • When a client navigates to the TimeGate and performs datetime negotiation with it, the TimeGate provides the client with the version of the page that was operational at the datetime used for negotiation. The creation datetime of that version is provided in the Memento-Datetime HTTP response header, along with links in the HTTP Link header including a link to the current version of the page and to a TimeMap for the page.

This extension also allows access to a TimeMap for a MediaWiki page, which is a document that enumerates all versions of the page as well as a TimeGate for the page. When a client requests a page, this extension will provide the URI of a TimeMap for the page in the HTTP Link header.

This MediaWiki extension uses the same handlers as the MediaWiki software to connect to the database. Hence all the existing database permissions and page access permissions are honored. It uses a 'DB_SLAVE' database connection, which means that the database connection can only read from the tables. Therefore, this plug-in makes no changes to the data in the wiki.

Installation

  • Download and extract the tarball in your extensions/ folder. It should generate a new folder called Memento directly inside your extensions/ folder.
  • Add the following code at the bottom of your LocalSettings.php:
require_once "$IP/extensions/Memento/Memento.php";
$wgArticlePath = "$wgScriptPath/index.php/$1";
$wgUsePathInfo = true;
  • Done! Navigate to "Special:Version" on your wiki to verify that the extension is successfully installed.

Configuration

This extension has sensible defaults, but also allows the following settings to be added to LocalSettings.php in order to alter its behavior:

  • $wgMementoTimemapNumberOfMementos - (default is 500) allows the user to alter the number of Mementos included in a TimeMap served up by this extension.
  • $wgMementoIncludeNamespaces - an array of MediaWiki namespace IDs (e.g. the integer values for Talk, Template, etc.) for which Memento support is enabled. The default is an array containing just 0 (Main). The list of MediaWiki namespace IDs is at Manual:Namespace.
  • $wgMementoTimeNegotiationForThumbnails - EXPERIMENTAL: MediaWiki, by default, does not preserve temporal coherence for its oldid pages. In other words, an oldid (URI-M) page will not contain the version of the image that existed when that page was created. See http://arxiv.org/pdf/1402.0928.pdf for more information on this problem in web archives. This setting has two values:
    • false - (default) do not attempt to match the old version of the image to the requested oldid page
    • true - attempt to match the old version of the image to the requested oldid page

Server Setup

In addition to the default MediaWiki installation, this plugin will also work in a setup with URL rewriting. This plugin will also work with wikis in a proxy setup.

TimeGates, TimeMaps and their Workings

This extension introduces two new resources to your MediaWiki installation:

  1. TimeGate: A resource associated with a page that supports datetime negotiation to access versions of the page. In the default installation of this extension, the TimeGate URL coincides with the page URL.
  2. TimeMap: A TimeMap for a page is a resource from which a list of URIs of versions of the page is available. The URI list is serialized in application/link-format. The TimeMap is paged: for articles with many revisions, the TimeMap will only return the number of Mementos specified by the configuration parameter $wgMementoTimemapNumberOfMementos. TimeMap URLs to retrieve additional mementos are provided in the TimeMap with the rel attribute "timemap". Please refer to Memento RFC Pattern 6 for more details. Like the TimeGate, the URL to the TimeMap is also available in the Link header and the URL format is: http://your.wikiserver.here/index.php/Special:TimeMap/Title
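To make the link-format serialization more concrete, here is a rough Python sketch that parses a TimeMap body (the same syntax is used in the HTTP Link header) and picks out the TimeGate and the Mementos. The sample text is hand-written for illustration and is not output captured from this extension. The function names are hypothetical, and the regular expressions cover only the common `<URI>; name="value"` shape rather than the full link-format grammar.

```python
import re
from email.utils import parsedate_to_datetime

# One link-value: <URI> followed by zero or more ;name=value parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[\w-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)')
PARAM_RE = re.compile(r';\s*([\w-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')

# Hand-written sample; URLs, revision ids and datetimes are made up.
SAMPLE_TIMEMAP = (
    '<http://your.wikiserver.here/index.php/Title>; rel="original latest-version timegate",\n'
    '<http://your.wikiserver.here/index.php/Special:TimeMap/Title>; rel="self"; type="application/link-format",\n'
    '<http://your.wikiserver.here/index.php?title=Title&oldid=12>; rel="first memento"; datetime="Sat, 03 Oct 2009 10:00:00 GMT",\n'
    '<http://your.wikiserver.here/index.php?title=Title&oldid=57>; rel="last memento"; datetime="Tue, 05 Jan 2010 08:30:00 GMT"'
)


def parse_links(text):
    """Return a list of (uri, params) pairs from a Link header or link-format body."""
    links = []
    for uri, raw_params in LINK_RE.findall(text):
        params = {}
        for name, quoted, bare in PARAM_RE.findall(raw_params):
            params[name.lower()] = quoted if quoted else bare
        links.append((uri, params))
    return links


def uris_with_rel(links, rel):
    """Return the URIs whose space-separated rel attribute contains `rel`."""
    return [uri for uri, params in links if rel in params.get("rel", "").split()]


def mementos(links):
    """Return (datetime, uri) pairs for links marked as mementos, oldest first."""
    found = []
    for uri, params in links:
        if "memento" in params.get("rel", "").split() and "datetime" in params:
            found.append((parsedate_to_datetime(params["datetime"]), uri))
    return sorted(found)


links = parse_links(SAMPLE_TIMEMAP)
print(uris_with_rel(links, "timegate"))
for when, uri in mementos(links):
    print(when.isoformat(), uri)
```

A client that needs every revision of a heavily edited page would, under the paging described above, also follow any links whose rel value is "timemap" and parse those documents in the same way.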

Usage

The best way to experience this extension is by installing Memento Time Travel for the Chrome browser. After installing Memento Time Travel, enter the URL of a page in your wiki and set the desired date-time. Memento Time Travel will use the TimeGate installed in the wiki to load the version of the article that was live at the requested date-time.

After setting the date-time in Memento Time Travel, a user can click both the internal and external links in the page and navigate the web in the past.

This extension can also be used and tested in two other ways:

  1. Using a Firefox browser: Install the Modify Headers Firefox extension, then set the Accept-Datetime header from the Tools/Modify Headers menu option. The syntax to use is Accept-Datetime: Sat, 03 Oct 2009 10:00:00 GMT. Set it to a date-time at which your wiki already had page history. Then enter the URL of a page from your wiki that has revisions around the date-time you chose. The request and response headers involved in this transaction can be inspected with the Live HTTP Headers Firefox extension. The URL of the TimeGate can be read from the Link response header and used to navigate to the TimeGate.
  2. Using the UNIX command line tool curl:
curl -o null.html -D headers.txt -H "Accept-Datetime: Sat, 03 Oct 2009 10:00:00 GMT" \
  http://your.wikiserver.here/your-title-here

For complete information about the Memento framework and its request and response headers, please refer to the IETF Memento Internet Draft.

Templates

MediaWiki, by default, retrieves the most recent version of a template when it is transcluded in an article. This extension allows datetime negotiation on transcluded templates, so that an old version of a page is rendered with the template versions that match it.

Special Pages

Special pages under the URL http://your.wikiserver.here/index.php/Special:SpecialPages do not have a history, i.e. there are no revisions of these pages. Hence, the Memento extension cannot perform datetime negotiation on these resources.

Deleted Contributions

This plugin does not make any deleted revisions accessible.

Timestamps

This extension searches for and retrieves the mementos using the modified time of an article. Timestamps are not unique identifiers and it is possible that an article will have more than one revision at any given time. This extension handles this situation by redirecting to the revision that has the highest revision id.
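The tie-breaking rule can be sketched in a few lines of Python. This is an illustration of the selection logic as described on this page, not the extension's actual PHP code. The function name select_revision and the sample revision data are hypothetical, and returning None when the requested datetime precedes every revision is a choice made for this sketch only.

```python
from datetime import datetime, timezone


def select_revision(revisions, requested):
    """Pick the revision that was current at `requested`.

    `revisions` is an iterable of (revision_id, timestamp) pairs.
    Returns the chosen revision_id, or None when every revision is newer.
    """
    candidates = [(ts, rev_id) for rev_id, ts in revisions if ts <= requested]
    if not candidates:
        return None
    # max() compares timestamps first and revision ids second, so revisions
    # sharing a timestamp are resolved in favour of the highest revision id.
    return max(candidates)[1]


def utc(*args):
    """Shorthand for building UTC datetimes in the sample data."""
    return datetime(*args, tzinfo=timezone.utc)


# Made-up history: revisions 102 and 103 share the same timestamp.
revisions = [
    (101, utc(2009, 10, 1, 9, 0, 0)),
    (102, utc(2009, 10, 3, 10, 0, 0)),
    (103, utc(2009, 10, 3, 10, 0, 0)),
    (104, utc(2009, 10, 7, 16, 30, 0)),
]

print(select_revision(revisions, utc(2009, 10, 5)))
```

Sorting on the (timestamp, revision id) pair keeps the rule in one place: the timestamp decides first, and the revision id only matters when timestamps collide.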

MediaWiki does not identify deleted revisions by revision id, but uses timestamps in their URIs instead. Hence, there is currently no way to resolve the situation in which more than one deleted revision has the same timestamp.

Wikis with Memento Plug-in Installed

See also

  1. An HTTP-Based Versioning Mechanism for Linked Data http://arxiv.org/abs/1003.3661
  2. RFC 2295. http://www.ietf.org/rfc/rfc2295.txt