InstantCommons



InstantCommons is a MediaWiki feature that allows any MediaWiki installation worldwide to use any media file uploaded to Wikimedia Commons. InstantCommons-enabled wikis cache Commons content so that each file is downloaded only once; subsequent pageviews load the local copy.

Rationale
As of January 2010, the Wikimedia Commons, a central media archive operated by the Wikimedia Foundation, contains almost 6 million files. Each of these files is either available under a free content license or in the public domain; there are no restrictions on use beyond those relating to the use of official insignia. Licenses that limit commercial use are considered non-free.

As awareness of the Commons grows, so does the desire of external parties to use its content and to contribute new material. It is currently technically possible to load images directly from Wikimedia's servers in the context of any webpage. Direct loading of this kind is problematic for several reasons:
 * It does not respect the license terms of the image, and it does not allow other metadata to be transported reliably.
 * It credits neither the author of the media file nor Wikimedia.
 * It consumes Wikimedia bandwidth on every pageview (unless the image has been cached on the client side or by a proxy).
 * It does not facilitate useful image operations such as thumbnail generation and captioning, and it is difficult to use in a wiki, particularly for standard layout operations.
 * It is tied to URLs as resource identifiers, which complicates mirroring.
 * It creates an untrackable web of external usage, in which any change on Wikimedia's side necessarily affects these external users.
 * It does not permit offline viewing, which is crucial in countries that have only intermittent network access.

InstantCommons seeks to address these problems by providing an easy method for cached loading of images and metadata from Wikimedia's servers. The first implementation of InstantCommons will be within MediaWiki, allowing all MediaWiki image operations (thumbnailing, captioning, galleries, etc.) to be performed transparently. However, other wiki engines can implement InstantCommons-like functionality using the API operations described below.
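As a rough illustration of what an API-based lookup by another wiki engine might involve, the sketch below builds a query URL for the MediaWiki action API on Commons, asking for a file's URL, size, hash, MIME type and extended metadata (which includes license and author information). The function name build_imageinfo_url and the particular choice of iiprop fields are assumptions made for this example, not part of InstantCommons itself; the sketch only constructs the URL and makes no network request.

```python
from typing import Optional
from urllib.parse import urlencode

# Public endpoint of the MediaWiki action API on Wikimedia Commons.
COMMONS_API = "https://commons.wikimedia.org/w/api.php"


def build_imageinfo_url(filename: str, thumb_width: Optional[int] = None) -> str:
    """Build an imageinfo query URL for one Commons file (hypothetical helper).

    filename is given without the "File:" namespace prefix.
    """
    params = {
        "action": "query",
        "format": "json",
        "titles": f"File:{filename}",
        "prop": "imageinfo",
        # Which fields to request is an assumption of this sketch.
        "iiprop": "url|size|sha1|mime|extmetadata",
    }
    if thumb_width is not None:
        # Ask the server to also return the URL of a scaled thumbnail.
        params["iiurlwidth"] = str(thumb_width)
    return f"{COMMONS_API}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_imageinfo_url("Example.jpg", thumb_width=320))
```

A client would fetch this URL once, store the returned file and metadata locally, and serve later requests from that local copy.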

Basic feature set
During installation, the site administrator can choose whether to enable InstantCommons. Ideally, however, the feature should be enabled by default (provided a writable upload directory is specified) so that the largest possible number of users can use Wikimedia Commons content.

If the feature is enabled, the wiki behaves like a Wikimedia project: if a page refers to an image or other media file that exists on Commons, the file can be included in the page by name, just like a locally uploaded file. Local filenames take precedence over Commons filenames.
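The lookup order described above (local file first, then the cached Commons copy, then a one-time download) can be sketched as follows. This is a minimal illustration, not MediaWiki's actual code: the class MediaResolver, its in-memory dictionaries and the fetch_from_commons callback are all hypothetical stand-ins for the real file repositories.

```python
from typing import Callable, Dict, Optional, Tuple


class MediaResolver:
    """Resolve a file name to its content (hypothetical sketch)."""

    def __init__(
        self,
        local_files: Dict[str, bytes],
        fetch_from_commons: Callable[[str], Optional[bytes]],
    ) -> None:
        self.local_files = local_files          # files uploaded to this wiki
        self.cache: Dict[str, bytes] = {}       # Commons files fetched earlier
        self.fetch_from_commons = fetch_from_commons

    def resolve(self, name: str) -> Optional[Tuple[str, bytes]]:
        # 1. Local filenames take precedence over Commons filenames.
        if name in self.local_files:
            return ("local", self.local_files[name])
        # 2. A previously downloaded Commons file is served from the cache.
        if name in self.cache:
            return ("cache", self.cache[name])
        # 3. Otherwise download once from Commons and keep the copy.
        data = self.fetch_from_commons(name)
        if data is None:
            return None  # the file exists neither locally nor on Commons
        self.cache[name] = data
        return ("commons", data)


# Usage with a fake Commons that records how often it is contacted.
calls = []


def fake_fetch(name: str) -> Optional[bytes]:
    calls.append(name)
    commons = {"Sunset.jpg": b"commons-sunset", "Logo.png": b"commons-logo"}
    return commons.get(name)


resolver = MediaResolver({"Logo.png": b"local-logo"}, fake_fetch)
```

In this example, Logo.png resolves to the local upload even though Commons has a file of the same name, and Sunset.jpg triggers only one call to fake_fetch no matter how often it is resolved.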

Implementation details
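See the configuration settings $wgUseInstantCommons and $wgForeignFileRepos. Setting $wgUseInstantCommons to true enables the feature with default settings; $wgForeignFileRepos allows more detailed configuration of foreign file repositories.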

Scalability considerations
Because the InstantCommons feature allows a wiki user to trigger downloads from the Wikimedia servers, it is crucial to rule out denial-of-service attacks against either the wiki using the feature or the Wikimedia Commons. For example, a user could paste 30 KB of links to the largest files on Wikimedia Commons into a wiki page and press "preview".

Therefore, every successful InstantCommons request will have to be logged by the InstantCommons-enabled wiki together with the originating user or IP address and the time of the request. If an individual user exceeds a generous internal bandwidth limit (which could be as high as 1 GB by default, but should be configurable), no further images will be downloaded for that user for 24 hours. This limit should not apply to wiki administrators: if a wiki administrator wants to conduct a denial-of-service attack against their own wiki, they do not need to be stopped from doing so; if they want to conduct an attack against Wikimedia, they cannot be stopped from doing so except on Wikimedia's end.
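One way to picture the request log and the per-user limit is the sketch below. It is an assumption-laden illustration, not the actual implementation: the class name DownloadQuota is hypothetical, the log is kept in memory rather than in a database, 1 GB is taken to mean 10^9 bytes, and the 24-hour period is interpreted as a sliding window over the logged requests.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple

DAY_SECONDS = 24 * 3600


class DownloadQuota:
    """Per-user download log with a bandwidth limit (hypothetical sketch)."""

    def __init__(self, limit_bytes: int = 10**9, window_seconds: int = DAY_SECONDS) -> None:
        self.limit_bytes = limit_bytes
        self.window_seconds = window_seconds
        # user name or IP address -> (timestamp, size in bytes) of each download
        self._log: Dict[str, Deque[Tuple[float, int]]] = defaultdict(deque)

    def record(self, user: str, size: int, now: float) -> None:
        """Log a successful download with its originating user and time."""
        self._log[user].append((now, size))

    def used(self, user: str, now: float) -> int:
        """Bytes downloaded by this user within the current window."""
        log = self._log[user]
        # Drop entries that have aged out of the window.
        while log and log[0][0] <= now - self.window_seconds:
            log.popleft()
        return sum(size for _, size in log)

    def may_download(self, user: str, is_admin: bool, now: float) -> bool:
        # Administrators are exempt from the limit.
        if is_admin:
            return True
        # Block only once the user has gone over the limit.
        return self.used(user, now) <= self.limit_bytes
```

A wiki would call may_download before fetching a file and record after each successful fetch, passing the current time (for example from time.time()) as now.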

In addition to the per-user bandwidth limit, there could be a limit on the size of files that are downloaded transparently. The main reason is that files above a certain size would delay pageviews significantly and might even cause the page request to time out. It might be desirable to use an external application to download these files, so that the download runs in the background without blocking the page request. Finally, there could be a total maximum size for the InstantCommons cache; if this size is exceeded, no further files would be downloaded.
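The two additional limits can be combined into a single decision per file, as in the sketch below. The names plan_download and Action and the parameter names are hypothetical, and the sketch makes one assumption beyond the text above: it refuses a download that would push the cache over its maximum, rather than waiting until the maximum has already been exceeded.

```python
from enum import Enum


class Action(Enum):
    FETCH_INLINE = "fetch_inline"          # download during the page request
    FETCH_BACKGROUND = "fetch_background"  # hand off to an external downloader
    SKIP = "skip"                          # do not download at all


def plan_download(
    file_size: int,
    cache_used: int,
    max_inline_size: int,
    max_cache_size: int,
) -> Action:
    """Decide how to handle one Commons file (all sizes in bytes)."""
    # The total cache limit applies first: a full cache means no download.
    if cache_used + file_size > max_cache_size:
        return Action.SKIP
    # Large files would delay the pageview, so fetch them in the background.
    if file_size > max_inline_size:
        return Action.FETCH_BACKGROUND
    return Action.FETCH_INLINE
```

For instance, with a 5 MB inline limit and a 1 GB cache limit, a 200 KB image is fetched during the page request, a 50 MB video is deferred to the background, and any file is skipped once it no longer fits in the cache.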

While it is unlikely that individual wikis using the InstantCommons feature would cause a significant increase in cost for the Wikimedia Foundation (since every file only has to be downloaded once, and there are per-user bandwidth limitations), it would nevertheless be fair and reasonable for projects using the feature to include a notice on InstantCommons description pages such as: "This file comes from Wikimedia Commons, a media archive hosted by the Wikimedia Foundation. If you would like to support the Wikimedia Foundation, you can donate here ..."

Future potential
In the future, it may be desirable to offer a publish/subscribe model for changes, which will require wiki-to-wiki authentication and a database of the images used in subscribing wikis. This would also open up the threat of cross-wiki vandalism, which could be addressed by a delay phase of 24 hours or more before changes take effect.

Two-way functionality is another possibility, that is, allowing free media to be uploaded directly to Commons from any wiki installation. However, this will require federated authentication at a minimum. It may also necessitate cross-wiki communication facilities to notify users from other wikis about Commons policies, which could be part of a larger project like LiquidThreads.