Extension:External Data

Description
External Data is a MediaWiki extension that lets wiki pages use values retrieved from an external URL that holds data in XML, CSV or JSON format.

It defines two parser functions: #get_external_data and #external_value. #get_external_data retrieves the data from a URL that holds XML, CSV or JSON data, and assigns it to variables on the page; and #external_value displays the value of any such variable.

In this way, External Data resembles the Variables extension (not to be confused with the other Variables extension), which also allows for setting and then displaying variables; though in this case, the values originate from an external URL, not from the page itself.

The values of XML and JSON files are determined simply from the content of tags, regardless of the tree structure. So, for example, if the tag "<color>red</color>" appeared anywhere in an XML file, the value "red" would become associated with the variable "color".
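
The tag-based lookup described above can be illustrated with a short Python sketch. This is not the extension's own code (which is PHP); the function name flatten_xml and the sample XML are made up for illustration, and the sketch only assumes that every tag name is mapped to the text found inside it, wherever the tag sits in the tree.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def flatten_xml(xml_text):
    """Map each tag name to the list of text values found in that tag,
    ignoring the tag's position in the tree (hypothetical helper)."""
    values = defaultdict(list)
    # iter() walks the root and every descendant, at any depth.
    for element in ET.fromstring(xml_text).iter():
        text = (element.text or "").strip()
        if text:
            values[element.tag].append(text)
    return dict(values)


# Hypothetical sample: "color" is nested two levels down, "size" one level.
sample = "<item><details><color>red</color></details><size>large</size></item>"
flat = flatten_xml(sample)
print(flat["color"])
```

Here "color" is found even though it is nested inside another tag, which is the behaviour the paragraph above describes.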

Currently, CSV data must be strictly comma-delimited for parsing to work; data delimited by tabs or any other character is not supported.

You can also set caching to be done on the data retrieved, and string replacement to hide API keys; see the "Usage" section, below, for how to do both of those.

Download
You can download the External Data code as either of these two compressed files:


 * external_data_0.3.tar.gz
 * external_data_0.3.zip

You can also download the code directly via SVN from the MediaWiki source code repository, at http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/ExternalData/. From a command line, you can call the following:

svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/ExternalData/

You can also view the code online here.

Installation
To install this extension, create an 'ExternalData' directory (either by extracting a compressed file or downloading via SVN), and place this directory within the main MediaWiki 'extensions' directory. Then, in the file 'LocalSettings.php', add the following line:

Authors
External Data was written by Yaron Koren (reachable at yaron57 -at- gmail.com) and Michael Dale.

Usage
To get data from an external URL, call the following:

...where 'URL' is the full URL of the XML, CSV or JSON file; 'format' is one of 'XML', 'CSV' or 'JSON'; the "external variable names" are the names of the values in the file (in the case of a CSV file, the names are simply the indexes of the values: "1", "2", "3", etc.); and the "internal variable names" are the local names that are later passed to #external_value.

A page can contain more than one #get_external_data call. In that case, make sure that every local variable name is unique across all of the calls.

To display data that was retrieved, call the following:

As an example, this page contains the following text:

.
 * Germany borders the following countries and bodies of water:
 * Germany has population.

The page gets data from a URL at semanticweb.org, generated by the Semantic MediaWiki extension. That URL contains the following text:

Germany,"North Sea,Denmark,Baltic Sea,Poland,Czech Republic,Austria,Switzerland,France,Luxembourg,Belgium,Netherlands","82,411,000",3.5705e+11 m²

Because the data is in CSV format, the "external variable names" are simply indexes. The values are set to the local variables 'bordered countries', 'population' and 'area', respectively.
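
To make the index-based naming concrete, here is a rough Python sketch that parses the line above and numbers its fields from "1". It is only an illustration, not the extension's PHP code: the helper index_csv_row is hypothetical, and the mapping of indexes "2", "3" and "4" to the three local variable names is an assumption based on the order of the fields.

```python
import csv
import io

line = ('Germany,"North Sea,Denmark,Baltic Sea,Poland,Czech Republic,Austria,'
        'Switzerland,France,Luxembourg,Belgium,Netherlands","82,411,000",'
        '3.5705e+11 m²')


def index_csv_row(text):
    """Parse one comma-delimited line and name each field by its
    1-based index, as a string (hypothetical helper)."""
    row = next(csv.reader(io.StringIO(text)))
    return {str(position): value for position, value in enumerate(row, start=1)}


external = index_csv_row(line)

# Assumed mapping of local variable names to external indexes.
mapping = {"bordered countries": "2", "population": "3", "area": "4"}
local = {name: external[index] for name, index in mapping.items()}
print(local["population"])
```

Note that the quoted fields keep their internal commas, so "82,411,000" stays a single value rather than being split into three.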

The page then uses #external_value to display the 'bordered countries' and 'population' values. For the 'bordered countries' value, it also uses the #arraymap function, defined by the Semantic Forms extension, to apply some transformations (you can ignore this detail if you want).

Data caching
You can configure External Data to cache the data from the URLs that it accesses, both to improve performance and to keep the data available if any of those external URLs becomes inaccessible. To do this, run the SQL contained in the extension file 'ExternalData.sql' on your database, which will create the table 'ed_url_cache'; then add the following to your LocalSettings.php file, after the inclusion of ExternalData:

String replacement in URLs
One or more of the URLs you use may contain a string that you would prefer to keep secret, like an API key. If that's the case, you can use the array $edgStringReplacements to specify a dummy string you can use in its place. For instance, let's say you want to access the URL "http://worlddata.com/api?country=Guatemala&key=123abcd", but you don't want anyone to know your API key. You can add the following to your LocalSettings.php file, after the inclusion of ExternalData:

$edgStringReplacements['WORLDDATA_KEY'] = '123abcd';

Then, in your call to #get_external_data, you can replace the real URL with: "http://worlddata.com/api?country=Guatemala&key=WORLDDATA_KEY".
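
The substitution can be pictured with the following Python sketch. It is an illustration only, not the extension's PHP implementation: the function apply_string_replacements is hypothetical, and the sketch assumes that each dummy string is simply replaced by its secret value before the URL is fetched.

```python
def apply_string_replacements(url, replacements):
    """Swap every dummy string in the URL for its real, secret value
    (hypothetical helper mirroring the idea of $edgStringReplacements)."""
    for placeholder, real_value in replacements.items():
        url = url.replace(placeholder, real_value)
    return url


# Kept server-side, so the real key never appears in the wiki text.
replacements = {"WORLDDATA_KEY": "123abcd"}

# The URL as written on the wiki page.
public_url = "http://worlddata.com/api?country=Guatemala&key=WORLDDATA_KEY"
real_url = apply_string_replacements(public_url, replacements)
print(real_url)
```

Only the placeholder version of the URL is visible to people who read or edit the page; the real key lives in the server configuration.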

Version
External Data is currently at version 0.3. Below is the version history:


 * 0.1 - January 12, 2009 - initial version
 * 0.2 - January 14, 2009 - support for JSON data added
 * 0.3 - February 3, 2009 - optional database caching added; string replacement in URLs added

Bugs and feature requests
Send any bug reports, requests or code patches to Yaron Koren, at yaron57 -at- gmail.com.