Extension:External Data/Local files

You can retrieve data from files stored on the same server as the wiki, provided they are in one of the following formats: CSV, GFF, HTML, INI, JSON, XML or YAML.

As of version 3.2, the recommended way to retrieve file data is to use one of the display functions (#external_value, #for_external_table, etc.), passing in the parameters needed for the data retrieval, most notably "source=". You can also retrieve file data by calling the #get_file_data function or (for version 3.0 and higher) #get_external_data. In all of these cases, you cannot (for privacy reasons) simply specify the local path of the file; you must instead register the file, or its directory, in the variable $wgExternalDataSources in LocalSettings.php.

Each of these parser functions also has a corresponding Lua function that you can call instead.

Usage

The following parameters are specific to retrieving file data:

  • |source= - the file or directory ID
  • |file name= - the file name, if the source is a directory
  • |archive path= - the path within the archive, if the file is a .zip, .rar, .tar, .tar.bz2 or .tar.gz archive. Can be a mask.
  • |archive depth= - depth of archive iteration (default is 2)

In addition, standard parameters such as |data= can be used, as can all of the parameters related to the parsing of data (|format=, |delimiter=, |use xpath=, etc.); see Parsing data.
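As a rough sketch of how the archive parameters combine with the parsing parameters, assume a source ID "Results archive" that points to a .zip file containing CSV files under a 2023/ folder. The ID, the archive layout, the column names and the use of * as the mask wildcard are all assumptions made for illustration, not details confirmed by the extension's documentation:

```wikitext
<!-- Hypothetical source ID, archive layout and column names -->
{{#get_file_data:
 source=Results archive
 |archive path=2023/*.csv
 |format=CSV
 |data=Test date=Date,Result details=Notes
}}
```

Here |file name= is left out on the assumption that the source ID refers to the archive file itself rather than to a directory.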

If you want to give the wiki access to one or a small number of files, you could add one or more lines like the following to LocalSettings.php:

$wgExternalDataSources['ID']['path'] = 'local file path';

You would then set "source=" to the ID for that file.
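For instance, here is a minimal sketch that assumes the ID "Price list" was registered for a single CSV file with columns named Item and Price. The ID and the column names are hypothetical:

```wikitext
<!-- "Price list", Item and Price are hypothetical names -->
{{#get_file_data:
 source=Price list
 |format=CSV
 |data=item=Item,price=Price
}}
```

Because the source points directly to a file, no "file name=" parameter is needed.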

If instead you want the wiki to be able to access every file in one or more directories, you could add one or more lines like the following to LocalSettings.php:

$wgExternalDataSources['ID']['path'] = 'local directory path';

You would then set "source=" to the ID of that directory, and "file name=" to the name of the file you want to access. Note that External Data prevents users from reaching directories outside of the specified one by adding "../.." or similar sequences to the file name.

Examples

To give an example, let's say that a lab wants to publish test results on their wiki. The results are all in CSV files in one directory on a server. So, they might add the following to LocalSettings.php:

$wgExternalDataSources['Our test results']['path'] = '/home/genomelab/data/TestResults/';

Then, a #get_file_data call on the wiki might look like this:

{{#get_file_data:
 source=Our test results
 |file name=JanuaryData.csv
 |format=CSV
 |data=Test date=Date,Study size=Study size,Researchers=Researchers,Result details=Notes
}}

Below that, there would presumably be a call to #for_external_table or #display_external_table to display the resulting data.
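Such a display call could be sketched as follows, using the local variable names defined by |data= in the call above. The table layout and header labels are assumptions, and the sketch relies on the {{!}} magic word to emit pipe characters inside the parser function:

```wikitext
<!-- Table layout and header labels are illustrative choices -->
{| class="wikitable"
! Date
! Study size
! Researchers
! Notes
{{#for_external_table:<nowiki/>
{{!}}-
{{!}} {{{Test date}}}
{{!}} {{{Study size}}}
{{!}} {{{Researchers}}}
{{!}} {{{Result details}}}
}}
|}
```

Each row of JanuaryData.csv would then be rendered as one table row.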

Directory iteration

It is also possible to process all files in a directory, optionally restricted to those whose names match a mask. For example, to get a table of all files within the "ExternalData/includes" directory, along with the name of the class defined in each one, you could add the following to LocalSettings.php:

$wgExternalDataSources['classes']['path'] = "$wgExtensionDirectory/ExternalData/includes";

Then call the following:

{{#get_file_data:
 source=classes
 |file name=*.php
 |format=text
 |regex=/^\s*(?<abstract>abstract)?\s*class\s+(?<class>\w+)(\s+extends\s*(?<extends>\w+))?/m
 |data=file=__file,abstract=abstract,class=class,base=extends
}}

The external variable __file will hold the file name, relative to $wgExternalDataSources['classes']['path'].