Extension:RDFIO

From MediaWiki.org
RDF IO

Release status: beta

Description Extended RDF import and export capabilities in Semantic MediaWiki, including a fully PHP-based SPARQL Endpoint
Author(s) Samuel Lampa, Denny Vrandečić
Last version 0.5.0 (2010-09-17)
MediaWiki 1.16.* or greater
License GPL

Current status: Incompatible with SMW 1.6 - [FIX IN PROGRESS]

UPDATE, 2011-10-28: I am now taking a week off from work to get RDFIO into a usable state. See this blog post for details!
// SHL 22:08, 28 October 2011 (UTC)

Caution: Unfortunately, RDFIO is not yet compatible with the latest version (1.6) of Semantic MediaWiki. I am working on these bugs in the evenings (after work) and hope to have it working again within a week or two. If you want to try RDFIO before that, I suggest using SMW 1.5 and a branched version of RDFIO (e.g. http://svn.wikimedia.org/svnroot/mediawiki/branches/REL1_17/extensions/RDFIO/), since the trunk version might break backwards compatibility with 1.5 while it is being made to work with 1.6.

Introduction

This extension extends the RDF import and export functionality in Semantic MediaWiki by providing import of arbitrary RDF triples (not only OWL ontologies, as was the case before), and a SPARQL endpoint that allows write operations.

Technically, RDFIO builds on the PHP/MySQL-based triple store (and its accompanying SPARQL endpoint) provided by the ARC2 library. When an import or a SPARQL update adds new triples, the wiki pages are updated through the SMWWriter extension (which in turn makes use of the Page Object Model extension).

The RDF import stores the original URI of every imported RDF entity in a special property. The SPARQL endpoint can later use these URIs instead of SMW's internal URIs, which makes it possible to expose the imported RDF data "in its original format", with its original URIs. This lets SMW serve as a collaborative RDF editor in workflows together with other semantic tools, where it is then possible to "export, collaboratively edit, and import again", to and from SMW.

This extension was developed as part of a Summer of Code 2010 project. The project description can be found here. See also the status page, with info on how you can follow the project.

Caution: This extension is not yet ready for production use! Use it at your own risk!

Demo

Alternative triple store connectors / SPARQL Endpoints

One of the features of RDFIO is to connect Semantic MediaWiki with a triple store and to provide a SPARQL endpoint. A few other extensions already offer this feature; see this page for an overview of triple store connector features. (RDFIO focuses mainly on the RDF import functionality, and a merge of some or all of these extensions is being discussed.)

Download

SVN Checkout URL (or browse):

http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/RDFIO/
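
On the command line, choosing between the trunk and the branched version mentioned in the status note could be sketched as follows. Both URLs are taken from this page; the function name `rdfio_svn_url` and the version matching are illustrative assumptions, not part of RDFIO.

```shell
# Sketch: print the RDFIO checkout URL for a given SMW version.
# rdfio_svn_url is a hypothetical helper name.
rdfio_svn_url() {
  smw_version="$1"
  case "$smw_version" in
    1.5|1.5.*)
      # SMW 1.5: this page suggests a branched version, e.g. REL1_17.
      printf '%s\n' "http://svn.wikimedia.org/svnroot/mediawiki/branches/REL1_17/extensions/RDFIO/"
      ;;
    *)
      # Any other version: the trunk URL given above.
      printf '%s\n' "http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/RDFIO/"
      ;;
  esac
}

# Usage (run from wiki/extensions; needs network access):
#   svn checkout "$(rdfio_svn_url 1.5)" RDFIO
```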

Installation

About these instructions

Disclaimer: This manual currently mainly supports Ubuntu, but we will gladly include contributed hints on how to do the same things in Windows.

Command-line code snippets for Ubuntu are included in order to simplify things. These are not guaranteed to work, or to be the best option, in all circumstances though! (See the website of each respective software/extension for full details.)

Preliminary steps

Installing MediaWiki

If you do not yet have a Semantic MediaWiki installation, start by installing MediaWiki:

  • Create a MySQL database and a user, with phpMyAdmin or similar.
  • To check out the MediaWiki 1.16 version, do:
cd wiki/
svn checkout http://svn.wikimedia.org/svnroot/mediawiki/tags/REL1_16_0/phase3/ .
chmod a+w config/
  • Go to your wiki (e.g. http://localhost/wiki) and enter the installation options, including the database and user that you created above.
  • Run the installation
  • In the commandline again, in the wiki/ folder:
mv config/LocalSettings.php .
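
If you prefer the command line over phpMyAdmin for the first step, the database and user could be created roughly as follows. This is only a sketch: the helper name `mw_db_sql` and the example names `wikidb` and `wikiuser` are made up for illustration, and the arguments are inserted into the SQL without any escaping.

```shell
# Sketch: print the SQL that creates a wiki database and a user for it.
# mw_db_sql is a hypothetical helper; arguments are NOT escaped.
mw_db_sql() {
  db="$1"
  user="$2"
  pass="$3"
  cat <<SQL
CREATE DATABASE $db;
CREATE USER '$user'@'localhost' IDENTIFIED BY '$pass';
GRANT ALL PRIVILEGES ON $db.* TO '$user'@'localhost';
FLUSH PRIVILEGES;
SQL
}

# Usage (prompts for the MySQL root password):
#   mw_db_sql wikidb wikiuser 'change-me' | mysql -u root -p
```

Printing the statements instead of running them directly lets you review them before piping them into the `mysql` client.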

Installing Semantic MediaWiki

cd wiki/extensions
sudo rm -r .svn

The last command removes the Subversion metadata from the extensions folder, because we will check out another repository into this folder:

svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/SemanticMediaWiki/

Add the following lines to your LocalSettings.php file, replacing the server domain name with your own (something like "localhost" if you are running on your local machine):

/* SMW */
include_once("$IP/extensions/SemanticMediaWiki/SemanticMediaWiki.php");
# Make sure to configure this correctly, otherwise some
# functionality of the SPARQL endpoint will not work:
enableSemantics('your-server-domain-name');

/* Show the factbox on bottom of pages */
$smwgShowFactbox = SMW_FACTBOX_NONEMPTY;
  • Important: Make sure to configure the server domain name, above, correctly!
  • Log in to the wiki as a super user
  • Go to the article Special:SMWAdmin
  • Click "Initialise or upgrade tables"
  • Done!
  • (More info on download and installation)
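
Since a wrong server domain name breaks parts of the SPARQL endpoint, a quick check that the placeholder from the snippet above has been replaced might look like this. The function name `smw_domain_configured` is an assumption, and the check is deliberately crude: it only looks for the literal placeholder and does not validate the domain itself.

```shell
# Sketch: succeed only if LocalSettings.php calls enableSemantics()
# with something other than the placeholder from the snippet above.
# smw_domain_configured is a hypothetical helper name.
smw_domain_configured() {
  settings="$1"
  # The call must be present at all ...
  grep -q "enableSemantics(" "$settings" || return 1
  # ... and must not still carry the placeholder value.
  ! grep -q "enableSemantics('your-server-domain-name')" "$settings"
}

# Usage (from the wiki/ folder):
#   smw_domain_configured LocalSettings.php || echo "Set the server domain name!"
```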

Installing RDFIO

This assumes that you have a working Semantic MediaWiki installation (tested with MW 1.16 and later).

It is STRONGLY recommended that you use the latest version from the svn trunk, since that is what this extension is tested against continually!

  • Install SMWWriter and PageObjectModel and RDFIO, according to instructions on this page, or run the following commands:
cd wiki/extensions
svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/SMWWriter/
svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/PageObjectModel
svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/extensions/RDFIO
  • You do not have to add include lines for SMWWriter and PageObjectModel, since RDFIO adds them for you!
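
After the checkouts, a small check that all three extension folders are in place could be sketched like this. The folder names are the ones created by the commands above; the function name `missing_extensions` is an illustrative assumption.

```shell
# Sketch: list which of the three required extension folders are missing.
# missing_extensions is a hypothetical helper name.
missing_extensions() {
  ext_dir="$1"
  for name in SMWWriter PageObjectModel RDFIO; do
    # Print the name of every folder that does not exist.
    [ -d "$ext_dir/$name" ] || printf '%s\n' "$name"
  done
}

# Usage: prints nothing when everything is in place.
#   missing_extensions wiki/extensions
```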

LocalSettings.php configuration

  • Add the following lines to LocalSettings.php:
/**************************
 *                        *
 *         RDF IO         * 
 *                        *
 **************************/

require_once("$IP/extensions/RDFIO/RDFIO.php");

### Configuration ######################################
# You may modify the following settings to your liking #
########################################################
$rdfiogUseNSPrefixInWikiTitleForProperties = false;
$rdfiogUseNSPrefixInWikiTitleForEntities = true;

# An associative array with base uris as keys and corresponding 
# prefixes as the items. Example:
# array( 
#       "http://example.org/someOntology#" => "ont1",
#       "http://example.org/anotherOntology#" => "ont2"
#      );
$rdfiogBaseURIs = array();


# Query by / output Original URIs in the SPARQL Endpoint
# (overrides settings in HTML Form)
$rdfiogQueryByOrigURI = true;
$rdfiogOutputOrigURIs = true;

# Query by / output Equivalent URIs in the SPARQL Endpoint
# (overrides settings in HTML Form)
$rdfiogQueryByEquivURI = false;
$rdfiogOutputEquivURIs = false;

$rdfiogPropertiesToUseAsWikiTitle = array(
  'http://semantic-mediawiki.org/swivt/1.0#page',
  'http://www.w3.org/2000/01/rdf-schema#label',
  'http://purl.org/dc/elements/1.1/title',
  'http://www.w3.org/2004/02/skos/core#prefLabel',
  'http://xmlns.com/foaf/0.1/name',
  'http://www.nmrshiftdb.org/onto#spectrumId'
);

# Needed in order to allow user defined properties for
# property articles (which is needed by the RDF import)
$smwgOWLFullExport = true;

# Allow edit operations via SPARQL from remote services
$rdfiogAllowRemoteEdit = false;
  • Optional: Edit the $rdfiogPropertiesToUseAsWikiTitle array according to your liking.
  • Download the ARC2 library from here, then extract it and place it in `wiki/extensions/SemanticMediaWiki/libs/arc`, so that the file `ARC2.php` ends up directly in the `arc` folder. With the git client (on Ubuntu, install it with sudo apt-get install git if you don't have it), the commands are:
 cd wiki/extensions/SemanticMediaWiki/libs
 git clone https://github.com/semsol/arc2.git
 mv arc2 arc
  • (RDFIO adds include lines for SMWWriter and ARC)
  • Log in to your wiki as a super user
  • Edit the main page and add the following wiki snippet, which will give you links to the main functionality with RDFIO:
* [[Special:ARC2Admin|ARC2Admin]]
* [[Special:RDFImport|RDFImport]]
* [[Special:SPARQLEndpoint|SPARQLEndpoint]]
  • Browse to http://[your-domain]/wiki/Special:ARC2Admin
  • Click the "Setup" button to set up the database tables.
  • Note: If you already have semantic annotations in your wiki, go to the article "Special:SMWAdmin" in your wiki, click "Start updating data" and let it complete, so that the data becomes available in the SPARQL endpoint.
  • Create the article "MediaWiki:Smw_uri_blacklist" and make sure it is empty (you might need to add some nonsense content like {{{}}}).
  • Now, try adding some semantic data to wiki pages, and then check the database (e.g. using phpMyAdmin) to see whether you get some triples in the table named `arc2store_triple`
  • Access the SPARQL endpoint at http://[url-to-your-wiki]/Special:SPARQLEndpoint
  • Access the RDF Import screen at http://[url-to-your-wiki]/Special:RDFImport
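
The last few steps can also be checked from the command line. The following is a minimal sketch under stated assumptions: all function names are made up for illustration, the triple count uses the standard `mysql` client, and the query helper assumes that the endpoint accepts the query in a `query` parameter, as the SPARQL protocol specifies. That assumption has not been verified against RDFIO itself.

```shell
# Build the URL of a special page from the wiki's base URL.
special_page_url() {
  base="${1%/}"   # drop one trailing slash, if any
  printf '%s/Special:%s\n' "$base" "$2"
}

# Succeed if ARC2.php sits where the instructions above put it.
# $1 is the path to the wiki/ folder.
arc2_in_place() {
  [ -f "$1/extensions/SemanticMediaWiki/libs/arc/ARC2.php" ]
}

# Count the triples in the ARC2 store (prompts for the password).
# $1 is the database name, $2 the database user.
count_triples() {
  mysql -u "$2" -p "$1" -e 'SELECT COUNT(*) FROM arc2store_triple;'
}

# Send a SPARQL query to the endpoint with curl.
# ASSUMPTION: the endpoint reads the query from a "query" parameter.
run_sparql() {
  curl -G --data-urlencode "query=$2" "$(special_page_url "$1" SPARQLEndpoint)"
}

# Usage:
#   arc2_in_place wiki || echo "ARC2.php not found"
#   count_triples wikidb wikiuser
#   run_sparql "http://localhost/wiki" 'SELECT * WHERE { ?s ?p ?o } LIMIT 10'
```

Nothing runs when the snippet is sourced; each function has to be called explicitly, so the database and the network are only touched on request.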

Additional steps

How to apply a patch in Windows

(Kindly contributed by Oleg Simakoff - osimakoff at gmail.com)

Patching Instructions for Windows (XP, Vista, 7) users. Note: the following instructions assume that you have installed TortoiseSVN Subversion Client.

  1. Obtain a `[patch-filename].patch` file.
  2. Copy the file into the directory containing the files to be patched.
  3. Right-click the `[patch-filename].patch` file to bring up a menu and select `[TortoiseSVN] | [Apply patch...]`.
  4. Your selection will bring up a `[TortoiseMerge]` window and `[File Patches]`.
  5. Right-click a file you would like to patch to preview it, patch the selected file, or patch all files.
  6. After patching is done, press `[Save]` to save your work.

See also

Dependencies

Use cases

Editing Semantic MediaWiki from Bioclipse

Chemists and biologists using Bioclipse can now take their working data and export it to a wiki where their peers can make corrections, before importing it again for further analysis, etc. This workflow is possible today, as hinted in this blog post / screencast, and is the focus of current research/development (progress documented on the blog) in the Bioclipse group of prof. Jarl Wikberg at Dept. of Pharm. biosciences, Uppsala University.

Bugs, feature requests and contact information

Please report bugs and feature requests in Bugzilla.

Note: Please add my e-mail address, samuel.lampa[at]gmail.com as "assign-to" or "cc", so that I get notified, and add the string "[RDF]" to the subject line, in order to make the issues easy to collect.

General feedback can be given here on the talk page.

Change Log

  • 0.5.0 - 2010-09-17 - Numerous fixes to make remote SPARQL querying work (See repository updates for a list of all commits).
    • Improved file hierarchy
    • Made querying by, and output of, Original URIs and Equivalent URIs in the SPARQL endpoint configurable from LocalSettings.php (so this can be turned on for remote queries too)
    • In total five new configurable settings for LocalSettings.php (see here for full list):
      • $rdfiogQueryByOrigURI = true;
      • $rdfiogOutputOrigURIs = true;
      • $rdfiogQueryByEquivURI = false;
      • $rdfiogOutputEquivURIs = false;
      • $rdfiogAllowRemoteEdit = false;
    • Fixed many serious bugs encountered while making SPARQL querying from Bioclipse/Jena work
  • 0.4.0 - 2010-08-16
    • Support for configuring extra namespace prefixes in LocalSettings.php
    • More options in RDF Import screen
    • Output SPARQL resultset as default for remote queries, and HTML for form queries
    • Enable output as Original URI/Equivalent URI also for XML Resultset output format
    • Refactorings (Merged EquivalentURIHandler and SMWBatchWriter classes, Broke out RDFIOPageHandler in separate file)
    • Many bugfixes
  • 0.3.0 - 2010-07-30 - Added output filtering options and other improvements.
    • Option to query by Equivalent URI
    • Refined SPARQL Endpoint screen
    • Option to output all Equivalent URIs (For RDF/XML format only)
    • Option to filter properties by ontology (when outputting equivalent URIs) by specifying a URL to an OWL ontology definition. (For RDF/XML format only).
    • Much improved processing of SPARQL queries
    • Various refactoring
    • Fixed various bugs
      • Initialize query variable (r150)
      • Don't delete Original URI properties etc when deleting other facts (r151)
      • Fixed error in isURL check (r153)
  • 0.2.0 - 2010-07-20 - Important security improvements
    • Checking for appropriate user rights on all special pages
    • Improved code structure and comments
    • Various small fixes
  • 0.1.0 - 2010-07-21 - First release
