MW2SPARQL

From mediawiki.org
Caution: This is an experimental project. Anything could change without notice.

MediaWiki 2 SPARQL (MW2SPARQL) is an experimental SPARQL endpoint for Wikimedia wikis that works on top of their databases. It currently supports the English, French and German Wikipedias. It is based on Ontop and rewrites SPARQL queries to SQL. The source code is on GitHub and the endpoint is available at: http://mw2sparql.toolforge.org/sparql

There is no user interface for this tool yet. In the meantime, you can query it from the Wikidata Query Service (WDQS) using the SERVICE <http://mw2sparql.toolforge.org/sparql> { ... } syntax.

The mapping is still under development: it is very partial and not yet stable.

Query examples

Retrieve some data available about the English Wikipedia SPARQL article:

PREFIX mw: <http://mw2sparql.toolforge.org/ontology#>
SELECT ?predicate ?object WHERE {
  SERVICE <http://mw2sparql.toolforge.org/sparql> {
    <https://en.wikipedia.org/wiki/SPARQL> ?predicate ?object .
  }
} LIMIT 100

Retrieve some data about the English Wikipedia article connected to the Wikidata item about SPARQL:

PREFIX mw: <http://mw2sparql.toolforge.org/ontology#>
SELECT ?article ?predicate ?object WHERE {
  ?article schema:about wd:Q54871 ;
           schema:isPartOf <https://en.wikipedia.org/> .
  SERVICE <http://mw2sparql.toolforge.org/sparql> {
    ?article ?predicate ?object .
  }
} LIMIT 100

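Retrieve the items with an English Wikipedia article in the Query languages category. The hint:Query hint:optimizer "None" line disables the WDQS join-order optimizer, so the patterns are evaluated in the order written and the SERVICE call runs first: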

PREFIX mw: <http://mw2sparql.toolforge.org/ontology#>
SELECT ?item ?itemLabel ?page WHERE {
  hint:Query hint:optimizer "None" .
  SERVICE <http://mw2sparql.toolforge.org/sparql> {
    ?page mw:inCategory <https://en.wikipedia.org/wiki/Category:Query_languages> .
  }
  ?page schema:about ?item .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
} LIMIT 100

Current data model

Pages

A MediaWiki page. It has the type mw:Page, carries a mw:pageId and a schema:name, and belongs to the namespace given by mw:pageNamespaceId. Its content model is given by mw:pageContentModelId. Internal links to other pages are expressed with mw:internalLinkTo, category membership with mw:inCategory, and included pages with mw:includesPage. If the page is a redirect, its target is available through mw:redirectsTo. Its latest revision is connected through mw:pageLatestRevision.

New pages additionally have the type mw:NewPage, and redirect pages the type mw:RedirectPage.

Example:

<https://en.wikipedia.org/wiki/SPARQL> a mw:Page ;
    mw:pageId "2574343"^^xsd:integer ;
    mw:pageNamespaceId "0"^^xsd:integer ;
    schema:name "SPARQL"@en ;
    mw:pageContentModelId "wikitext" ;
    mw:pageLatestRevision <https://en.wikipedia.org/revision/143302398> ;
    mw:inCategory <https://en.wikipedia.org/wiki/Category:Declarative_programming_languages> , <https://en.wikipedia.org/wiki/Category:Query_languages> ;
    mw:internalLinkTo <https://en.wikipedia.org/wiki/API> , <https://en.wikipedia.org/wiki/Database> ;
    mw:includesPage <https://en.wikipedia.org/wiki/Module:Infobox> .
<https://en.wikipedia.org/wiki/Champollion> a mw:Page, mw:RedirectPage ;
    mw:pageId "2574343"^^xsd:integer ;
    mw:pageNamespaceId "0"^^xsd:integer ;
    schema:name "Champollion"@en ;
    mw:pageContentModelId "wikitext" ;
    mw:pageLatestRevision <https://en.wikipedia.org/revision/770096562> ;
    mw:inCategory <https://en.wikipedia.org/wiki/Category:Surnames> ;
    mw:redirectsTo <https://en.wikipedia.org/wiki/Jean-François_Champollion> .
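
As an illustration of how the page properties described above can be combined, the following query is a minimal sketch that lists the redirect pages pointing to the English Wikipedia SPARQL article, together with their names. It is meant to be run from the Wikidata Query Service, like the query examples earlier on this page. The choice of target article and the variable names are illustrative assumptions, and since the mapping is not stable the endpoint may return different properties or no results at all.

```sparql
PREFIX mw: <http://mw2sparql.toolforge.org/ontology#>
PREFIX schema: <http://schema.org/>

# Assumption: the endpoint exposes mw:RedirectPage, mw:redirectsTo and
# schema:name as described in the data model above.
SELECT ?redirect ?name WHERE {
  SERVICE <http://mw2sparql.toolforge.org/sparql> {
    # Pages typed as redirects whose target is the SPARQL article
    ?redirect a mw:RedirectPage ;
              mw:redirectsTo <https://en.wikipedia.org/wiki/SPARQL> ;
              schema:name ?name .
  }
} LIMIT 100
```

Replacing mw:redirectsTo with mw:internalLinkTo (and dropping the mw:RedirectPage type constraint) would, under the same assumptions, list pages that link to the article instead.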

Revision

A version of a MediaWiki page. It has the type mw:Revision and an oldid stored in mw:revisionId, belongs to the page given by mw:revisionOfPage, and may have a previous revision given by mw:parentRevision.

Example:

<https://en.wikipedia.org/revision/143302398> a mw:Revision , mw:MinorRevision ;
    mw:revisionOfPage <https://en.wikipedia.org/wiki/Champollion> ;
    schema:contentSize "14933"^^xsd:integer ;
    schema:comment "typo" ;
    mw:revisionId "136823481"^^xsd:integer ;
    mw:revisionUserName "Foo" .
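
Pages and revisions can be joined through mw:pageLatestRevision. The query below is a minimal sketch, assuming the properties behave as in the examples above, that fetches the latest revision of the English Wikipedia SPARQL article with its revision ID and the name of the user who made it. The variable names are illustrative, and mw:revisionUserName is taken from the example above rather than from the prose description, so it may not be available for every revision.

```sparql
PREFIX mw: <http://mw2sparql.toolforge.org/ontology#>

# Assumption: every page has exactly one mw:pageLatestRevision, and that
# revision carries both mw:revisionId and mw:revisionUserName.
SELECT ?revision ?revisionId ?userName WHERE {
  SERVICE <http://mw2sparql.toolforge.org/sparql> {
    # Follow the link from the page to its latest revision
    <https://en.wikipedia.org/wiki/SPARQL> mw:pageLatestRevision ?revision .
    # Read the properties of that revision
    ?revision mw:revisionId ?revisionId ;
              mw:revisionUserName ?userName .
  }
} LIMIT 10
```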

Known issues

  • GROUP BY is not supported, and this is unlikely to be fixed.
  • The property* and property+ property path constructions are not supported either.
  • The URL encoding of article IRIs differs from the one used by the current version of the Wikidata Query Service. The WDQS is planning to move to the usual MediaWiki URL encoding soon.
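
The GROUP BY limitation applies to queries evaluated by the MW2SPARQL endpoint itself. When the endpoint is called from the Wikidata Query Service, one possible workaround is to keep only plain triple patterns inside the SERVICE block and aggregate in the outer query, so that the grouping is done by WDQS. The sketch below counts, per category, the pages linked from the English Wikipedia SPARQL article. It assumes that WDQS sends only the inner graph pattern to the remote endpoint, and the variable names are illustrative. It has not been checked against the live endpoint.

```sparql
PREFIX mw: <http://mw2sparql.toolforge.org/ontology#>

# Aggregation (COUNT / GROUP BY) is written outside the SERVICE block,
# so it is not part of the pattern sent to the MW2SPARQL endpoint.
SELECT ?category (COUNT(?page) AS ?pageCount) WHERE {
  SERVICE <http://mw2sparql.toolforge.org/sparql> {
    # Plain triple patterns only: no GROUP BY, no property paths
    <https://en.wikipedia.org/wiki/SPARQL> mw:internalLinkTo ?page .
    ?page mw:inCategory ?category .
  }
}
GROUP BY ?category
ORDER BY DESC(?pageCount)
LIMIT 20
```

Because all matching rows must first be transferred to WDQS, this approach is only practical when the inner pattern is selective enough to return a modest number of results.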

See also

  • MWAPI - The MediaWiki API Service allows calling the MediaWiki API from SPARQL and receiving the results inside the SPARQL query.