Wikidata Query Service/Categories

Wikidata Query Service also provides access to the category graph of all public wikis (except labswiki and labtestwiki).

Currently, the data is updated from the latest weekly dump (see Data dumps). Updates happen each Monday.

Accessing the data
The data is stored in the Blazegraph database in the categories namespace. Currently, there is no GUI for accessing the category data, but SPARQL queries can be run against the namespace at https://query.wikidata.org/bigdata/namespace/categories/sparql?query=SPARQL, where SPARQL stands for the URL-encoded query text. This SPARQL endpoint works in the same way as the main WDQS SPARQL endpoint.
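As a minimal sketch of calling this endpoint from a script, the snippet below builds the GET URL for a query using only the Python standard library. The helper name build_query_url and the sample query are illustrative assumptions; the format=json parameter is the one used in the example link on this page.

```python
from urllib.parse import urlencode

# Endpoint of the categories namespace, as given in the text above.
CATEGORIES_ENDPOINT = "https://query.wikidata.org/bigdata/namespace/categories/sparql"


def build_query_url(sparql: str, result_format: str = "json") -> str:
    """Return a GET URL that submits `sparql` to the categories namespace.

    build_query_url is a hypothetical helper, not part of any WDQS client.
    """
    params = urlencode({"query": sparql, "format": result_format})
    return f"{CATEGORIES_ENDPOINT}?{params}"


# Illustrative query: fetch five arbitrary triples from the namespace.
url = build_query_url("SELECT * WHERE { ?s ?p ?o } LIMIT 5")
print(url)
```

The resulting URL can be opened in a browser or passed to any HTTP client.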

Note that while each wiki has its own data set, they are all stored in the same namespace.

[https://query.wikidata.org/bigdata/namespace/categories/sparql?query=PREFIX%20gas%3A%20%3Chttp%3A%2F%2Fwww.bigdata.com%2Frdf%2Fgas%23%3E%0Aprefix%20mediawiki%3A%20%3Chttps%3A%2F%2Fwww.mediawiki.org%2Fontology%23%3E%20%0A%0ASELECT%20%2A%20WHERE%20%7B%0ASERVICE%20gas%3Aservice%20%7B%0A%20%20%20%20%20gas%3Aprogram%20gas%3AgasClass%20%22com.bigdata.rdf.graph.analytics.BFS%22%20.%0A%20%20%20%20%20gas%3Aprogram%20gas%3AlinkType%20mediawiki%3AisInCategory%20.%0A%20%20%20%20%20gas%3Aprogram%20gas%3AtraversalDirection%20%22Reverse%22%20.%0A%20%20%20%20%20gas%3Aprogram%20gas%3Ain%20%3Chttps%3A%2F%2Fen.wikipedia.org%2Fwiki%2FCategory%3ADucks%3E.%20%23%20one%20or%20more%20times%2C%20specifies%20the%20initial%20frontier.%0A%20%20%20%20%20gas%3Aprogram%20gas%3Aout%20%3Fout%20.%20%23%20exactly%20once%20-%20will%20be%20bound%20to%20the%20visited%20vertices.%0A%20%20%20%20%20gas%3Aprogram%20gas%3Aout1%20%3Fdepth%20.%20%23%20exactly%20once%20-%20will%20be%20bound%20to%20the%20depth%20of%20the%20visited%20vertices.%0A%20%20%20%20%20gas%3Aprogram%20gas%3AmaxIterations%208%20.%20%23%20optional%20limit%20on%20breadth%20first%20expansion.%0A%20%20%7D%0A%7D%20ORDER%20BY%20ASC%28%3Fdepth%29&format=json Example query], providing subcategories of category Ducks on English Wikipedia:
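Decoded from the link above, the query is a breadth-first search over mediawiki:isInCategory links in the reverse direction, starting at Category:Ducks. The sketch below stores it as a Python string and rebuilds an equivalent URL. ENDPOINT and BFS_QUERY are names chosen here for illustration; the query text itself comes from the link, with only whitespace, keyword case and comment punctuation normalized.

```python
from urllib.parse import quote

ENDPOINT = "https://query.wikidata.org/bigdata/namespace/categories/sparql"

# Query text decoded from the example link on this page.
BFS_QUERY = """\
PREFIX gas: <http://www.bigdata.com/rdf/gas#>
PREFIX mediawiki: <https://www.mediawiki.org/ontology#>

SELECT * WHERE {
  SERVICE gas:service {
    gas:program gas:gasClass "com.bigdata.rdf.graph.analytics.BFS" .
    gas:program gas:linkType mediawiki:isInCategory .
    gas:program gas:traversalDirection "Reverse" .
    gas:program gas:in <https://en.wikipedia.org/wiki/Category:Ducks> . # one or more times, specifies the initial frontier
    gas:program gas:out ?out . # exactly once, bound to the visited vertices
    gas:program gas:out1 ?depth . # exactly once, bound to the depth of the visited vertices
    gas:program gas:maxIterations 8 . # optional limit on breadth-first expansion
  }
} ORDER BY ASC(?depth)
"""

# Percent-encode every reserved character so the query survives as one URL parameter.
url = f"{ENDPOINT}?query={quote(BFS_QUERY, safe='')}&format=json"
print(url)
```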

Simpler query
A simpler form of the query above is available through a dedicated service.
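The service name is not preserved in the text above. The following is a minimal sketch, assuming the service is exposed as mediawiki:categoryTree and takes mediawiki:start, mediawiki:direction and mediawiki:depth parameters through bd:serviceParam, with ?out and ?depth among its output variables. All of these names, and the helper category_tree_query, are assumptions that should be checked against the WDQS documentation before use.

```python
from urllib.parse import urlencode

ENDPOINT = "https://query.wikidata.org/bigdata/namespace/categories/sparql"


def category_tree_query(start: str, direction: str = "Reverse", depth: int = 5) -> str:
    """Build a category-tree query (hypothetical helper).

    The service and parameter names in the template are assumptions,
    not confirmed by the text of this page.
    """
    return f"""\
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX mediawiki: <https://www.mediawiki.org/ontology#>

SELECT ?out ?depth WHERE {{
  SERVICE mediawiki:categoryTree {{
    bd:serviceParam mediawiki:start <{start}> .
    bd:serviceParam mediawiki:direction "{direction}" .
    bd:serviceParam mediawiki:depth {depth} .
  }}
}} ORDER BY ASC(?depth)
"""


# Same starting category as the longer example on this page.
query = category_tree_query("https://en.wikipedia.org/wiki/Category:Ducks")
url = f"{ENDPOINT}?{urlencode({'query': query, 'format': 'json'})}"
print(query)
print(url)
```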


This query produces three output values:

 * the category found
 * the depth of the category
 * the parent category

Data format
The data about a category describes its URL and name.

Links between categories are represented as the mediawiki:isInCategory relationship.

Hidden categories are marked with a dedicated class.
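To illustrate how these links can be queried, the sketch below builds a query for the direct parent categories of one category via mediawiki:isInCategory (the predicate used in the example link on this page) and reads rows from a response in the standard SPARQL JSON results format. The helper names and the sample payload are made up for illustration and do not reflect real query results.

```python
import json


def parent_categories_query(category_url: str) -> str:
    """Build a query for the direct parents of a category (hypothetical helper)."""
    return f"""\
PREFIX mediawiki: <https://www.mediawiki.org/ontology#>

SELECT ?parent WHERE {{
  <{category_url}> mediawiki:isInCategory ?parent .
}}
"""


def values(payload: dict, variable: str) -> list[str]:
    """Extract one variable's values from a SPARQL JSON results document."""
    return [
        row[variable]["value"]
        for row in payload["results"]["bindings"]
        if variable in row
    ]


query = parent_categories_query("https://en.wikipedia.org/wiki/Category:Ducks")

# Made-up response for illustration only; real results will differ.
sample = json.loads("""
{"head": {"vars": ["parent"]},
 "results": {"bindings": [
   {"parent": {"type": "uri",
               "value": "https://en.wikipedia.org/wiki/Category:Example_parent"}}
 ]}}
""")
parents = values(sample, "parent")
print(query)
print(parents)
```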

Prefixes
The prefix mediawiki: is defined as <https://www.mediawiki.org/ontology#>. The full ontology can be found at https://www.mediawiki.org/ontology/ontology.owl.

Dump header
The dump header contains information about the dump.

Data dumps
Data dumps are stored in https://dumps.wikimedia.org/other/categoriesrdf/. Full dumps are performed weekly. Each wiki has its own dump file.

https://dumps.wikimedia.org/other/categoriesrdf/lastdump/ stores the timestamps of the last dumps performed.

Updating
To update the category data, the following steps can be used:

 * 1) Create the categories namespace.
 * 2) Load the data.

Adding wikis
For now, if you want a wiki added, please comment on the talk page. The exception is Commons, which has by far the largest set of categories; we decided not to cover it until we are sure everything works as planned with smaller data sets.

TODO

 * Regular (daily) updates
 * Support from the GUI
 * Support for more wikis
 * Hidden category support