Requests for comment/Wikidata API

From mediawiki.org
Request for comment (RFC)
Wikidata API
Component: General
Creation date:
Author(s): Yuri Astrakhan
Document status: declined (partially addressed by T72729)

See Phabricator.
To resolve phab:T72729, we implemented a pageterms query module in November 2014 and run Extension:Wikibase Client on most Wikipedias.

This proposal was abandoned because the implementor lacked the time to finish it. A developer willing to take it over and reactivate it would be welcome.

Introduction

This RFC proposes a slightly altered Wikidata API structure that is more in line with the core API, as well as more feature-rich and compact.

The goals of the RFC are:

  • Minimalistic results - retrieve only the data that has been asked for, nothing else. This reduces server load and bandwidth costs and improves speed.
  • Seamless integration - Wikibase extensions should extend action=query when appropriate. This way all the other features of the query action become available, such as combining multiple pieces of information in one query (for example, asking for aliases and "what links here" together) and query continuation.
  • Available locally - the data should be queryable from the local wiki, without having to make a cross-domain request.

action=wbgetentities

Current

The result of this action looks similar to that of action=query, but the request cannot be combined with other requests and cannot be paged (so a large result may exceed the PHP memory limit), and the output is very verbose, repeating the same information many times. This call works only on the Wikidata repository.

Result
{
    "entities": {
        "q7186": {
            "pageid": 8349,
            "ns": 0,
            "title": "Q7186",
            "lastrevid": 6415140,
            "modified": "2013-02-13T15:20:14Z",
            "id": "q7186",
            "type": "item",
            "aliases": {
                "it": [
                    {
                        "language": "it",
                        "value": "Maria Sk\u0142odowska"
                    }
                ],
                ...
            },
            "labels": {
                "en": {
                    "language": "en",
                    "value": "Marie Curie"
                },
                ...
            "descriptions": {
                "en": {
                    "language": "en",
                    "value": "French-Polish physicist and chemist"
                },
                ...
            "claims": {
                "p21": [ // sex=female
                    {
                        "id": "q7186$510A1BC5-9E7D-49F7-9B54-75D4D1B2F235",
                        "mainsnak": {
                            "snaktype": "value",
                            "property": "p21",
                            "datavalue": {
                                "value": {
                                    "entity-type": "item",
                                    "numeric-id": 43445
                                },
                                "type": "wikibase-entityid"
                            }
                        },
                        "type": "statement",
                        "rank": "normal"
                    }
                ],
                ...
            "sitelinks": {
                "afwiki": {
                    "site": "afwiki",
                    "title": "Marie Curie"
                },
                ...

Proposed

In the proposed change, the request is part of the regular query action, and it explicitly states which data items the client wants to receive. The request is slightly more complex because the generator searches for the needed pages, while prop=wikibase adds the requested elements to the pages found. Note that the wbssitelinks parameter requires values in the proper enwiki:title format. The result is shown in API version 2 format.

Result
{
    "pages": [
        {
            "pageid": 8349,
            "ns": 0,
            "title": "Q7186",
            "lastrevid": 6415140,
            "modified": "2013-02-13T15:20:14Z",
            "id": "q7186",
            "wbpaliases": {
                "it": [ // Multiple aliases per language
                    "Maria Sk\u0142odowska",
                ],
                ...
            "wbplabels": { // One label per language
                "en": "Marie Curie",
                ...
            "wbpdescr": { // One description per language
                "en": "French-Polish physicist and chemist",
                ...
            "wbpsitelinks": { // One language link per site
                "afwiki": "Marie Curie",
                ...
            "claims": {  // <-- This section has not been reviewed yet, and might change
                "p21": [ // sex=female
                    {
                        "id": "q7186$510A1BC5-9E7D-49F7-9B54-75D4D1B2F235",
                        "mainsnak": {
                            "snaktype": "value",
                            "property": "p21",
                            "datavalue": {
                                "value": {
                                    "entity-type": "item",
                                    "numeric-id": 43445
                                },
                                "type": "wikibase-entityid"
                            }
                        },
                        "type": "statement",
                        "rank": "normal"
                    }
                ],
                ...
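
To make the difference between the two result shapes concrete, here is a minimal Python sketch that flattens an entity in the current verbose format into the compact shape proposed above. The function name compact_entity is hypothetical, the sample is trimmed from the Marie Curie example, and claims are passed through unchanged because the RFC marks that section as not yet reviewed.

```python
def compact_entity(entity):
    """Flatten a verbose wbgetentities entity into the proposed compact shape.

    Hypothetical helper for illustration only; not part of any Wikibase API.
    """
    compact = {
        key: entity[key]
        for key in ("pageid", "ns", "title", "lastrevid", "modified", "id")
        if key in entity
    }
    # Aliases: several per language, so keep a list of plain strings.
    compact["wbpaliases"] = {
        lang: [alias["value"] for alias in aliases]
        for lang, aliases in entity.get("aliases", {}).items()
    }
    # Labels and descriptions: one per language, so a plain string suffices.
    compact["wbplabels"] = {
        lang: label["value"] for lang, label in entity.get("labels", {}).items()
    }
    compact["wbpdescr"] = {
        lang: descr["value"]
        for lang, descr in entity.get("descriptions", {}).items()
    }
    # Sitelinks: one title per site id.
    compact["wbpsitelinks"] = {
        site: link["title"] for site, link in entity.get("sitelinks", {}).items()
    }
    # Claims are copied as-is; their compact form is still undecided.
    if "claims" in entity:
        compact["claims"] = entity["claims"]
    return compact


# Sample trimmed from the verbose result shown earlier.
verbose = {
    "pageid": 8349,
    "title": "Q7186",
    "id": "q7186",
    "aliases": {"it": [{"language": "it", "value": "Maria Sk\u0142odowska"}]},
    "labels": {"en": {"language": "en", "value": "Marie Curie"}},
    "descriptions": {
        "en": {"language": "en", "value": "French-Polish physicist and chemist"}
    },
    "sitelinks": {"afwiki": {"site": "afwiki", "title": "Marie Curie"}},
}
compact = compact_entity(verbose)
print(compact["wbplabels"])
```

The language and site codes that the verbose format repeats inside every value appear only once, as dictionary keys, in the compact form.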

list=wbsearch (wbs)

Exactly one of these parameters is required:

  • wbssitelinks: Search using a list of site links in the site:title format, e.g. [[enwiki:Marie Curie]]. Multiple sites and languages are allowed. Each sitelink yields a single result. Multiple sitelinks may resolve to the same item page.
  • wbsaliases: Search using a list of aliases in the language:title format, e.g. [[en:Marie Curie]]. Multiple languages are allowed. Each alias may yield multiple results. Multiple aliases may resolve to the same item page.
  • wbslabels: Search using a list of labels in the language:title format, e.g. [[en:Marie Curie]]. Multiple languages are allowed. Each label may yield multiple results. Multiple labels may resolve to the same item page.

This parameter can be used with any of the above:

  • wbsmapping: (Optional) Adds an extra result section wbsmap that contains a mapping of all searched terms to pageids. If this module is used as a generator, the section name will be gwbsmap.
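
The "exactly one of" rule above can be sketched on the client side as follows. This is only an illustration: build_wbsearch_params is a hypothetical helper, and two details are assumptions borrowed from core API conventions rather than stated in this RFC, namely that multiple values are joined with "|" and that wbsmapping is a boolean flag sent with an empty value.

```python
SEARCH_PARAMS = ("wbssitelinks", "wbsaliases", "wbslabels")


def build_wbsearch_params(mapping=False, **search):
    """Build request parameters for the proposed list=wbsearch module.

    Hypothetical helper; the module described in this RFC was never deployed.
    """
    unknown = set(search) - set(SEARCH_PARAMS)
    if unknown:
        raise ValueError("unknown search parameters: %s" % sorted(unknown))
    given = [name for name in SEARCH_PARAMS if search.get(name)]
    if len(given) != 1:
        raise ValueError(
            "exactly one of %s is required" % ", ".join(SEARCH_PARAMS)
        )
    name = given[0]
    params = {
        "action": "query",
        "list": "wbsearch",
        # Assumption: multi-value parameters are pipe-separated, as in core.
        name: "|".join(search[name]),
    }
    if mapping:
        # Assumption: boolean flags are passed with an empty value.
        params["wbsmapping"] = ""
    return params


params = build_wbsearch_params(
    wbssitelinks=["enwiki:Marie Curie", "ruwiki:Page2"], mapping=True
)
print(params)
```

Passing no search parameter, or more than one, raises ValueError before any request is made.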

The result of this module, unless used as a generator, is a list of found pages.

Result
{
    "wbsearch": [
        {
            "pageid": 34635826,
            "title": "Q12345"
        },
        {
            "pageid": 5487346,
            "title": "Q9874561"
        }
    ],
    "wbsmap": {
        "enwiki:Page1": 34635826, // In case of wbsaliases or wbslabels (multiple results),
        "ruwiki:Page2": 5487346   // there will be a list instead of the single value
    }
}
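
Because a wbsmap value is a single pageid for sitelink searches but a list for alias or label searches, a client would probably want to normalize it before use. A minimal Python sketch, with hypothetical function names, also inverts the map to show which search terms resolved to the same item page:

```python
def normalize_wbsmap(wbsmap):
    """Return {term: [pageid, ...]} whether values are single ids or lists."""
    return {
        term: list(value) if isinstance(value, (list, tuple)) else [value]
        for term, value in wbsmap.items()
    }


def terms_by_page(wbsmap):
    """Invert the map: {pageid: [term, ...]} in order of first appearance."""
    inverse = {}
    for term, pageids in normalize_wbsmap(wbsmap).items():
        for pageid in pageids:
            inverse.setdefault(pageid, []).append(term)
    return inverse


# Values taken from the sample result above.
sample = {"enwiki:Page1": 34635826, "ruwiki:Page2": 5487346}
print(normalize_wbsmap(sample))
print(terms_by_page(sample))
```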

prop=wikibase (wbp)

  • wbpprop: Which data to get from the Qnnn page; non-Wikibase pages are ignored. Available values are:
    • desc: Get the description given to the page in a specific language (per wbplang)
    • alias: Get the list of aliases given to the page in a specific language (per wbplang)
  • wbplang: One or more languages for wbpprop
  • wbpsitelink: List of global site ids (e.g. hewiki) to get the language links for
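
As a rough illustration of how these parameters might combine with the search module, the Python sketch below builds a query string that uses wbsearch as a generator and prop=wikibase to decorate the pages found. The helper build_item_query is hypothetical. The "g" prefix on generator parameters and the "|" separator follow core API conventions and are assumptions here, since the proposed modules were never deployed.

```python
from urllib.parse import parse_qs, urlencode


def build_item_query(sitelinks, props, langs, sites=()):
    """Build a query string for generator=wbsearch plus prop=wikibase.

    Hypothetical helper; parameter names come from this RFC, not a live API.
    """
    params = {
        "action": "query",
        "format": "json",
        "generator": "wbsearch",
        # Generator parameters carry a "g" prefix (core API convention).
        "gwbssitelinks": "|".join(sitelinks),
        "prop": "wikibase",
        "wbpprop": "|".join(props),
        "wbplang": "|".join(langs),
    }
    if sites:
        params["wbpsitelink"] = "|".join(sites)
    return urlencode(params)


query = build_item_query(
    sitelinks=["enwiki:Marie Curie"],
    props=["desc", "alias"],
    langs=["en", "ru"],
    sites=["hewiki"],
)
print(query)
# Decoding the string again recovers the original values.
print(parse_qs(query)["wbpprop"])
```

Keeping the search in the generator and the requested data in prop mirrors the split described in the proposal: one module finds the item pages, the other adds only the requested elements to them.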

Usually this prop will be used together with generator=wbsearch, but it can also be queried directly when the pageid of the Q-page is known:

Result
{
    "pages": [
        {
            "pageid": 165890,
            "ns": 0,
            "title": "Q165194",
            "lastrevid": 5732337,
            "modified": "2013-02-03T19:45:28Z",
            "id": "q165194",
            "wbptype": "item",
            "wbpaliases": {
                "en": [
                    "Application programming interface",
                ],
                "ru": [
                    "API",
                    "\u0410\u041f\u0418",
                    ...
                ],
                ...
            },
            "wbplabels": {
                "zh": [
                    "\u5e94\u7528\u7a0b\u5e8f\u63a5\u53e3",
                    ...
                ],
                ...