Topic on Talk:Wikidata Query Service/User Manual/MWAPI

Preserve result order from MediaWiki API?

Summary by Smalyshev (WMF)

Implemented as: ?ordinal wikibase:apiOrdinal true .
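
As a rough illustration of how the ordinal might be used, here is a minimal sketch built from the search term ("las") and the join (?item wdt:P31 ?instance) discussed in this thread. It assumes the MWAPI EntitySearch template with its mwapi:search and mwapi:language parameters, and the prefixes that the query service predefines; the variable name ?ordinal is arbitrary.

```sparql
# Sketch only: assumes the EntitySearch MWAPI template and the
# prefixes (wikibase:, bd:, mwapi:, wdt:) predefined by the query service.
SELECT ?item ?instance ?ordinal WHERE {
  SERVICE wikibase:mwapi {
    bd:serviceParam wikibase:endpoint "www.wikidata.org" ;
                    wikibase:api "EntitySearch" ;
                    mwapi:search "las" ;
                    mwapi:language "en" .
    ?item wikibase:apiOutputItem mwapi:item .
    # Position of each result in the API response
    ?ordinal wikibase:apiOrdinal true .
  }
  # The join that loses the original order
  ?item wdt:P31 ?instance .
}
# Restore the API's relevance order after the join
ORDER BY ASC(?ordinal)
```

Because the join can return several rows per item (one per ?instance value), rows that share an ordinal are not ordered relative to each other.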

Eloquence (talkcontribs)

When using wbsearchentities, like so:

https://www.wikidata.org/w/api.php?search=las&language=en&uselang=en&format=jsonfm&limit=25&action=wbsearchentities

the API returns the results ordered by relevance. When these results are obtained via SPARQL and then modified, the order is lost (example). This is understandable, but is there a way to preserve the original order, e.g., by transforming it into an ordinal for use with ORDER BY? If not, should the wbsearchentities API be modified so that the score for each result can be obtained?

The practical application here is to modify autocomplete results on-the-fly with a single query, which seems like a great use case for the MWAPI integration into the query service.

Smalyshev (WMF) (talkcontribs)

I'll look into it. Generally, SPARQL results are not ordered, but if they come from an ordered source (e.g. MWAPI), it might be possible to preserve the order. I'll check.

Adding the score should not be hard if it is present in the result's XML.

Eloquence (talkcontribs)

Thank you for taking a look! Unfortunately, the wbsearchentities API's XML output does not include a score that could be used for ordering. I think we'd either have to infer an ordinal from the sequence of results somehow, or perhaps optionally add the score to the output on the MediaWiki side.

Smalyshev (WMF) (talkcontribs)

Yes, the entity search API does not currently expose a score :( And extracting an ordinal number from the XML seems non-trivial... I am not sure why the results appear out of order: the service delivers them in order, but somewhere inside Blazegraph the order is lost. I'll look into why that happens.

Smalyshev (WMF) (talkcontribs)

It looks like the order breaks only when a join (?item wdt:P31 ?instance) is applied... If you just call the service, the order is preserved. That makes sense, since joins are parallelized and do not guarantee preserving order. It may then be possible to create a simulated variable that returns an ordinal for each result, like "?position wikibase:apiOutput mwapi:ordinal" or something like that. That would probably allow re-sorting them after joins.

Eloquence (talkcontribs)

Something like that would be excellent, yes, and might help with other queries as well :)

Smalyshev (WMF) (talkcontribs)