Wikidata Query Service/Blank Node Skolemization

For information about skolemization in the RDF context, please read RDF 1.1 Concepts, section 3.5 "Replacing Blank Nodes with IRIs".

= Why skolemize the blank nodes? =

As part of the work to improve the performance of the Wikidata Query Service update process, we decided to go with a patch approach. In the same vein as what was proposed in rdf-patch or TurtlePatch, the idea is to mutate the graph with a set of trivial INSERT DATA and DELETE DATA operations. Blank nodes cannot be used within these operations because they are by nature unidentifiable. Skolemizing the blank nodes gives them an identity and allows such mutations to be applied to any triple store.
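The patch idea can be illustrated with a minimal Python sketch. It is not the updater's actual code: the helper names (is_blank, data_block, build_patch) are hypothetical, and triples are assumed to be tuples of terms already written in N-Triples syntax. The sketch diffs two versions of a graph and refuses any triple that still carries a blank node, since such a triple cannot designate an existing node in the store.

```python
def is_blank(term: str) -> bool:
    """True for a blank node written in N-Triples syntax, e.g. '_:b0'."""
    return term.startswith("_:")


def data_block(keyword: str, triples) -> str:
    """Render one 'INSERT DATA' or 'DELETE DATA' operation."""
    lines = []
    for s, p, o in sorted(triples):
        if is_blank(s) or is_blank(o):
            # A blank node label has no meaning outside its own document,
            # so the store cannot tell which existing node is meant.
            raise ValueError(f"blank node in patch: {s} {p} {o}")
        lines.append(f"  {s} {p} {o} .")
    return keyword + " {\n" + "\n".join(lines) + "\n}"


def build_patch(old: set, new: set) -> str:
    """Turn the difference between two sets of triples into a SPARQL update."""
    ops = []
    if old - new:
        ops.append(data_block("DELETE DATA", old - new))
    if new - old:
        ops.append(data_block("INSERT DATA", new - old))
    return " ;\n".join(ops)
```

Rejecting blank nodes in both operations is a simplification chosen for this sketch: SPARQL 1.1 Update forbids them outright in DELETE DATA, while in INSERT DATA they are accepted but always create fresh nodes, which is just as unhelpful for a later deletion.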

= How does this affect my SPARQL query? =

== Queries using isBlank ==
Queries using isBlank(?o) will stop functioning and have to be rewritten using the wikibase:isSomeValue() function.

Must be rewritten with:

== Queries using isIRI ==
Because the skolem form is an IRI, isIRI will also match SomeValue nodes and may conflate them with ordinary IRIs. To eliminate possible ambiguities, wikibase:isSomeValue() can be used:

can be rewritten as:

== Form of the skolem IRI in results ==
The IRI will follow the form recommended by the RDF specification, for example:

These IRIs will replace blank node labels such as t9283749 in the result sets.

instead of returning:

will return:
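For client code that post-processes result sets, the change means a SomeValue arrives as a binding of type "uri" rather than "bnode" in the SPARQL 1.1 JSON results format. The Python sketch below shows one way to recognize both forms during a transition period. The function name is hypothetical, and the default prefix is an assumption based on the /.well-known/genid/ path registered by RDF 1.1 Concepts; adjust it to the prefix actually emitted by the service.

```python
# Assumed skolem prefix: not taken from this page, adjust as needed.
SKOLEM_PREFIX = "http://www.wikidata.org/.well-known/genid/"


def is_some_value(binding: dict, prefix: str = SKOLEM_PREFIX) -> bool:
    """Tell whether one variable binding stands for a SomeValue node.

    `binding` is a single entry of a SPARQL JSON result row,
    e.g. {"type": "uri", "value": "http://..."}.
    """
    kind = binding.get("type")
    value = binding.get("value", "")
    if kind == "bnode":
        # Form returned before skolemization.
        return True
    # Form returned after skolemization: an IRI under the skolem prefix.
    return kind == "uri" and value.startswith(prefix)
```

A check like this keeps consumers working against both the old and the new output, which can be convenient while endpoints are being migrated.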

= Changes to the RDF model (RDF dumps and Special:EntityData) =

To limit the differences between what is served by the Wikidata Query Service and the RDF representation of Wikidata entities, the RDF model used in the RDF dumps and Special:EntityData may change to include skolem IRIs instead of labelled blank nodes.

For example a statement including a SomeValue snak will be changed from:

to

The skolemization function is trivial, as it reuses the blank node label as the suffix of the skolem IRI. Note that blank node labels as generated by Wikibase now make it possible to retain the identity of the blank node.

For consumers willing to stick to blank node semantics, the function to generalize the skolemized graph is also trivial: every well-known IRI carrying the skolem prefix (the /.well-known/genid/ path registered by RDF 1.1 Concepts) can be transformed back to a blank node labelled with the suffix of the skolem IRI.

In other words, given:
 * G: a Wikidata graph or subgraph containing properly labelled blank nodes
 * sk: the skolemization function described above
 * unsk: the function described above that transforms skolem IRIs back to blank nodes

It is guaranteed that G = unsk(sk(G)).
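The round-trip property can be checked with a small Python sketch. This is an illustration under stated assumptions, not the Wikibase implementation: a graph is modelled as a set of triples whose terms are strings in N-Triples syntax, and the skolem prefix is assumed to be the /.well-known/genid/ path under the Wikidata host.

```python
# Assumed skolem prefix: adjust to the prefix actually used in the dumps.
SKOLEM_PREFIX = "http://www.wikidata.org/.well-known/genid/"


def sk_term(term: str) -> str:
    """Replace a labelled blank node ('_:label') with a skolem IRI."""
    if term.startswith("_:"):
        return f"<{SKOLEM_PREFIX}{term[2:]}>"
    return term


def unsk_term(term: str) -> str:
    """Replace a skolem IRI with a blank node labelled by its suffix."""
    head = "<" + SKOLEM_PREFIX
    if term.startswith(head) and term.endswith(">"):
        return "_:" + term[len(head):-1]
    return term


def sk(graph: set) -> set:
    """Skolemize every term of every triple in the graph."""
    return {tuple(sk_term(t) for t in triple) for triple in graph}


def unsk(graph: set) -> set:
    """Turn every skolem IRI in the graph back into a blank node."""
    return {tuple(unsk_term(t) for t in triple) for triple in graph}
```

The guarantee holds here because sk only appends the label to a fixed prefix and unsk only strips that prefix, so the label survives unchanged in both directions. It relies on G not already containing IRIs under the skolem prefix, which is what "properly labelled blank nodes" is taken to imply in this sketch.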

When this breaking change is applied (following a proper announcement on the Wikidata mailing lists), RDF dumps and Special:EntityData will start to emit G′, where G′ = sk(G).

The specification of the RDF model will be changed accordingly.