Talk:Wikidata Query Service/Blank Node Skolemization

Necessity for a breaking change

2 (talk | contribs)

This is a significant breaking change because it changes the output of the WDQS. That output is seen not only by users who query the service but also by others who receive the results of those queries, and who may not even know that the results come from the WDQS. The change therefore needs a very strong justification, which is lacking here.

There are alternatives that do not require a breaking change. A fairly simple one is to use the blank node IDs in the dump to construct IRIs that are used inside the WDQS and then transformed back into blank nodes before query results are presented to users. This keeps the performance advantages of the proposed approach (minus a very small cost when preparing query results) without changing anything that is visible to users of either the RDF dumps or the WDQS.
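
A rough sketch of what this alternative could look like, written in Python against the SPARQL 1.1 JSON results format. The prefix and all function names below are assumptions made for illustration only; they are not taken from the WDQS or Blazegraph code, and the real pipeline would need to hook in wherever results are serialized.

```python
# Hypothetical skolem prefix, modelled on the RDF 1.1 ".well-known/genid" convention.
SKOLEM_PREFIX = "http://www.wikidata.org/.well-known/genid/"


def skolemize(bnode_label: str) -> str:
    """Build a skolem IRI from a dump blank node label such as '_:abc123'."""
    if not bnode_label.startswith("_:"):
        raise ValueError(f"not a blank node label: {bnode_label!r}")
    return SKOLEM_PREFIX + bnode_label[2:]


def unskolemize_term(term: dict) -> dict:
    """Turn one result term back into a blank node if it is a skolem IRI."""
    if term.get("type") == "uri" and term["value"].startswith(SKOLEM_PREFIX):
        return {"type": "bnode", "value": term["value"][len(SKOLEM_PREFIX):]}
    return term  # ordinary IRIs and literals pass through unchanged


def unskolemize_results(results: dict) -> dict:
    """Rewrite every binding of a SPARQL JSON result set before it is returned."""
    bindings = [
        {var: unskolemize_term(term) for var, term in row.items()}
        for row in results["results"]["bindings"]
    ]
    return {"head": results.get("head", {}), "results": {"bindings": bindings}}
```

The store would only ever see the IRIs produced by `skolemize`, so deletions by ID remain possible, while clients would still receive `bnode` terms. Note that this only covers result serialization; queries that test for blank nodes (for example with `isBlank`) would need separate handling.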

Why is the breaking change being proposed when there is this alternative?

Another alternative would be to upgrade Blazegraph to allow for deletion of blank nodes by ID.

DCausse (WMF) (talk | contribs)

We discussed the alternatives you suggest and from our point of view they are problematic because:

  • Adding a layer to the Blazegraph output to unskolemize the blank nodes seems confusing, as it suggests that the triple store still holds blank nodes when it actually holds skolem IRIs (leaving aside implementation details that are also problematic from our point of view).
  • Making Blazegraph retain blank node identity was investigated but proved difficult to use in practice due to the lack of full support.