Project:Sandbox

Query optimization attempts to find the most efficient way to execute a given query from among the different possible query execution plans. Blazegraph has a built-in query optimizer which often works well. However, sometimes it is not so successful; in such cases queries may need to be optimised manually.

Fixed values and ranges
Searching for fixed values, e. g. a triple whose object is one specific item, is the cheapest option in the query service, but searching for ranges of values is also usually pretty efficient, and often preferable to other options. For example, to look for people born in 1978, filtering on the year extracted from each date of birth is not nearly as efficient as filtering on a date range that covers that year.
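The range comparison can be sketched as follows. The identifiers are assumptions recalled from Wikidata rather than taken from this page: wd:Q5 (human), wdt:P31 (instance of) and wdt:P569 (date of birth). A slower alternative would be a filter such as FILTER(YEAR(?dateOfBirth) = 1978).

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# People born in 1978, expressed as a range on the date value
# (item and property IDs are assumptions, see the note above)
SELECT ?item ?dateOfBirth WHERE {
  ?item wdt:P31 wd:Q5 ;
        wdt:P569 ?dateOfBirth .
  FILTER("1978-01-01T00:00:00Z"^^xsd:dateTime <= ?dateOfBirth &&
         ?dateOfBirth < "1979-01-01T00:00:00Z"^^xsd:dateTime)
}
```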

You can further optimize this by informing the query service that the predicate being compared (here, the one for the date of birth) is a range-safe predicate.
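A sketch of the same range query with Blazegraph's rangeSafe query hint attached to the date-of-birth triple. As before, wd:Q5, wdt:P31 and wdt:P569 are assumptions recalled from Wikidata, not identifiers given on this page.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX hint: <http://www.bigdata.com/queryHints#>

SELECT ?item ?dateOfBirth WHERE {
  ?item wdt:P31 wd:Q5 ;
        wdt:P569 ?dateOfBirth .
  # hint:Prior refers to the triple pattern just before it (the wdt:P569 one)
  hint:Prior hint:rangeSafe true .
  FILTER("1978-01-01T00:00:00Z"^^xsd:dateTime <= ?dateOfBirth &&
         ?dateOfBirth < "1979-01-01T00:00:00Z"^^xsd:dateTime)
}
```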

This tells the optimizer that the predicate doesn’t mix dates, strings, integers or other data types, which simplifies the range comparison. This is almost always true on Wikidata, so the main reason not to use this optimizer hint all the time is that it’s a bit inconvenient. (It’s probably also not a good idea to use it when you’re working with unknown values.)

Property paths
The optimizer generally assumes property paths to be more expensive than simple value matches, and usually that’s correct: try to add some simple triples to reduce the number of values that have to be checked against the path. However, sometimes paths can be fairly efficient, and it’s good to inform the optimizer of that fact. For example, consider a query for all municipalities in the German state of Lower Saxony that combines a class path (one instance-of link, then any number of subclass-of links) with a recursive location path.
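A sketch of such a query follows. The identifiers are assumptions recalled from Wikidata rather than taken from this page: wd:Q262166 (municipality of Germany), wd:Q1197 (Lower Saxony), wdt:P31 (instance of), wdt:P279 (subclass of) and wdt:P131 (located in the administrative territorial entity).

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>

# Class path and location path, with the join order left to the optimizer
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31/wdt:P279* wd:Q262166 ;
        wdt:P131+ wd:Q1197 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```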

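As written, such a query will time out because the query service will attempt to start with all instances of the municipality class (including subclasses) before limiting them to those located in Lower Saxony (recursively). It’s much more efficient to do it the other way around, with the optimizer disabled so that the location path is evaluated first.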
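The reordered version might look like the sketch below: the optimizer is switched off for the whole query, so the patterns run in the order written. The item and property IDs (wd:Q262166, wd:Q1197, wdt:P131, wdt:P279) are assumptions recalled from Wikidata.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX hint: <http://www.bigdata.com/queryHints#>

SELECT ?item ?itemLabel WHERE {
  # Disable join reordering: patterns are evaluated top to bottom
  hint:Query hint:optimizer "None" .
  # Location first (small set), class check second
  ?item wdt:P131+ wd:Q1197 ;
        wdt:P31/wdt:P279* wd:Q262166 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```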

You can also tell the optimizer in which direction to traverse a path. For example, say you want to test whether a particular item is an instance of a particular class, either directly or through a chain of subclasses.
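As an illustration, such a test can be written as an ASK query. The item and class used here, wd:Q174596 (Moby-Dick) and wd:Q17537576 (creative work), are assumptions chosen for the sketch and recalled from Wikidata, not identifiers given on this page.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

# Is the item an instance of the class, directly or via subclasses?
ASK {
  wd:Q174596 wdt:P31/wdt:P279* wd:Q17537576 .
}
```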

This takes several seconds because the optimizer decides to start at the class and walk the subclass-of links backwards. You can tell it to go the other way around, starting at the item, following one instance-of link and then walking the subclass-of links forwards.
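A sketch of the same kind of test with Blazegraph's gearing hint attached to the path. The item and class IDs (wd:Q174596, wd:Q17537576) are illustrative assumptions recalled from Wikidata.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX hint: <http://www.bigdata.com/queryHints#>

ASK {
  wd:Q174596 wdt:P31/wdt:P279* wd:Q17537576 .
  # Traverse the preceding path from its subject side towards its object
  hint:Prior hint:gearing "forward" .
}
```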

This is much more efficient. To go the other way, set Blazegraph’s hint:gearing query hint to "reverse" instead of "forward".

Order tweaking
We’ve already disabled the optimizer a few times, when it tried to rearrange the query the wrong way. You can also apply some more fine-grained control to the optimizer by informing it which joins it should run first or last, using the hints hint:Prior hint:runFirst true or hint:Prior hint:runLast true. One important thing to keep in mind is that these control the first or last join, which is more or less the . between two triples, not a triple directly. In particular, you can’t put this hint after the first triple in a group, nor (I think) after a triple containing a property path.
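A minimal sketch of where such a hint sits, using the U.S. presidents example that appears later on this page. The placement after the second triple follows the restriction described above; whether the hint actually improves this particular query is not claimed here.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX hint: <http://www.bigdata.com/queryHints#>

SELECT ?pres ?spouse WHERE {
  ?pres wdt:P31 wd:Q5 .
  ?pres wdt:P39 wd:Q11696 .
  # Ask for the join introduced by the preceding triple to be run first
  hint:Prior hint:runFirst true .
  ?pres wdt:P26 ?spouse .
}
```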

See also the sections on named subqueries below for further options to control query execution order.

Label service
The label service (SERVICE wikibase:label) can make queries much less efficient: when working with an expensive query, it’s often a good idea to disable the service at first, try to optimize the query as much as possible, and only then turn it back on. With some luck, the query will be efficient enough that the label service doesn’t cause it to time out now.

If the query works without the label service but results in a timeout when it is added, it can help to extract the rest of the query into a subquery, and only apply the label service at the very end. You can do this with a regular SPARQL subquery:
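The shape of such a query, as a sketch: the inner pattern (holders of the position wd:Q11696, as in the presidents example later on this page) merely stands in for the expensive part of a real query.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>

SELECT ?item ?itemLabel WHERE {
  {
    # Everything except the labels goes into the subquery
    SELECT ?item WHERE {
      ?item wdt:P39 wd:Q11696 .
    }
  }
  # Labels are added only at the very end
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```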

This can be particularly important if you are using LIMIT to get back a few quick results from a query that would otherwise be very expensive.

The label service tries to run last, which means Blazegraph tries to materialise all the results of a query before moving on to adding the labels, only then applying the LIMIT directive. This can be avoided if the expensive part of the query is put into a subquery, as above, and the LIMIT is placed on the subquery, so the labels are looked up only for those few results.
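A sketch of the LIMIT placement described here. The inner pattern (all humans, wdt:P31 wd:Q5) is only a stand-in for a large result set.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>

SELECT ?item ?itemLabel WHERE {
  {
    SELECT ?item WHERE {
      ?item wdt:P31 wd:Q5 .
    }
    # The LIMIT sits on the subquery, so only ten items reach the label service
    LIMIT 10
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```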

Another way to reduce usage of the label service is to match rdfs:label directly. A query that gets all proteins and then asks the label service for their labels will time out, because 1 million results hits the limit for the label service. Instead, get all proteins together with their labels by matching rdfs:label and filtering on the language.

Note that this will output only those items that have an English label, though.
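A sketch of the rdfs:label variant. The class ID wd:Q8054 (protein) is an assumption recalled from Wikidata, not an identifier given on this page.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Labels are matched as ordinary triples instead of via the label service
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q8054 ;
        rdfs:label ?itemLabel .
  # Keep English labels only; items without one drop out of the results
  FILTER(LANG(?itemLabel) = "en")
}
```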

Named subqueries
The optimizer sometimes tries to optimize the whole query across subqueries, so putting part of the query into a subquery doesn’t always help. You can fix this by using a Blazegraph extension called named subqueries.
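A sketch of the syntax: the subquery is declared with WITH { … } AS %name before the main WHERE clause and pulled in with INCLUDE. The name %results and the inner pattern are placeholders chosen for this illustration.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>

SELECT ?item ?itemLabel
WITH {
  # The expensive part of the query, evaluated on its own
  SELECT ?item WHERE {
    ?item wdt:P39 wd:Q11696 .
  }
} AS %results
WHERE {
  INCLUDE %results .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```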

Named subqueries are guaranteed to run only once, on their own, which is just what we want here.

Further uses of named subqueries
The above example only uses one named subquery, but you can use any number of them. This can be useful to impose a general order or structure on the query, while still letting the optimizer rearrange the instructions within each subquery as it knows best.

You can also disable the optimizer only in one named subquery, while still letting it optimize the rest of the query: for this, specify hint:SubQuery hint:optimizer "None" in that subquery instead of the usual hint:Query hint:optimizer "None".
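A sketch of the subquery-scoped hint, reusing the Lower Saxony municipalities pattern. The IDs wd:Q1197, wd:Q262166, wdt:P131 and wdt:P279 are assumptions recalled from Wikidata, and %municipalities is a placeholder name.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX hint: <http://www.bigdata.com/queryHints#>

SELECT ?item ?itemLabel
WITH {
  SELECT ?item WHERE {
    # Only this subquery keeps its written join order
    hint:SubQuery hint:optimizer "None" .
    ?item wdt:P131+ wd:Q1197 ;
          wdt:P31/wdt:P279* wd:Q262166 .
  }
} AS %municipalities
WHERE {
  # The outer query is still optimized as usual
  INCLUDE %municipalities .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
```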

See User:TweetsFactsAndQueries/Queries/name phrases for an example using this technique.

Named subqueries are also useful when you want to group results multiple times, or want to use the same subquery results more than once in different contexts: you can INCLUDE the same named subquery into multiple other ones, if necessary (and it will be run only once). See User:TweetsFactsAndQueries/Queries/most common election days for an example of this.
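A sketch of one named subquery being included in two others, each grouping the same base results differently. The subquery names and the choice of the presidents data are assumptions made for this illustration.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?pres ?spouseCount ?total
WITH {
  # Base result set, evaluated once
  SELECT ?pres ?spouse WHERE {
    ?pres wdt:P39 wd:Q11696 ;
          wdt:P26 ?spouse .
  }
} AS %marriages
WITH {
  # First grouping: number of spouses per office holder
  SELECT ?pres (COUNT(DISTINCT ?spouse) AS ?spouseCount) WHERE {
    INCLUDE %marriages .
  }
  GROUP BY ?pres
} AS %perPresident
WITH {
  # Second grouping over the same base results: overall row count
  SELECT (COUNT(*) AS ?total) WHERE {
    INCLUDE %marriages .
  }
} AS %overall
WHERE {
  INCLUDE %perPresident .
  INCLUDE %overall .
}
ORDER BY DESC(?spouseCount)
```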

Automatic optimisation -- background overview
The basic approach to query optimisation is to try to get the solution set to be as small as possible as soon as possible in the query execution, to make the work needed for subsequent joins as small as it can be. A subsidiary goal can be to maximise the possibility of pipelining and parallelisation. This is what Blazegraph's built-in query optimiser tries to achieve.

For example, here is a query to return a list of U.S. Presidents and their spouses:
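The query, decoded from the report URL given further down in this section, reads as follows. The bd: prefix declaration is an addition so that the block is self-contained, and the unused rdfs: declaration is omitted.

```sparql
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX bd: <http://www.bigdata.com/rdf#>

SELECT ?pres ?presLabel ?spouse ?spouseLabel WHERE {
  ?pres wdt:P31 wd:Q5 .        # is a human
  ?pres wdt:P39 wd:Q11696 .    # has held the position of U.S. president
  ?pres wdt:P26 ?spouse .      # has a spouse
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
```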

The details of how Blazegraph has executed a particular query (and why) can be obtained via the query engine's EXPLAIN mode. This can be done by replacing the start of the query-editor URL, https://query.wikidata.org/#, with https://query.wikidata.org/bigdata/namespace/wdq/sparql?query= to send the query to the SPARQL endpoint directly (not via the query editor), with an extra explain& in the URL before the query= string.

The resulting URL, sending the query above to the endpoint directly, with the 'explain' option, produces [<tvar|1>https://query.wikidata.org/bigdata/namespace/wdq/sparql?explain&query=PREFIX%20wikibase%3A%20%3Chttp%3A%2F%2Fwikiba.se%2Fontology%23%3E%0APREFIX%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%20%0APREFIX%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0APREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0A%0ASELECT%20%3Fpres%20%3FpresLabel%20%3Fspouse%20%3FspouseLabel%20WHERE%20%7B%0A%20%20%20%3Fpres%20wdt%3AP31%20wd%3AQ5%20.%0A%20%20%20%3Fpres%20wdt%3AP39%20wd%3AQ11696%20.%0A%20%20%20%3Fpres%20wdt%3AP26%20%3Fspouse%20.%0A%20%20%20SERVICE%20wikibase%3Alabel%20%7B%0A%20%20%20%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%22%20.%0A%20%20%20%7D%0A%20%7D</> this report on the query and its optimisation].

By default, a query would execute from top to bottom, with solution-set joins made in the order given. This is shown as the "Original AST" plan in the report. However, the query engine estimates (as of September 2015) that there are
 * 2,900,146 humans in the database with wdt:P31 wd:Q5,
 * 76 U.S. presidents (including fictional ones!) with wdt:P39 wd:Q11696, and
 * 38,836 triples with the predicate wdt:P26.
It therefore concludes that the most efficient way to run the query, rather than the order specified, is instead to first find the presidents (76), then see how many of them have spouses (51 solutions), then see how many are human (47 solutions), before sending this solution set to the labelling service. This rearrangement is typical of successful query optimisation.

A query that has difficulties
The following query attempts to find all statements referenced to the Le Figaro website, and return their subject, predicate and object. However, as written, it times out.
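The query, decoded from the explain-report URL given just below, reads as follows. The bd: prefix declaration is an addition for self-containment, unused prefix declarations are omitted, and a separating dot is written before the FILTER; nothing else is changed.

```sparql
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX pr: <http://www.wikidata.org/prop/reference/>
PREFIX bd: <http://www.bigdata.com/rdf#>

SELECT ?statement ?subject ?subjectLabel ?property ?propertyLabel ?object ?objectLabel ?refURL WHERE {
  ?statement prov:wasDerivedFrom ?ref .
  ?ref pr:P854 ?refURL .
  FILTER (CONTAINS(str(?refURL),'lefigaro.fr')) .
  ?subject ?p ?statement .
  ?property wikibase:claim ?p .
  ?property wikibase:statementProperty ?ps .
  ?statement ?ps ?object
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
```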

The reason for this can be investigated by looking at the [<tvar|1>https://query.wikidata.org/bigdata/namespace/wdq/sparql?explain&query=PREFIX%20wikibase%3A%20%3Chttp%3A%2F%2Fwikiba.se%2Fontology%23%3E%0APREFIX%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%20%0APREFIX%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0APREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0Aprefix%20prov%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2Fns%2Fprov%23%3E%20prefix%20pr%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Freference%2F%3E%0A%0ASELECT%20%3Fstatement%20%3Fsubject%20%3FsubjectLabel%20%3Fproperty%20%3FpropertyLabel%20%3Fobject%20%3FobjectLabel%20%3FrefURL%20WHERE%20%7B%0A%20%20%3Fstatement%20prov%3AwasDerivedFrom%20%3Fref%20.%0A%20%20%3Fref%20pr%3AP854%20%3FrefURL%20%0A%20%20FILTER%20(CONTAINS(str(%3FrefURL)%2C'lefigaro.fr'))%20.%20%20%20%20%20%20%20%0A%20%20%3Fsubject%20%3Fp%20%3Fstatement%20.%0A%20%20%3Fproperty%20wikibase%3Aclaim%20%3Fp%20.%0A%20%20%3Fproperty%20wikibase%3AstatementProperty%20%3Fps%20.%0A%20%20%3Fstatement%20%3Fps%20%3Fobject%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%0A%20%20%20%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%22%20.%0A%20%20%7D%0A%7D</> explain report] for the query.

It appears that the query optimiser is seduced by the low cardinality of the statements ?property wikibase:claim ?p and ?property wikibase:statementProperty ?ps, with only 1756 properties that take items as their objects, without anticipating that (i) the second condition will do nothing to reduce the solution set from the first condition; and (ii) these conditions will do very little to reduce the cardinality of the statement ?subject ?p ?statement (764,223,907), which the optimiser proposes to join before looking at ?statement prov:wasDerivedFrom ?ref and then ?ref pr:P854 ?refURL.

The query optimiser may also be trying to defer testing the statement FILTER (CONTAINS(str(?refURL),'lefigaro.fr')) for as long as possible, as this requires materialisation of actual string values, rather than proceeding solely with joins based on whether items exist in statements or not.

Are there any examples of a hand-ordered version of this query?

Other optimisation modes
Blazegraph offers a handful of other optimisation directives including the more expensive "runtime optimisation", based on using sampling to more precisely estimate the likely development in the size of the solution set by a particular sequence of joins; and a directive to run a particular join first.

However, examining the corresponding [<tvar|1>https://query.wikidata.org/bigdata/namespace/wdq/sparql?explain&query=PREFIX%20wikibase%3A%20%3Chttp%3A%2F%2Fwikiba.se%2Fontology%23%3E%0APREFIX%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%20%0APREFIX%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0APREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0Aprefix%20prov%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2Fns%2Fprov%23%3E%20prefix%20pr%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Freference%2F%3E%0A%0ASELECT%20%3Fstatement%20%3Fsubject%20%3FsubjectLabel%20%3Fproperty%20%3FpropertyLabel%20%3Fobject%20%3FobjectLabel%20%3FrefURL%20WHERE%20%7B%0A%20%20hint%3AQuery%20hint%3Aoptimizer%20%22Runtime%22%20.%0A%20%20%3Fstatement%20prov%3AwasDerivedFrom%20%3Fref%20.%0A%20%20%3Fref%20pr%3AP854%20%3FrefURL%20%0A%20%20FILTER%20(CONTAINS(str(%3FrefURL)%2C%27lefigaro.fr%27))%20.%0A%20%20%3Fsubject%20%3Fp%20%3Fstatement%20.%0A%20%20%3Fproperty%20wikibase%3Aclaim%20%3Fp%20.%0A%20%20%3Fproperty%20wikibase%3AstatementProperty%20%3Fps%20.%0A%20%20%3Fstatement%20%3Fps%20%3Fobject%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%0A%20%20%20%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%22%20.%0A%20%20%7D%0A%7D</> query report], it appears that runtime optimisation has proceeded with the same query sequence as the static optimisation, and similarly times out.

As for the "runFirst" directive, I have not yet been able to work out how to apply it.

Performance falls off a cliff when the number in a group of interest exceeds a certain threshold
This arose in connection with a query to find humans with apparently matching dates of birth and death, for humans from a particular country.
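The France variant of the query, decoded from the explain-report URL in the list below, reads as follows. Unused prefix declarations are omitted and an xsd: declaration is added for self-containment; the query body is unchanged.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX psv: <http://www.wikidata.org/prop/statement/value/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

# Query to find any multiple humans with matching
# day-specific dates of birth and dates of death,
# subject to ?property = P27 having ?value = Q142
SELECT ?birth_date ?death_date (COUNT(DISTINCT ?a) AS ?count) WHERE {
  BIND (wdt:P27 AS ?property) .
  BIND (wd:Q142 AS ?value) .
  ?a wdt:P31 wd:Q5 .
  ?a ?property ?value .

  ?a p:P569 ?birth_date_statement .
  ?birth_date_statement psv:P569 ?birth_date_node .
  ?birth_date_node wikibase:timeValue ?birth_date .
  ?birth_date_node wikibase:timePrecision "11"^^xsd:integer .

  ?a p:P570 ?death_date_statement .
  ?death_date_statement psv:P570 ?death_date_node .
  ?death_date_node wikibase:timeValue ?death_date .
  ?death_date_node wikibase:timePrecision "11"^^xsd:integer .
}
GROUP BY ?birth_date ?death_date
HAVING (?count > 1)
ORDER BY DESC(?count) ?birth_date
```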


 * France (N=125,223):  -- [<tvar|1>https://query.wikidata.org/bigdata/namespace/wdq/sparql?explain&query=PREFIX%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%0APREFIX%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0APREFIX%20wikibase%3A%20%3Chttp%3A%2F%2Fwikiba.se%2Fontology%23%3E%0APREFIX%20p%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2F%3E%0APREFIX%20v%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20q%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fqualifier%2F%3E%0APREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0APREFIX%20ps%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20psv%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2Fvalue%2F%3E%0A%0A%20%20%20%23%20Query%20to%20find%20any%20multiple%20humans%20with%20matching%20%0A%20%20%20%23%20day-specific%20dates%20of%20birth%20and%20dates%20of%20death%2C%0A%20%20%20%23%20subject%20to%20%3Fproperty%20%3D%20P27%20having%20%3Fvalue%20%3D%20Q142%0A%20%20%20%23%20Query%20generated%20by%20%5B%5BTemplate%3AMatchingBirthDeath%5D%5D%0A%0ASELECT%20%3Fbirth_date%20%3Fdeath_date%20(COUNT(DISTINCT%20%3Fa)%20AS%20%3Fcount)%20WHERE%20%7B%0A%20%20%20BIND%20(wdt%3AP27%20AS%20%3Fproperty)%20.%0A%20%20%20BIND%20(wd%3AQ142%20AS%20%3Fvalue)%20.%0A%20%20%20%3Fa%20wdt%3AP31%20wd%3AQ5%20.%0A%20%20%20%3Fa%20%3Fproperty%20%3Fvalue%20.%20%0A%20%20%0A%20%20%20%3Fa%20p%3AP569%20%3Fbirth_date_statement%20.%0A%20%20%20%3Fbirth_date_statement%20psv%3AP569%20%3Fbirth_date_node%20.%0A%20%20%20%3Fbirth_date_node%20wikibase%3AtimeValue%20%3Fbirth_date%20.%0A%20%20%20%3Fbirth_date_node%20wikibase%3AtimePrecision%20%2211%22%5E%5Exsd%3Ainteger%20.%0A%20%20%20%0A%20%20%20%3Fa%20p%3AP570%20%3Fdeath_date_statement%20.%0A%20%20%20%3Fdeath_date_statement%20psv%3AP570%20%3Fdeath_date_node%20.%0A%20%20%20%3Fdeath_date_node%20wikibase%3AtimeValue%20%3Fdeath_date.%0A%20%20%20%3Fdeath_date_node%20wikibase%3AtimePrecision%20%2211%22%5E%5Exsd%3Ai
nteger%20.%0A%0A%7D%20%0AGROUP%20BY%20%3Fbirth_date%20%3Fdeath_date%0AHAVING%20(%3Fcount%20%3E%201)%0AORDER%20BY%20DESC(%3Fcount)%20%3Fbirth_date%0A%23%20LIMIT%2010</> query report] (successful)


 * Germany (N=194,098):  [<tvar|1>https://query.wikidata.org/bigdata/namespace/wdq/sparql?explain&query=PREFIX%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%0APREFIX%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0APREFIX%20wikibase%3A%20%3Chttp%3A%2F%2Fwikiba.se%2Fontology%23%3E%0APREFIX%20p%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2F%3E%0APREFIX%20v%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20q%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fqualifier%2F%3E%0APREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0APREFIX%20ps%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20psv%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2Fvalue%2F%3E%0A%0A%20%20%20%23%20Query%20to%20find%20any%20multiple%20humans%20with%20matching%20%0A%20%20%20%23%20day-specific%20dates%20of%20birth%20and%20dates%20of%20death%2C%0A%20%20%20%23%20subject%20to%20%3Fproperty%20%3D%20P27%20having%20%3Fvalue%20%3D%20Q183%0A%20%20%20%23%20Query%20generated%20by%20%5B%5BTemplate%3AMatchingBirthDeath%5D%5D%0A%0A%20%20%20%20SELECT%20%3Fbirth_date%20%3Fdeath_date%20(COUNT(DISTINCT%20%3Fa)%20AS%20%3Fcount)%20WHERE%20%7B%0A%20%20%20%20%20%20%20%3Fa%20wdt%3AP27%20wd%3AQ183%20.%20%0A%0A%20%20%20%20%20%20%20%3Fa%20p%3AP569%20%3Fbirth_date_statement%20.%0A%20%20%20%20%20%20%20%3Fbirth_date_statement%20psv%3AP569%20%3Fbirth_date_node%20.%0A%20%20%20%20%20%20%20%3Fbirth_date_node%20wikibase%3AtimeValue%20%3Fbirth_date%20.%0A%20%20%20%20%20%20%20%3Fbirth_date_node%20wikibase%3AtimePrecision%20%2211%22%5E%5Exsd%3Ainteger%20.%0A%0A%20%20%20%20%20%20%20%3Fa%20p%3AP570%20%3Fdeath_date_statement%20.%0A%20%20%20%20%20%20%20%3Fdeath_date_statement%20psv%3AP570%20%3Fdeath_date_node%20.%0A%20%20%20%20%20%20%20%3Fdeath_date_node%20wikibase%3AtimeValue%20%3Fdeath_date.%0A%20%20%20%20%20%20%20%3Fdeath_date_node%20wikibase%3AtimePrecision%20%2211%22%5E%5Exsd%3Ainteger%20.%0A%0A%20%20%20%20%20%20%20%3Fa%2
0wdt%3AP31%20wd%3AQ5%20.%0A%0A%20%20%20%20%7D%20%0A%20%20%20%20GROUP%20BY%20%3Fbirth_date%20%3Fdeath_date%0A%20%20%20%20HAVING%20(%3Fcount%20%3E%201)%0A%20%20%20%20ORDER%20BY%20DESC(%3Fcount)%20%3Fbirth_date%20%3Fdeath_date%0A%23LIMIT%2040</> query report] (times out)

What goes wrong here occurs when the number of items with the given value of the property (wdt:P27 in these queries), e.g. 194,098 for Germany, exceeds the number of items (time-value nodes, in fact) with wikibase:timePrecision "11"^^xsd:integer (169,902).

For France, with N=125,223, there is a steady fall-off in the size of the solution set: requiring a date of death, then day-specific precision, then a date of birth, then day-specific precision, knocks this number down to 48,537, which is then reduced to 10 by requiring the date of birth and the date of death to match (a very quick operation).

However, for Germany, with N=194,098, the query optimiser attempts to start with the 169,902 date-nodes with day-specific dates; but the query then explodes when the system tries to match these to all the people with such day-specific birthdates (1,560,806), a number which falls when the nationality filter kicks in, but which still eats up the available query time.

(In fact, having looked at the performance data and seen just how fast Blazegraph can do all-against-all matching for really very big sets, it turns out that, with a query rewrite, the system is fast enough to run this kind of query for everything v everything, by matching on date-nodes first, which knocks down the size of the solution set, and not checking until after that the requirement that the date be day-specific.)

Adding a names and name-labels wrapper to a query for life events makes it time out
Adding an apparently trivial wrapper, to find the name and name-label, causes the query in the immediately previous section to time out, even though without the wrapper it ran with no difficulty.


 * Query as presented:  -- [<tvar|1>https://query.wikidata.org/bigdata/namespace/wdq/sparql?explain&query=PREFIX%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%0APREFIX%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0APREFIX%20wikibase%3A%20%3Chttp%3A%2F%2Fwikiba.se%2Fontology%23%3E%0APREFIX%20p%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2F%3E%0APREFIX%20v%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20q%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fqualifier%2F%3E%0APREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0APREFIX%20ps%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20psv%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2Fvalue%2F%3E%0A%0ASELECT%20%3Fq%20%3FqLabel%20%3Fbirth_date%20%3Fdeath_date%20%3Fcount%20WHERE%20%7B%0A%20%20%3Fq%20wdt%3AP31%20wd%3AQ5%20.%0A%20%20%3Fq%20wdt%3AP27%20wd%3AQ142%20.%20%0A%20%20%3Fq%20wdt%3AP569%20%3Fbirth_date%20.%0A%20%20%3Fq%20wdt%3AP570%20%3Fdeath_date%20.%20%20%20%0A%20%20%7B%0A%20%20%20%20SELECT%20%3Fbirth_date%20%3Fdeath_date%20(COUNT(*)%20AS%20%3Fcount)%20WHERE%20%7B%0A%20%20%20%20%20%20%20%3Fa%20wdt%3AP31%20wd%3AQ5%20.%0A%20%20%20%20%20%20%20%3Fa%20wdt%3AP27%20wd%3AQ142%20.%20%0A%0A%20%20%20%20%20%20%20%3Fa%20p%3AP569%20%3Fbirth_date_statement%20.%0A%20%20%20%20%20%20%20%3Fbirth_date_statement%20psv%3AP569%20%3Fbirth_date_node%20.%0A%20%20%20%20%20%20%20%3Fbirth_date_node%20wikibase%3AtimeValue%20%3Fbirth_date%20.%0A%20%20%20%20%20%20%20%3Fbirth_date_node%20wikibase%3AtimePrecision%20%2211%22%5E%5Exsd%3Ainteger%20.%0A%0A%20%20%20%20%20%20%20%3Fa%20p%3AP570%20%3Fdeath_date_statement%20.%0A%20%20%20%20%20%20%20%3Fdeath_date_statement%20psv%3AP570%20%3Fdeath_date_node%20.%0A%20%20%20%20%20%20%20%3Fdeath_date_node%20wikibase%3AtimeValue%20%3Fdeath_date.%0A%20%20%20%20%20%20%20%3Fdeath_date_node%20wikibase%3AtimePrecision%20%2211%22%5E%5Exsd%3Ainteger%20.%0A%0A%20%20%20%20%7D%20%0A%20%20%20%20GROUP%20BY
%20%3Fbirth_date%20%3Fdeath_date%0A%20%20%20%20HAVING%20(%3Fcount%20%3E%201)%0A%20%20%7D%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%0A%20%20%20%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%22%20.%0A%20%20%7D%0A%7D%0AORDER%20BY%20DESC(%3Fcount)%20%3Fbirth_date%20%3Fdeath_date%0ALIMIT%2040%0A%0A</> query report]


 * Hand-ordered query:  -- [<tvar|1>https://query.wikidata.org/bigdata/namespace/wdq/sparql?explain&query=PREFIX%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%0APREFIX%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0APREFIX%20wikibase%3A%20%3Chttp%3A%2F%2Fwikiba.se%2Fontology%23%3E%0APREFIX%20p%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2F%3E%0APREFIX%20v%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20q%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fqualifier%2F%3E%0APREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0APREFIX%20ps%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20psv%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2Fvalue%2F%3E%0A%0ASELECT%20%3Fq%20%3FqLabel%20%3Fbirth_date%20%3Fdeath_date%20%3Fcount%20WHERE%20%7B%0A%20%20hint%3AQuery%20hint%3Aoptimizer%20%22None%22%20.%20%0A%20%20%7B%0A%20%20%20%20SELECT%20%3Fbirth_date%20%3Fdeath_date%20(COUNT(*)%20AS%20%3Fcount)%20WHERE%20%7B%0A%20%20%20%20%20%20%20%3Fa%20wdt%3AP27%20wd%3AQ142%20.%20%0A%0A%20%20%20%20%20%20%20%3Fa%20p%3AP569%20%3Fbirth_date_statement%20.%0A%20%20%20%20%20%20%20%3Fbirth_date_statement%20psv%3AP569%20%3Fbirth_date_node%20.%0A%20%20%20%20%20%20%20%3Fbirth_date_node%20wikibase%3AtimeValue%20%3Fbirth_date%20.%0A%20%20%20%20%20%20%20%3Fbirth_date_node%20wikibase%3AtimePrecision%20%2211%22%5E%5Exsd%3Ainteger%20.%0A%0A%20%20%20%20%20%20%20%3Fa%20p%3AP570%20%3Fdeath_date_statement%20.%0A%20%20%20%20%20%20%20%3Fdeath_date_statement%20psv%3AP570%20%3Fdeath_date_node%20.%0A%20%20%20%20%20%20%20%3Fdeath_date_node%20wikibase%3AtimeValue%20%3Fdeath_date.%0A%20%20%20%20%20%20%20%3Fdeath_date_node%20wikibase%3AtimePrecision%20%2211%22%5E%5Exsd%3Ainteger%20.%0A%0A%20%20%20%20%20%20%20%3Fa%20wdt%3AP31%20wd%3AQ5%20.%0A%0A%20%20%20%20%7D%20%0A%20%20%20%20GROUP%20BY%20%3Fbirth_date%20%3Fdeath_date%0A%20%20%20%20HAVING%20(%3Fcount%20%3E%201)%0A%20%20%7D%0A%20%20%3Fq%20wdt%3AP569%20%3Fb
irth_date%20.%0A%20%20%3Fq%20wdt%3AP570%20%3Fdeath_date%20.%20%20%20%0A%20%20%3Fq%20wdt%3AP27%20wd%3AQ142%20.%20%0A%20%20%3Fq%20wdt%3AP31%20wd%3AQ5%20.%0A%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%0A%20%20%20%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%22%20.%0A%20%20%7D%0A%7D%0AORDER%20BY%20DESC(%3Fcount)%20%3Fbirth_date%20%3Fdeath_date%0ALIMIT%2040%0A%0A</> query report] (successful)


 * Query without the name lookup:  -- [<tvar|1>https://query.wikidata.org/bigdata/namespace/wdq/sparql?explain&query=PREFIX%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%0APREFIX%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0APREFIX%20wikibase%3A%20%3Chttp%3A%2F%2Fwikiba.se%2Fontology%23%3E%0APREFIX%20p%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2F%3E%0APREFIX%20v%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20q%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fqualifier%2F%3E%0APREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0APREFIX%20ps%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20psv%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2Fvalue%2F%3E%0A%0A%20%20%20%23%20Query%20to%20find%20any%20multiple%20humans%20with%20matching%20%0A%20%20%20%23%20day-specific%20dates%20of%20birth%20and%20dates%20of%20death%2C%0A%20%20%20%23%20subject%20to%20%3Fproperty%20%3D%20P27%20having%20%3Fvalue%20%3D%20Q142%0A%20%20%20%23%20Query%20generated%20by%20%5B%5BTemplate%3AMatchingBirthDeath%5D%5D%0A%0ASELECT%20%3Fbirth_date%20%3Fdeath_date%20(COUNT(DISTINCT%20%3Fa)%20AS%20%3Fcount)%20WHERE%20%7B%0A%20%20%20BIND%20(wdt%3AP27%20AS%20%3Fproperty)%20.%0A%20%20%20BIND%20(wd%3AQ142%20AS%20%3Fvalue)%20.%0A%20%20%20%3Fa%20wdt%3AP31%20wd%3AQ5%20.%0A%20%20%20%3Fa%20%3Fproperty%20%3Fvalue%20.%20%0A%20%20%0A%20%20%20%3Fa%20p%3AP569%20%3Fbirth_date_statement%20.%0A%20%20%20%3Fbirth_date_statement%20psv%3AP569%20%3Fbirth_date_node%20.%0A%20%20%20%3Fbirth_date_node%20wikibase%3AtimeValue%20%3Fbirth_date%20.%0A%20%20%20%3Fbirth_date_node%20wikibase%3AtimePrecision%20%2211%22%5E%5Exsd%3Ainteger%20.%0A%20%20%20%0A%20%20%20%3Fa%20p%3AP570%20%3Fdeath_date_statement%20.%0A%20%20%20%3Fdeath_date_statement%20psv%3AP570%20%3Fdeath_date_node%20.%0A%20%20%20%3Fdeath_date_node%20wikibase%3AtimeValue%20%3Fdeath_date.%0A%20%20%20%3Fdeath_date_node%20wikibase%3AtimePrecision%20%2211%22%5
E%5Exsd%3Ainteger%20.%0A%0A%7D%20%0AGROUP%20BY%20%3Fbirth_date%20%3Fdeath_date%0AHAVING%20(%3Fcount%20%3E%201)%0AORDER%20BY%20DESC(%3Fcount)%20%3Fbirth_date%0A%23%20LIMIT%2010</> query report] (successful)

With the outer layer in place, the inner layer now starts (even for France) by extracting the birth-nodes that are connected to birth-dates, and then keeping those which have precision=11.

The situation can be fixed by turning off the query optimiser and hand-ordering the query.
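The hand-ordered query, decoded from the "Hand-ordered query" report URL in the list above, reads as follows. Unused prefix declarations are omitted, and the xsd:, bd: and hint: declarations are additions for self-containment.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX psv: <http://www.wikidata.org/prop/statement/value/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX hint: <http://www.bigdata.com/queryHints#>

SELECT ?q ?qLabel ?birth_date ?death_date ?count WHERE {
  # Optimiser off: the inner SELECT runs first, in the order written
  hint:Query hint:optimizer "None" .
  {
    SELECT ?birth_date ?death_date (COUNT(*) AS ?count) WHERE {
      ?a wdt:P27 wd:Q142 .

      ?a p:P569 ?birth_date_statement .
      ?birth_date_statement psv:P569 ?birth_date_node .
      ?birth_date_node wikibase:timeValue ?birth_date .
      ?birth_date_node wikibase:timePrecision "11"^^xsd:integer .

      ?a p:P570 ?death_date_statement .
      ?death_date_statement psv:P570 ?death_date_node .
      ?death_date_node wikibase:timeValue ?death_date .
      ?death_date_node wikibase:timePrecision "11"^^xsd:integer .

      ?a wdt:P31 wd:Q5 .
    }
    GROUP BY ?birth_date ?death_date
    HAVING (?count > 1)
  }
  # Outer layer: look up the people behind each matching date pair
  ?q wdt:P569 ?birth_date .
  ?q wdt:P570 ?death_date .
  ?q wdt:P27 wd:Q142 .
  ?q wdt:P31 wd:Q5 .

  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
ORDER BY DESC(?count) ?birth_date ?death_date
LIMIT 40
```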

It seems this may be a general feature of queries with sub-select clauses: both here and in the next query the optimiser seems to want to start the sub-query with a requirement that includes one of the variables that will be projected to the outer layer.

Adding a property-labels wrapper to a query for properties makes it time out

 * Query as presented:  -- [<tvar|1>https://query.wikidata.org/bigdata/namespace/wdq/sparql?explain&query=PREFIX%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%0APREFIX%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0APREFIX%20wikibase%3A%20%3Chttp%3A%2F%2Fwikiba.se%2Fontology%23%3E%0APREFIX%20p%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2F%3E%0APREFIX%20v%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20q%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fqualifier%2F%3E%0APREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0APREFIX%20ps%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20psv%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2Fvalue%2F%3E%0A%0A%20%20%23%20Query%20to%20find%20the%20properties%20that%20most%20often%20connect%20to%0A%20%20%23%20items%20in%20the%20class%20%3Ftgt_class%20%3D%20Q4167410%0A%20%20%23%20Query%20generated%20by%20%5B%5BTemplate%3AInboundPropertiesToClass%5D%5D%0A%0A%20SELECT%20%3Fproperty%20%3FpropertyLabel%20%3Fcount%20WHERE%20%7B%0A%20%20%3Fproperty%20wikibase%3AdirectClaim%20%3Fp%0A%20%20%7B%0A%20%20%20%20SELECT%20%3Fp%20(COUNT(*)%20AS%20%3Fcount)%20WHERE%20%7B%0A%20%20%20%20%20%20%20%3Fa%20%3Fp%20%3Fb%20.%0A%20%20%20%20%20%20%20%3Fb%20wdt%3AP31%20%3Ftgt_class%20.%0A%20%20%20%20%20%20%20BIND%20(wd%3AQ4167410%20AS%20%3Ftgt_class)%0A%20%20%20%20%7D%20GROUP%20BY%20%3Fp%20LIMIT%20100%0A%20%20%7D%20.%20%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%0A%20%20%20%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%22%20.%0A%20%20%7D%0A%7D%0A%23%20HAVING%20(%3Fcount%20%3E%201)%0AORDER%20BY%20DESC(%3Fcount)%20%23%20%3FpLabel%0A%23%20LIMIT%20100</> query report]


 * Hand-ordered query:  -- [https://query.wikidata.org/bigdata/namespace/wdq/sparql?explain&query=PREFIX%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%0APREFIX%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0APREFIX%20wikibase%3A%20%3Chttp%3A%2F%2Fwikiba.se%2Fontology%23%3E%0APREFIX%20p%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2F%3E%0APREFIX%20v%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20q%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fqualifier%2F%3E%0APREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0APREFIX%20ps%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20psv%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2Fvalue%2F%3E%0A%0A%20%20%23%20Query%20to%20find%20the%20properties%20that%20most%20often%20connect%20to%0A%20%20%23%20items%20in%20the%20class%20%3Ftgt_class%20%3D%20Q4167410%0A%20%20%23%20Query%20generated%20by%20%5B%5BTemplate%3AInboundPropertiesToClass%5D%5D%0A%0A%20SELECT%20%3Fproperty%20%3FpropertyLabel%20%3Fcount%20WHERE%20%7B%0A%20%20hint%3AQuery%20hint%3Aoptimizer%20%22None%22%20.%20%0A%20%20%7B%0A%20%20%20%20SELECT%20%3Fp%20(COUNT(*)%20AS%20%3Fcount)%20WHERE%20%7B%0A%20%20%20%20%20%20%20BIND%20(wd%3AQ4167410%20AS%20%3Ftgt_class)%20.%0A%20%20%20%20%20%20%20%3Fb%20wdt%3AP31%20%3Ftgt_class%20.%0A%20%20%20%20%20%20%20%3Fa%20%3Fp%20%3Fb%20.%0A%20%20%20%20%7D%20GROUP%20BY%20%3Fp%20LIMIT%20100%0A%20%20%7D%20.%20%0A%20%20%3Fproperty%20wikibase%3AdirectClaim%20%3Fp%20.%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%0A%20%20%20%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%22%20.%0A%20%20%7D%0A%7D%0A%23%20HAVING%20(%3Fcount%20%3E%201)%0AORDER%20BY%20DESC(%3Fcount)%20%23%20%3FpLabel%0A%23%20LIMIT%20100 query report] (successful)


 * Query without property-label lookup: <tvar|url> </> -- [<tvar|1>https://query.wikidata.org/bigdata/namespace/wdq/sparql?explain&query=PREFIX%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%0APREFIX%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0APREFIX%20wikibase%3A%20%3Chttp%3A%2F%2Fwikiba.se%2Fontology%23%3E%0APREFIX%20p%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2F%3E%0APREFIX%20v%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20q%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fqualifier%2F%3E%0APREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0APREFIX%20ps%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20psv%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2Fvalue%2F%3E%0A%0A%20%20%23%20Query%20to%20find%20the%20properties%20that%20most%20often%20connect%20to%0A%20%20%23%20items%20in%20the%20class%20%3Ftgt_class%20%3D%20Q4167410%0A%20%20%23%20Query%20generated%20by%20%5B%5BTemplate%3AInboundPropertiesToClass%5D%5D%0A%0A%23%20SELECT%20%3Fp%20%3FpLabel%20%3Fcount%20WHERE%20%7B%0A%23%20%20%3Fproperty%20wikibase%3AdirectClaim%20%3Fp%0A%23%20%20%7B%0A%20%20%20%20SELECT%20%3Fp%20(COUNT(*)%20AS%20%3Fcount)%20WHERE%20%7B%0A%20%20%20%20%20%20%20%3Fa%20%3Fp%20%3Fb%20.%0A%20%20%20%20%20%20%20%3Fb%20wdt%3AP31%20%3Ftgt_class%20.%0A%20%20%20%20%20%20%20BIND%20(wd%3AQ4167410%20AS%20%3Ftgt_class)%0A%20%20%20%20%7D%20GROUP%20BY%20%3Fp%0A%23%20%20%7D%20.%20%0A%23%20%20SERVICE%20wikibase%3Alabel%20%7B%0A%23%20%20%20%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%22%20.%0A%23%20%20%7D%0A%23%7D%0A%23%20HAVING%20(%3Fcount%20%3E%201)%0AORDER%20BY%20DESC(%3Fcount)%20%23%20%3FpLabel%0ALIMIT%20100</> query report] (successful)

Same as the above: for the query including property look-ups, the optimiser apparently wants to start with all triples ?a ?p ?b and then join ?b wdt:P31 ?tgt_class, rather than the other way round (which is what it manages to do successfully for the smaller query with no property look-up).
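For comparison, the hand-ordered version, decoded from the "Hand-ordered query" report URL in the list above, puts the class restriction before the open ?a ?p ?b pattern and switches the optimiser off. Unused prefix declarations and commented-out lines are omitted, and the bd: and hint: declarations are additions for self-containment.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>
PREFIX hint: <http://www.bigdata.com/queryHints#>

# Query to find the properties that most often connect to
# items in the class ?tgt_class = Q4167410
SELECT ?property ?propertyLabel ?count WHERE {
  hint:Query hint:optimizer "None" .
  {
    SELECT ?p (COUNT(*) AS ?count) WHERE {
      BIND (wd:Q4167410 AS ?tgt_class) .
      ?b wdt:P31 ?tgt_class .   # members of the class first
      ?a ?p ?b .                # then everything that points at them
    } GROUP BY ?p LIMIT 100
  } .
  ?property wikibase:directClaim ?p .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
ORDER BY DESC(?count)
```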

This doesn't appear to be an artifact caused by BIND, nor by the order in which the clauses are presented: even if these are changed, the query still runs slowly with the built-in optimiser.


 * Alt query: <tvar|url> </> -- [<tvar|1>https://query.wikidata.org/bigdata/namespace/wdq/sparql?explain&query=PREFIX%20wd%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fentity%2F%3E%0APREFIX%20wdt%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%3E%0APREFIX%20wikibase%3A%20%3Chttp%3A%2F%2Fwikiba.se%2Fontology%23%3E%0APREFIX%20p%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2F%3E%0APREFIX%20v%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20q%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fqualifier%2F%3E%0APREFIX%20rdfs%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0APREFIX%20ps%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2F%3E%0APREFIX%20psv%3A%20%3Chttp%3A%2F%2Fwww.wikidata.org%2Fprop%2Fstatement%2Fvalue%2F%3E%0A%0A%20%20%23%20Query%20to%20find%20the%20properties%20that%20most%20often%20connect%20to%0A%20%20%23%20items%20in%20the%20class%20%3Ftgt_class%20%3D%20Q4167410%0A%20%20%23%20Query%20generated%20by%20%5B%5BTemplate%3AInboundPropertiesToClass%5D%5D%0A%0A%20SELECT%20%3Fproperty%20%3FpropertyLabel%20%3Fcount%20WHERE%20%7B%0A%20%20%3Fproperty%20wikibase%3AdirectClaim%20%3Fp%0A%20%20%7B%0A%20%20%20%20SELECT%20%3Fp%20(COUNT(*)%20AS%20%3Fcount)%20WHERE%20%7B%0A%20%20%20%20%20%20%20%3Fb%20wdt%3AP31%20wd%3AQ4167410%20.%0A%20%20%20%20%20%20%20%3Fa%20%3Fp%20%3Fb%20.%0A%20%20%20%20%7D%20GROUP%20BY%20%3Fp%20LIMIT%20100%0A%20%20%7D%20.%20%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%0A%20%20%20%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%22%20.%0A%20%20%7D%0A%7D%0A%23%20HAVING%20(%3Fcount%20%3E%201)%0AORDER%20BY%20DESC(%3Fcount)%20%23%20%3FpLabel%0A%23%20LIMIT%20100</> query report] (still fails)

It is still preferring to start with the assertion ?a ?p ?b, even though this is less specific, apparently because ?p is one of the variables projected out of the inner SELECT statement.

More... ?
Add further cases here