Wikidata query service/Problematic queries

From MediaWiki.org
Jump to: navigation, search

This page lists queries which have some problems running in Blazegraph. Usually that means that we think they should be running fine but for one reason or another they do not. Note that this should not include queries that time out for legitimate reasons - like trying to scan repeatedly through whole database in 30 seconds or missing data triples, etc.

Buggy queries[edit]

Queries that do not return correct result, throw unexpected exceptions, etc.

Empty response[edit]

SELECT ?item WHERE {
  ?item wdt:P22 ?sub0 .
  ?tree0 (wdt:P40)* ?sub0 .
  BIND (wd:Q7210 AS ?tree0)
}

Try it! This query produces a completely empty response.

Empty response[edit]

The following query should give me the events that happened on the same day as today. It works though when I remove the label function & article URL lookup.

({SPARQL|query= SELECT distinct ?article ?event ?date WHERE

{ ?event wdt:P31/wdt:P279* wd:Q1190554. ?event wdt:P585 ?date. ?event rdfs:label ?eventLabel .
        ?article schema:about ?event .
   ?article schema:inLanguage "en" .
   ?article schema:isPartOf <https://en.wikipedia.org/>.
   SERVICE wikibase:label {
      bd:serviceParam wikibase:language "en" .
   }     

# events on the same day #FILTER (datatype(?date) = xsd:dateTime && month(?date) = month(now()) && day(?date) = day(now()) ) } LIMIT 5

})

This query gives me only a timeout.

Hard queries[edit]

These queries time out or take a lot of time despite the fact that they should be simple. Each query should be accompanied with an explanation why we think it is taking more time than it should.

Cities in Saxony[edit]

SELECT DISTINCT ?city WHERE {
  ?city wdt:P31/wdt:P279* wd:Q515 .  # find instances of subclasses of city
  ?city wdt:P131* wd:Q1202 .              # ...located in Saxony.
}

Try it!

workaround as of 2016-12-17, based on a solution for a related problem:

SELECT DISTINCT ?city WHERE { 
  {
    SELECT ?city WHERE {
      ?city wdt:P31/wdt:P279* wd:Q515 .
    }
  }
  ?city wdt:P131* wd:Q1202 .
}

Try it!

UNION + BIND[edit]

SELECT ?entity ?entityS ?image ?event (year(?date) as ?year) WHERE {  # afficher label puis lien wikidata

   {
      ?entityS wdt:P569 ?date .                           # date de naissance
     BIND ('naissance' AS ?event) .
   } UNION {                                              # OR
      ?entityS wdt:P570 ?date .                           # date de décès
      BIND ('décès' AS ?event) .
   } .
   {
      ?entityS wdt:P119 wd:Q311  .                        # cimetière du Père-Lachaise   
   } UNION {                                              # OR (claim[119:311,119:3006253])
      ?entityS wdt:P119 wd:Q3006253                       # columbarium du Père-Lachaise
   }          
   OPTIONAL { ?entityS wdt:P1442 ?image } .            # affiche l'image de la tombe si elle existe            
            
   SERVICE wikibase:label {
      bd:serviceParam wikibase:language "fr,en" .       # afficher label français sinon label anglais
      ?entityS rdfs:label ?entity
   } 
   FILTER (datatype(?date) = xsd:dateTime)      
   FILTER (month(?date) = month(now()))
   FILTER (day(?date) = day(now()))           
 }
ORDER BY ?date                                          # tri par date
LIMIT 100

Try it!

The query works fine if BIND statements are eliminated. BINDs should not add to query execution time since they are static data.

Number of items in class (BIND + PATH)[edit]

SELECT (COUNT(DISTINCT ?a) AS ?count) WHERE {
       BIND (wd:Q3305213 AS ?class) .    # paintings
       ?a wdt:P31/wdt:P279* ?class .
}

Try it!

As BIND is a constant here, it should not affect speed, however without bind the query is fast, but with BIND it times out.

Nested subqueries[edit]

SELECT ?q ?birth_date ?death_date ?count WHERE {
  {
    SELECT ?birth_date ?death_date (COUNT(*) AS ?count) WHERE {
       ?a wdt:P31 wd:Q5 .
       ?a wdt:P27 wd:Q41 . 

       ?a p:P569 ?birth_date_statement .
       ?birth_date_statement psv:P569 ?birth_date_node .
       ?birth_date_node wikibase:timeValue ?birth_date .

       ?a p:P570 ?death_date_statement .
       ?death_date_statement psv:P570 ?death_date_node .
       ?death_date_node wikibase:timeValue ?death_date.
    } 
    GROUP BY ?birth_date ?death_date
    HAVING (?count > 1)
	LIMIT 1
  }
  ?q wdt:P569 ?birth_date .
}
LIMIT 40

Try it!

Internal subquery finishes in 465 ms and by definition returns one result. However, whole query times out even though external query should be a very simple lookup.

Number of statements in the database by summing each wikibase:statements[edit]

SELECT (sum(?statmt) as ?sum) WHERE {
   [] wikibase:statements ?statmt
}

Try it! It was possible to get the result of this query a month ago

Now we reach timeout

Timeout statement is reached[edit]

So I am trying this semi-simple query and getting timeout. Any ideas what to do (other than limiting the query size)?

SELECT ?personLabel ?birthPlaceLabel ?coordinates

WHERE { 
    ?person wdt:P106 ?Q11063. 
    ?person wdt:P19 ?birthPlace. 
    ?birthPlace wdt:P625 ?coordinates .
    SERVICE wikibase:label {
        bd:serviceParam wikibase:language "en" .
    }
}

Try it!

Server error: Unexpected end of JSON input[edit]

A variation of the previous query, returns "Server error: Unexpected end of JSON input".

SELECT ?person ?birthPlace

WHERE { 
    ?person wdt:P106 ?Q4964182. 
    ?person wdt:P19 ?birthPlace
}

Try it!

Optimizer failures[edit]

These queries work well with optimizer disabled but not by default.

Aliases of properties which are used more than once[edit]

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?property ?alias ?occurences 
WHERE
{
  hint:Query hint:optimizer "None" .

  {
    SELECT ?alias (COUNT (?alias) as ?occurences) WHERE {	
      ?tmp a wikibase:Property .
      ?tmp skos:altLabel ?alias FILTER (LANG (?alias) = "en")
    } GROUP BY ?alias
  }

  ?property a wikibase:Property .
  ?property skos:altLabel ?alias FILTER (?occurences > 1) .
}
ORDER BY ?alias

Try it!

Deprecated statements[edit]

SELECT ?wdLabel ?wdPLabel ?valueLabel ?wd ?wdP ?wds ?value 
WHERE
{
  hint:Query hint:optimizer "None" .
  ?wds wikibase:rank wikibase:DeprecatedRank .
  ?wd ?p ?wds .
  ?wds ?v ?value .
  ?wdP wikibase:statementProperty ?v .
  ?wdP wikibase:claim ?p .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
LIMIT 100

Try it!

u

References to Le Figaro[edit]

SELECT ?statement ?subjectLabel ?subject ?propertyLabel ?property ?objectLabel ?object ?refURL WHERE {   
  hint:Query hint:optimizer "None" .   
  ?ref pr:P854 ?refURL    
  FILTER (CONTAINS(str(?refURL),'lefigaro.fr')) .    
  ?statement prov:wasDerivedFrom ?ref .   
  ?subject ?p ?statement . 
  ?property wikibase:claim ?p .
   ?property wikibase:statementProperty ?ps .
   ?statement ?ps ?object .
   SERVICE wikibase:label {
     bd:serviceParam wikibase:language "en" .   
   } 
}

Try it!

Most frequent properties by class[edit]

# Query to find the properties that most often connect to
# items in the class ?tgt_class = Q4167410
# Query generated by [[Template:InboundPropertiesToClass]]

 SELECT ?property ?propertyLabel ?count WHERE {
  hint:Query hint:optimizer "None" . 
  {
    SELECT ?p (COUNT(*) AS ?count) WHERE {
       BIND (wd:Q4167410 AS ?tgt_class) .
       ?b wdt:P31 ?tgt_class .
       ?a ?p ?b .
    } GROUP BY ?p LIMIT 100
  } . 
  ?property wikibase:directClaim ?p .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
  }
}
ORDER BY DESC(?count) # ?pLabel

Try it!

Human settlement within area of 80 km around London[edit]

#defaultView:Map
SELECT ?place ?placeLabel ?location
WHERE
{
  hint:Query hint:optimizer "None" .
  # Berlin coordinates
  wd:Q84 wdt:P625 ?singLoc . 
  SERVICE wikibase:around { 
      ?place wdt:P625 ?location . 
      bd:serviceParam wikibase:center ?singLoc . 
      bd:serviceParam wikibase:radius "80" . 
  } 
  # Is a human settlement
  ?place wdt:P31/wdt:P279* wd:Q486972 .
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" . 
  }
}

Try it!