
Retrieving all items that are instances or sub-instances or sub-sub-instances in a chain


I need a logical OR of all valid statements about an item or its parents (instance-parent or subclass-parent).

Example: Q957653 is an instance of Q3184121, and the latter satisfies ?item P17 Q155. So the chain Q957653 P31 Q3184121 P17 Q155 is satisfied. I need something like

    ?item P17 Q155
    | ?item P31 ?x P17 Q155
    | ?item P31 ?x P31 ?y P17 Q155
    | ?item P279 ?x P17 Q155
    | ?item P279 ?x P31 ?y P17 Q155
    | ?item P279 ?x P279 ?y P17 Q155

A big logical "or" over all possible chains of P31 or P279 dependencies.


Real example

I need a list of items that have some property (e.g. ?item wdt:P402 _:b0.) and are instances or subclasses of items with another property, e.g. wdt:P17 wd:Q155.

The "first level", ?item wdt:P17 wd:Q155, works fine:

SELECT DISTINCT ?item ?osm_relid ?itemLabel 
WHERE {
  ?item wdt:P402 _:b0.
  ?item wdt:P17 wd:Q155.
  OPTIONAL { ?item wdt:P1448 ?name. }
  OPTIONAL { ?item wdt:P402 ?osm_relid .}
  SERVICE wikibase:label { 
      bd:serviceParam wikibase:language "en,[AUTO_LANGUAGE]". 
  }
}

But how do I express the union (the logical "or") of all the other possible dependencies?

Edit/Notes

Suppose that ?item wdt:P31*/wdt:P279* wd:Qxx . gives all the "chain dependencies" on some Qxx, as I need. But Qxx is itself the result of a query, so in pseudo-syntax:
?item wdt:P31*/wdt:P279* (?xx wdt:P17 wd:Q155) ..

... A solution (!) seems to be

SELECT  (COUNT(DISTINCT ?item) AS ?count) 
WHERE {
  ?item wdt:P402 _:b0.
  ?item  (wdt:P31*|wdt:P279*)/wdt:P17 wd:Q155 .
}

but I can't check it because it is too time-consuming.

... Something perhaps closer to a "feasible solution" is
?item wdt:P31*/wdt:P279*/wdt:P31*/wdt:P17 wd:Q155 .
... After testing the feasible variants, wdt:P31*/wdt:P279*/wdt:P17 seems to be the only one that runs without timing out.
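
For reference, the path expressions tried above do not match the same chains under SPARQL 1.1 property path semantics. The sketch below probes the three shapes against the single example item from the question. It is untested against the live endpoint, the variable names ?sameKindOnly, ?instanceThenSubclass and ?anyMix are made up for illustration, and it assumes the endpoint accepts EXISTS inside BIND.

```sparql
# Prefixes are predefined on the Wikidata Query Service; declared here for completeness.
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?sameKindOnly ?instanceThenSubclass ?anyMix WHERE {
  # Chains made only of P31 steps, or only of P279 steps, then P17
  BIND(EXISTS { wd:Q957653 (wdt:P31*|wdt:P279*)/wdt:P17 wd:Q155 } AS ?sameKindOnly)
  # Zero or more P31 steps followed by zero or more P279 steps, then P17
  BIND(EXISTS { wd:Q957653 wdt:P31*/wdt:P279*/wdt:P17 wd:Q155 } AS ?instanceThenSubclass)
  # Any interleaving of P31 and P279 steps, then P17
  BIND(EXISTS { wd:Q957653 (wdt:P31|wdt:P279)*/wdt:P17 wd:Q155 } AS ?anyMix)
}
```

Only the last form, (wdt:P31|wdt:P279)*, covers every chain listed at the top of the question, including ones that follow a P279 step with a P31 step.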


Solution

  • To improve performance, you can use Blazegraph query hints, which let you modify the auto-generated query execution plan.

    SELECT DISTINCT ?item ?itemLabel ?osm_relid ?name {
      ?itemi wdt:P17 wd:Q155 .
      hint:Prior hint:runFirst true .
      ?item (wdt:P31|wdt:P279)* ?itemi .
      ?item wdt:P625 [].
      OPTIONAL { ?item wdt:P1448 ?name. }
      OPTIONAL { ?item wdt:P402 ?osm_relid .}
      SERVICE wikibase:label { 
          bd:serviceParam wikibase:language "en,[AUTO_LANGUAGE]". 
      }
    }
    


    To see what the query execution plan looks like, add &explain to the query URL and scroll down.

    Please note that you can't use hint:Prior hint:runLast true from the original comment when the label service is used: there can be only one such hint in any graph pattern group.
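
As a rough sketch, the same hint could be applied to the counting query from the question, which filters on wdt:P402 rather than wdt:P625. This is an untested adaptation, not part of the answer's query, and it may still time out depending on how many items match. The hint: prefix is predefined on the Wikidata Query Service and is declared here only for completeness.

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX hint: <http://www.bigdata.com/queryHints#>

SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {
  # Start from the (comparatively few) items whose country is Q155
  ?itemi wdt:P17 wd:Q155 .
  hint:Prior hint:runFirst true .
  # Walk instance-of / subclass-of links, in any mix, down to ?item
  ?item (wdt:P31|wdt:P279)* ?itemi .
  # Keep only items that have the P402 property from the question
  ?item wdt:P402 [] .
}
```

Because the path uses *, ?item can also be ?itemi itself, so the "first level" case from the question is included in the count.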