sparql, rdf, wikidata

Use ordered query results in another query without separate processing?


How can I access predicate + object pairs from a sorted result list by their row number, so that I can use them in a new query?

Example query from Wikidata:

SELECT ?p ?o (COUNT(?o) AS ?oCount) WHERE {
  ?s ?p ?o.                 #return every predicate + object...
  ?s wdt:P31 wd:Q5.         #...for humans
  ?s wdt:P19 wd:Q37100.     #...born in Auckland (NZL)
  ?s wdt:P21 wd:Q6581097.   #...which are male
}
GROUP BY ?p ?o              #group by the pair only, not by the aggregate ?oCount
ORDER BY DESC(?oCount)      #order by most common predicate/object combination


These are some of the first results (edited for clarity):

p                           o                                 oCount
--------------------------------------------------------------------
wdt:P19 (place of birth)    wd:Q37100 (Auckland)              1083
wdt:P21 (gender)            wd:Q6581097 (male)                1083
wdt:P31 (instance of)       wd:Q5 (human)                     1083
wdt:P27 (country)           wd:Q664 (New Zealand)             844
(...)
wdt:P106 (occupation)       wd:Q14373094 (rugby player)       202
(...)
wdt:P69 (educated at)       wd:Q492467 (Univ. of Auckland)    62

Now I want to run the following query:
Return all subjects that have the nth predicate + object pair of this result list.

My actual use case is combining multiple rows of these results with 'AND' or 'OR', e.g. return all subjects that have the first 5 predicate + object pairs of this result list.

Knowing how to access certain rows from the results would be a good start. I don't want to use a separate query by reading the results (with a script) and using them as input for a second query, if possible.


Solution

  • Knowing how to access certain rows from the results would be a good start. I don't want to use a separate query by reading the results (with a script) and using them as input for a second query, if possible.

    That's not really possible. You can use order by in a subquery, but the outer query doesn't see the results as an ordered list; they're just provided to it as solutions to join against. However, if you really want just the nth result, you can combine order by with limit and offset in the subquery, and that may be sufficient. For instance, you'd do something like the following (deliberately simplified for the sake of example, with the sample data, the query, and its results shown in that order for completeness):

    @prefix : <urn:ex:> .
    
    :p :hasRank 1 .
    :q :hasRank 2 .
    :r :hasRank 3 .
    :s :hasRank 4 .
    :t :hasRank 5 .
    :u :hasRank 6 .
    :v :hasRank 7 .
    :x :hasRank 8 .
    
    :a :s :s0, :s1, :s2 ;
       :t :t0, :t1 ;
       :u :u0, :u1, :u2, :u3 .
    
    prefix : <urn:ex:>
    
    select ?s ?p ?o {
      #-- find the fifth ?p by rank
      { select ?p {
          ?p :hasRank ?rank
        }
        order by ?rank
        limit 1
        offset 4
      }
    
      #-- get triples using ?p
      ?s ?p ?o .
    }
    
    -----------------
    | s  | p  | o   |
    =================
    | :a | :t | :t1 |
    | :a | :t | :t0 |
    -----------------
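
    Applied to the Wikidata query from the question, the same subquery pattern might look like the sketch below. It assumes the wd: and wdt: prefixes are predeclared (as they are on the Wikidata Query Service), uses ?person as a made-up inner variable name, and adds ?p and ?o as tie-breakers, because several pairs share the same count and their relative order is otherwise unspecified. It is untested against the live endpoint, where an aggregate this broad may time out.

    ```sparql
    SELECT ?s WHERE {
      #-- find the fourth most common predicate/object pair
      { SELECT ?p ?o (COUNT(?person) AS ?oCount) WHERE {
          ?person ?p ?o ;                #?person is a hypothetical name, local to the subquery
                  wdt:P31 wd:Q5 ;        #...for humans
                  wdt:P19 wd:Q37100 ;    #...born in Auckland (NZL)
                  wdt:P21 wd:Q6581097 .  #...which are male
        }
        GROUP BY ?p ?o
        ORDER BY DESC(?oCount) ?p ?o     #tie-breakers added for a stable order (assumption)
        LIMIT 1
        OFFSET 3
      }

      #-- get all subjects that have this pair
      ?s ?p ?o .
    }
    ```

    Dropping the offset and raising the limit to 5 would make the outer pattern match any of the first five pairs, which amounts to an 'OR' (use SELECT DISTINCT ?s to avoid repeated subjects). For an 'AND', one option is to keep that change and add GROUP BY ?s HAVING (COUNT(*) = 5) to the outer query, so that only subjects matching all five pairs remain.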