sparql, fuseki, named-graphs

Querying multiple graphs with aggregate Count and graph in results


Is it possible to count the occurrences of triples in multiple named graphs and return the results as rows in a table, one row per graph? For example:

   ?g    ?count  ?sequence_count
-------- ------- ---------------
 graph1    54          54
 graph2    120         80

Here is the query that I tried.

SELECT ?g ?count ?sequence_count     
    FROM NAMED <graph1>
    FROM NAMED <graph2>
WHERE {
    { 
       select (COUNT(?identifier) as ?count) (COUNT(?sequence) as ?sequence_count) 
       WHERE { GRAPH ?g { 
           ?identifier a <http://www.w3.org/2000/01/rdf-schema#Resource> . 
           OPTIONAL { ?identifier <urn:sequence> ?sequence } 
       } }
    }
}

But the results were:

   ?g    ?count  ?sequence_count
-------- ------- ---------------
           174          134

I'm trying to avoid having to write out:

select ?count_graph1 ?sequence_count_graph1 ?count_graph2 ...

as there could be hundreds of graphs to query.


Solution

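  • The query is very close. First, move the SELECT inside the GRAPH block, which effectively says 'for each graph, compute these aggregate values'. In the original query the aggregates are computed over all named graphs at once, and ?g is not projected by the subquery, so it comes back unbound. Second, if an ?identifier has more than one <urn:sequence> value, the OPTIONAL produces one row per value and COUNT(?identifier) counts that identifier repeatedly, so COUNT(DISTINCT ?identifier) is needed. Try the following: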

    SELECT ?g ?count ?sequence_count
      FROM NAMED <graph1>
      FROM NAMED <graph2>
    WHERE {
       # ?g is bound to each named graph in turn
       GRAPH ?g {
          # The aggregates are evaluated separately inside each graph
          SELECT (COUNT(DISTINCT ?identifier) AS ?count) (COUNT(?sequence) AS ?sequence_count)
          WHERE {
             ?identifier a <http://www.w3.org/2000/01/rdf-schema#Resource> . 
             OPTIONAL { ?identifier <urn:sequence> ?sequence }
          }
       }
    }
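
    As an alternative sketch, not part of the original answer, the same per-graph counts can usually be expressed without a subquery by grouping on ?g. This assumes the same two named graphs and the same patterns as above; the ORDER BY is only added here to make the row order predictable:

    SELECT ?g (COUNT(DISTINCT ?identifier) AS ?count) (COUNT(?sequence) AS ?sequence_count)
      FROM NAMED <graph1>
      FROM NAMED <graph2>
    WHERE {
       GRAPH ?g {
          ?identifier a <http://www.w3.org/2000/01/rdf-schema#Resource> .
          # One row per sequence value, or a single row with ?sequence unbound
          OPTIONAL { ?identifier <urn:sequence> ?sequence }
       }
    }
    # One result row per named graph that has at least one match
    GROUP BY ?g
    ORDER BY ?g

    Either form avoids listing a pair of count variables per graph, so adding more graphs only means adding more FROM NAMED clauses. Note that in the GROUP BY form a graph with no matching resources produces no row at all.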