SPARQL count unique value combinations


I have been working on a SPARQL query to count unique value combinations in my graph store, but I have not succeeded.

Basically, what I am trying to do is this:

a b c
e f g
e r t
a b c
k l m
e f g
a b c


result:
a b c | 3
e f g | 2
e r t | 1
k l m | 1

I have tried several constructions with DISTINCT, GROUP BY, and subqueries, but none of them gives this result.

Last Try:

    PREFIX vocab: <http://vocab.example.com/>

    SELECT (COUNT(*) AS ?n) {
      SELECT DISTINCT ?value1 ?value2 ?value3 WHERE {
        ?instance vocab:relate ?value1 .
        ?instance vocab:relate ?value2 .
        ?instance vocab:relate ?value3 .
      }
    }

RDF:

<http://test.example.com/instance1>
        a       <http://test.example.com#Instance> ;
        <http://vocab.example.com/relate>
                <http://test.example.com/c> , <http://test.example.com/b> , <http://test.example.com/a> .

<http://test.example.com/instance6>
        a       <http://test.example.com#Instance> ;
        <http://vocab.example.com/relate>
                <http://test.example.com/g> , <http://test.example.com/f> , <http://test.example.com/e> .

<http://test.example.com/instance4>
        a       <http://test.example.com#Instance> ;
        <http://vocab.example.com/relate>
                <http://test.example.com/c> , <http://test.example.com/b> , <http://test.example.com/a> .

<http://test.example.com/instance2>
        a       <http://test.example.com#Instance> ;
        <http://vocab.example.com/relate>
                <http://test.example.com/g> , <http://test.example.com/f> , <http://test.example.com/e> .

<http://test.example.com/instance7>
        a       <http://test.example.com#Instance> ;
        <http://vocab.example.com/relate>
                <http://test.example.com/c> , <http://test.example.com/b> , <http://test.example.com/a> .

<http://test.example.com/instance5>
        a       <http://test.example.com#Instance> ;
        <http://vocab.example.com/relate>
                <http://test.example.com/m> , <http://test.example.com/l> , <http://test.example.com/k> .

<http://test.example.com/instance3>
        a       <http://test.example.com#Instance> ;
        <http://vocab.example.com/relate>
                <http://test.example.com/t> , <http://test.example.com/r> , <http://test.example.com/e> .

Solution

  • AKSW's comment is spot on: you need to add an ordering criterion to the values, so that each set of values is matched in exactly one order instead of once for every possible ordering. Also, remember that RDF doesn't have "duplicate" triples, so

    :a :p :c, :c, :d
    

    is the same as

    :a :p :c, :d
    

    so the appropriate comparison is < rather than <=: an instance can never have the same value twice, so a solution in which two variables are equal only arises when both match the same triple, and it should be excluded. Also, since the values are IRIs, and < is not defined for IRIs, you need to compare their string values instead; the str function takes care of that.

    prefix v: <http://vocab.example.com/>
    prefix : <http://test.example.com/>
    
    select ?a ?b ?c (count(distinct ?i) as ?count) where {
      ?i v:relate ?a, ?b, ?c .
      filter (str(?a) < str(?b) && str(?b) < str(?c))
    }
    group by ?a ?b ?c
    
    ------------------------
    | a  | b  | c  | count |
    ========================
    | :a | :b | :c | 3     |
    | :e | :f | :g | 2     |
    | :e | :r | :t | 1     |
    | :k | :l | :m | 1     |
    ------------------------
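
To see why the strict ordering filter leaves exactly one solution per instance, here is a minimal Python sketch that imitates the query on the data from the question. It is an illustration only, not how a SPARQL engine evaluates the query: the `relate` dictionary, the `bindings` and `ordered` helpers, and the use of Python string comparison as a stand-in for `str(?a) < str(?b)` are all assumptions made for this example (for these ASCII IRIs the two orderings should agree).

```python
from collections import Counter
from itertools import product

BASE = "http://test.example.com/"

# Values of v:relate per instance, copied from the RDF in the question.
# Sets mirror RDF semantics: a repeated triple collapses into one.
relate = {
    "instance1": {"a", "b", "c"},
    "instance2": {"e", "f", "g"},
    "instance3": {"e", "r", "t"},
    "instance4": {"a", "b", "c"},
    "instance5": {"k", "l", "m"},
    "instance6": {"e", "f", "g"},
    "instance7": {"a", "b", "c"},
}


def bindings(values):
    """All (?a, ?b, ?c) solutions that '?i v:relate ?a, ?b, ?c' yields
    for one instance when no filter is applied (3 * 3 * 3 = 27 here)."""
    iris = [BASE + v for v in values]
    return list(product(iris, repeat=3))


def ordered(binding):
    """Stand-in for: filter (str(?a) < str(?b) && str(?b) < str(?c))."""
    a, b, c = binding
    return a < b < c


counts = Counter()
for instance, values in relate.items():
    kept = [b for b in bindings(values) if ordered(b)]
    # The strict comparison drops repeated values and all but one ordering.
    assert len(kept) == 1
    counts[kept[0]] += 1

# Print the combinations in the same shape as the result table above.
for combo, n in sorted(counts.items()):
    print(" ".join(iri[len(BASE):] for iri in combo), "|", n)
```

Without the filter, each instance contributes 27 solutions, including ones where two variables share a value; with it, each instance contributes one, so grouping by the three values counts instances rather than permutations.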