How can I use SPARQL to find instances where two properties do not share any of the same objects?


In my company's taxonomy, all concepts have a value for skos:prefLabel, and most have a bunch of values for a custom property -- let's call it ex:keyword -- whose values have the datatype rdf:langString. I want to find concepts where the value of skos:prefLabel does NOT exactly match ANY value of ex:keyword, when case and language are ignored.

It is fairly straightforward to write a query for concepts where the values DO match, like this:

SELECT *
WHERE {
  ?concept skos:prefLabel ?label ;
           ex:keyword ?kw .
  FILTER (lcase(str(?label)) = lcase(str(?kw)))
}

Where I'm getting tripped up is in trying to negate this.

Using != in the FILTER would just return every ?label/?kw pair that doesn't match, even for concepts that also have a matching keyword, which is not what I'm after.

What I would like is to be able to use a FILTER NOT EXISTS, but that's invalid with an expression like (?a = ?b); it only works with something like {?a ?b ?c}.

I suspect that there is a proper way to express FILTER NOT EXISTS (?a = ?b) in SPARQL, but I don't know what it is. Can someone help?


Solution

  • The trick is to put the triple pattern for matching the keyword in the FILTER NOT EXISTS, like so:

    SELECT *
    WHERE {
      ?concept skos:prefLabel ?label .
      FILTER NOT EXISTS {
        ?concept ex:keyword ?kw .
        FILTER (lcase(str(?label)) = lcase(str(?kw)))
      }
    }

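    This query says: "I want all concepts with a prefLabel such that the concept has no keyword value that matches that prefLabel." It works because ?label, bound by the outer pattern, is substituted into the NOT EXISTS group, so the inner FILTER is tested against each of the concept's keywords.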
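
One consequence worth noting: a concept with no ex:keyword values at all also satisfies the NOT EXISTS, so it shows up in the results. If that is not wanted, a variant along the following lines might help. This is only a sketch: the ex: namespace IRI is a made-up placeholder, and ?anyKw is just an illustrative variable name.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
# Hypothetical namespace; replace with the real IRI of the custom property.
PREFIX ex:   <http://example.org/ns#>

SELECT ?concept ?label
WHERE {
  ?concept skos:prefLabel ?label .

  # Assumption: concepts without any keyword should be skipped.
  FILTER EXISTS { ?concept ex:keyword ?anyKw }

  # Keep the label only if no keyword equals it,
  # ignoring case and language tag.
  FILTER NOT EXISTS {
    ?concept ex:keyword ?kw .
    FILTER (lcase(str(?label)) = lcase(str(?kw)))
  }
}
```

Using FILTER EXISTS rather than an extra triple pattern keeps the result at one row per concept and label, instead of one row per keyword. If a concept has several prefLabels (for example, one per language), each label is checked separately, so the concept is listed once for every label that has no matching keyword.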