Tags: rdf, sparql, semantic-web, allegrograph

Strange SPARQL behavior using variables vs. IRI


I'm seeing strange behavior in AllegroGraph 4.13.

This is the data for the test case:

prefix : <http://example.com/example#> 

INSERT DATA {
:A rdfs:label "A" .
:A :hasProp :Prop1 .
:Prop1 :Key "1" .
:Prop1 :Value "AA" .

:B :hasProp :Prop2 .
:Prop2 :Key "1" .
:Prop2 :Value "AA" .

:C :hasProp :Prop3 .
:C :hasProp :Prop4 .
:Prop3 :Key "1" .
:Prop3 :Value "AA" .

:Prop4 :Key "2" .
:Prop4 :Value "BB" .
}

Given :A, I need to find the resources that have exactly the same properties. That is, I want to find :B but not :C, because :C has one additional property (Key "2", Value "BB").
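
As a sanity check that does not depend on any SPARQL engine, the expected answer can be worked out with a plain set comparison. The sketch below is only a model of the test data: the dictionary `props` and the helper `same_props` are hypothetical names introduced for illustration, and the triples are flattened by hand into (Key, Value) pairs per resource. Comparing the sets for equality corresponds to the requirement that neither resource has a (Key, Value) pair the other lacks.

```python
# Test data from the question, flattened by hand (an assumption of this
# sketch): resource -> set of (Key, Value) pairs reachable via :hasProp.
props = {
    "A": {("1", "AA")},
    "B": {("1", "AA")},
    "C": {("1", "AA"), ("2", "BB")},
}


def same_props(data, target):
    """Return the resources whose (Key, Value) set equals that of `target`."""
    wanted = data[target]
    # Set equality holds only if neither side has a pair the other lacks.
    return sorted(name for name, pairs in data.items() if pairs == wanted)


print(same_props(props, "A"))
```

Under this model, :A and :B qualify and :C does not, because :C carries the extra pair ("2", "BB").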

See also the related question "Find individuals in SPARQL based on other relations / Compare sets".

The following query, kindly provided by Joshua Taylor, uses the resource :A directly and does exactly what I want:

prefix : <http://example.com/example#> 

select ?other ?k ?v {
   :A    :hasProp [ :Key ?k ; :Value ?v ] .
   ?other :hasProp [ :Key ?k ; :Value ?v ] .
   filter not exists { 
     { :A :hasProp [ :Key ?kk ; :Value ?vv ] .
       filter not exists { ?other :hasProp [ :Key ?kk ; :Value ?vv ] .
       }
     }
     union
     {
      ?other :hasProp [ :Key ?kk ; :Value ?vv ] .
      filter not exists { :A :hasProp [ :Key ?kk ; :Value ?vv ] .
      }
   }
  }
 }

Answer:

 ----------------------
 | other | k   | v    |
 ----------------------
 | A     | "1" | "AA" |
 | B     | "1" | "AA" |
 ----------------------

The second query uses a variable ?a instead, because I first need to find :A according to some criterion (rdfs:label in this example).

Query using variable ?a:

 prefix : <http://example.com/example#> 

select ?other ?k ?v {
   ?a rdfs:label "A" .
   ?a    :hasProp [ :Key ?k ; :Value ?v ] .
   ?other :hasProp [ :Key ?k ; :Value ?v ] .
   filter not exists { 
     { ?a :hasProp [ :Key ?kk ; :Value ?vv ] .
       filter not exists { ?other :hasProp [ :Key ?kk ; :Value ?vv ] .
       }
     }
     union
     {
      ?other :hasProp [ :Key ?kk ; :Value ?vv ] .
      filter not exists { ?a :hasProp [ :Key ?kk ; :Value ?vv ] .
      }
    }
   }
 }

It returns:

 ----------------------
 | other | k   | v    |
 ----------------------
 | A     | "1" | "AA" |
 | B     | "1" | "AA" |
 | C     | "1" | "AA" |
 ----------------------

This query also returns :C, which I believe is wrong.

Can anybody explain this behavior, or verify this test case with other triple stores or SPARQL engines?


Additional Tests

As requested in the comments, I added the rdfs prefix and replaced the blank nodes with variables. This appears to have no effect:

prefix : <http://example.com/example#> 
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>

select ?a ?pr1 ?pr2 ?other ?k ?v {
  ?a rdfs:label "A" .
  # bind (:A as ?a) .
  ?a    :hasProp ?pr1 .
  ?pr1 :Key ?k ; :Value ?v .
  ?other :hasProp ?pr2 .
  ?pr2 :Key ?k ; :Value ?v .

  filter not exists {
    { ?a :hasProp ?pp1 .
      ?pp1 :Key ?kk ; :Value ?vv .
      filter not exists { ?other :hasProp ?pp2 .
                          ?pp2 :Key ?kk ; :Value ?vv .
      }
    }
    union
    {
      ?other :hasProp ?pp3 .
      ?pp3 :Key ?kk ; :Value ?vv .
      filter not exists { ?a :hasProp ?pp4 .
                          ?pp4 :Key ?kk ; :Value ?vv .
      }
    }
  }
}
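
Result:

 ------------------------------------------
 | a | pr1   | pr2   | other | k   | v    |
 ------------------------------------------
 | A | Prop1 | Prop1 | A     | "1" | "AA" |
 | A | Prop1 | Prop2 | B     | "1" | "AA" |
 | A | Prop1 | Prop3 | C     | "1" | "AA" |
 ------------------------------------------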

If I use the commented-out BIND instead of the rdfs:label line, the result is the same.


Solution

  • I think that you've found a bug in AllegroGraph. Adding the pattern ?a rdfs:label "A" should restrict ?a to :A, and that is the behavior we see with Jena:

    Jena:       VERSION: 2.11.0
    Jena:       BUILD_DATE: 2013-09-12T10:49:49+0100
    ARQ:        VERSION: 2.11.0
    ARQ:        BUILD_DATE: 2013-09-12T10:49:49+0100
    RIOT:       VERSION: 2.11.0
    RIOT:       BUILD_DATE: 2013-09-12T10:49:49+0100
    
    prefix : <http://example.com/example#> 
    prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    
    select ?other ?k ?v {
       ?a rdfs:label "A" .
       ?a    :hasProp [ :Key ?k ; :Value ?v ] .
       ?other :hasProp [ :Key ?k ; :Value ?v ] .
       filter not exists { 
         { ?a :hasProp [ :Key ?kk ; :Value ?vv ] .
           filter not exists { ?other :hasProp [ :Key ?kk ; :Value ?vv ] .
           }
         }
         union
         {
          ?other :hasProp [ :Key ?kk ; :Value ?vv ] .
          filter not exists { ?a :hasProp [ :Key ?kk ; :Value ?vv ] .
          }
       }
      }
     }
    
    ----------------------
    | other | k   | v    |
    ======================
    | :B    | "1" | "AA" |
    | :A    | "1" | "AA" |
    ----------------------
    

    It probably makes sense to reduce this to a minimal example that reproduces the behavior and to submit a bug report.