sparql rdf semantic-web owl ontology

SPARQL: check for the existence of a property and return zero as the answer


This is my minimum data:

@prefix : <http://example.org/rs#> .

:item :hasContext [:weight 0.1 ; :doNotRecommend true] , [:weight 0.2 ] .

:anotherItem :hasContext [:weight 0.4] , [ :weight 0.5 ] .

As you can see, each item has one or more :hasContext values. The object of each :hasContext is an instance that may have a :doNotRecommend predicate.

What I want is this: if any of these instances (the objects of :hasContext) has :doNotRecommend, the whole sum should be zero. **By sum I mean the sum of the weights.** In other words, if that property exists, ignore all the weights (whether they are there or not) and just return zero.

My query

select ?item (SUM(?finalWeight) as ?summedFinalWeight) {
 ?item :hasContext ?context .
  optional 
  {
    ?context :doNotRecommend true .
    bind( 0 as ?cutWeight) 
  }
  optional
  {
    ?context :weight ?weight .
  } 
  bind ( if(bound(?cutWeight), ?cutWeight , if(bound(?weight), ?weight, 0.1) ) as ?finalWeight )
}
group by ?item

The result


Look at the value for :item: it is 0.2. I know the reason (it is 0.2 plus zero, where the zero comes from the context that has :doNotRecommend), but I don't know the solution. What I want is zero for :item.

(Hint: I know that I can always wrap this query in an outer query and solve it there, or solve it using filter not exists, but I am looking to solve it in the same query. What I showed you is minimal data; in my ontology, getting the weights and these objects takes a very long query.)

Update 1

This is my real query. The first part (before the union) checks whether the user conforms to a context. The second part (after the union) checks whether the user does not conform to a context, and that is where I want to check whether the context has a do-not-recommend flag (rs:doNotRecommendInCaseNotMatch). Note that it is impossible for both parts to match at the same time.

SELECT  ?item (SUM(?finalWeightFinal) AS ?userContextWeight)
WHERE
  { VALUES ?user { bo:ania }
    ?item  rdf:type  rs:RecommendableClass
    OPTIONAL
      {   { FILTER EXISTS { ?item  rdf:type  ?itemClass }
            ?item     rdf:type           rs:RecommendableClass .
            ?userContext  rdf:type       rs:UserContext ;
                      rs:appliedOnItems  ?itemClass ;
                      rs:appliedOnUsers  ?userClass
            FILTER EXISTS { ?user  rdf:type  ?userClass }
            OPTIONAL
              { ?userContext  rs:hasWeightIfContextMatched  ?weight }
            BIND(if(bound(?weight), ?weight, 0.2) AS ?finalWeight)
          }
        UNION
          { ?item     rdf:type           rs:RecommendableClass .
            ?userContext  rdf:type       rs:UserContext ;
                      rs:appliedOnItems  ?itemClass ;
                      rs:appliedOnUsers  ?userClass
            FILTER EXISTS { ?item  rdf:type  ?itemClass }
            FILTER NOT EXISTS { ?user  rdf:type  ?userClass }
            OPTIONAL
                #Here is the skip
              { ?userContext  rs:doNotRecommendInCaseNotMatch  true
                BIND(0 AS ?skip)
              }
            OPTIONAL
              { ?userContext  rs:hasWeightIfContextDoesNotMatch  ?weight }
            BIND(if(bound(?weight), ?weight, 0.1) AS ?finalWeight)
          }
      }
    BIND(if(bound(?finalWeight), ?finalWeight, 1) AS ?finalWeightFinal)
  }
GROUP BY ?item

Update 2

Following @Joshua Taylor's much-appreciated answer, I tried to apply his approach to the real case, this time adding filter(!bound(?skip)).

Here is the query

SELECT  ?item ?itemClass ?userContext ?skip ?finalWeight 
WHERE
  { #-- {
    #--   in this block I just select the items that I want to calculate the user context for.
    #-- }
    OPTIONAL
      { FILTER EXISTS { ?item  rdf:type  ?itemClass }
        ?userContext  rdf:type       rs:UserContext ;
                  rs:appliedOnItems  ?itemClass ;
                  rs:appliedOnUsers  ?userClass
        OPTIONAL
          { ?userContext  rs:hasWeightIfContextMatched  ?weightMatched }
        OPTIONAL
          { ?userContext  rs:hasWeightIfContextDoesNotMatch  ?weightNotMatched }
        OPTIONAL
          { ?userContext  rs:doNotRecommendInCaseNotMatch  true
            BIND(1 AS ?skip)
          }
        BIND(if(EXISTS { ?user  rdf:type  ?userClass }, coalesce(?weightMatched, "default User Matched"), coalesce(?weightNotMatched, "default User not matched")) AS ?weight)
      }
    BIND(if(bound(?weight), ?weight, "no user context found for this item") AS ?finalWeight)
    FILTER ( ! bound(?skip) )
  }

It works with the data that I have, but I only have test data right now, so I want to ask whether it is correct.

Update 3

My query generates these fields:

item skip ...

and the filter removes the rows that have a binding for skip. But let's say that an item has several rows, like this:

item skip

A 1

A

A

In my case only the first row is removed. I need to know whether I can remove all of the rows for that item.


Solution

  • There are lots of ways to do this; here's one that gets each item's sum weight, and then checks whether the item has a do not recommend flag, and if it does, uses 0 as the total weight:

    select ?item (if(?skip, 0.0, ?sumWeight_) as ?sumWeight) {
      { select ?item (sum(?weight) as ?sumWeight_) where {
          ?item :hasContext/:weight ?weight .
        }
        group by ?item
      }
      #-- ?skip is always bound (to true or false), so test its value, not bound(?skip)
      bind(exists { ?item :hasContext/:doNotRecommend true } as ?skip)
    }
    
    ----------------------------
    | item         | sumWeight |
    ============================
    | :item        | 0.0       |
    | :anotherItem | 0.9       |
    ----------------------------
    

    Conceptually, this query checks once for each item whether any of its contexts mark it as non-recommendable. I think that's relatively efficient.

    On bind(exists { … } as ?skip)

    Note the combination of bind and exists. You already know how bind works, as you've used it plenty of times: bind(expr as ?variable) evaluates the expression expr and assigns the result to the variable ?variable. You've probably used exists (and not exists) in filter expressions before. exists { … } is true if the pattern inside the braces matches in the graph, and false otherwise. not exists { … } is similar, but reversed. The pattern

    ?item :hasContext/:doNotRecommend true
    

    is just shorthand, using a property path, for the pattern:

    ?item :hasContext ?something .
    ?something :doNotRecommend true .
    

    In this case, if that pattern exists, then we want to skip the item's sum weight and use zero instead.

    Alternative

    If you're willing to compute the sum for all the items, and then exclude those that have at least one non-recommendable context, you can do that, too. The trick is just to figure out how to count the number of skips:

    select ?item (sum(?weight_) as ?weight){
      ?item :hasContext ?context .
      ?context :weight ?weight_ .
      bind(exists { ?context :doNotRecommend true } as ?skip)
    }
    group by ?item
    having (sum(if(?skip,1,0)) = 0)
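
    If you would rather keep every item and report zero for the flagged ones (which is what the question asks for), the same grouping can do that in one pass. This is a minimal sketch, assuming the question's sample data and prefix; ?skipFlag is a name introduced here for illustration. Like the query above, it only sees contexts that have a :weight, so a flagged context without a weight would go unnoticed.

    ```sparql
    prefix : <http://example.org/rs#>

    #-- one pass: sum the weights and count the flagged contexts per item
    select ?item (if(sum(?skipFlag) > 0, 0.0, sum(?weight_)) as ?weight) {
      ?item :hasContext ?context .
      ?context :weight ?weight_ .
      #-- 1 if this context is flagged, 0 otherwise
      bind(if(exists { ?context :doNotRecommend true }, 1, 0) as ?skipFlag)
    }
    group by ?item
    ```

    With the sample data this should give 0.0 for :item and 0.9 for :anotherItem.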
    

    Considerations

    You mentioned that

    I know that I can always wrap this query in an outer query and solve it there, or solve it using filter not exists, but I am looking to solve it in the same query. What I showed you is minimal data; in my ontology, getting the weights and these objects takes a very long query.

    The solution above computes the sum weights first, and then decides which to use and which to discard. That means that there's some unnecessary computation. Your solution does something similar: it computes weights for contexts that don't have the :doNotRecommend property, even if some other context for the same item has a doNotRecommend property. If you really want to avoid the unnecessary computation, then you should figure out which items are recommendable first, and then compute scores for those, and figure out which items are not recommendable, and just return zero for those.

    It's easy to get a list of which items are which: something like

    select distinct ?item ?skip {
      ?item :hasContext ?anything .
      bind(exists { ?item :hasContext/:doNotRecommend true } as ?skip)
    }
    

    will do it just fine. However, you'd want to do different things with the skippable and the non-skippable items, and that would probably take the form of a union of the two alternatives. Then you have the problem that you'd have to repeat the same subquery in each branch. (Or use exists in one and not exists in the other, which is essentially repeating the same query.) It would get ugly pretty quickly. It might look something like this:

    select ?item ?weight {
      {
         #-- get non recommendable items and
         #-- set their weights to 0.0.
         select distinct ?item (0.0 as ?weight) {
           ?item :hasContext/:doNotRecommend true      #-- (*)
         }
      }
      union
      {
         #-- get recommendable items and
         #-- their aggregate weights
         select ?item (sum(?weight_) as ?weight) {
           #-- find the recommendable items
           { select distinct ?item {
               ?item :hasContext ?x .
               filter not exists { ?item :hasContext/:doNotRecommend true }   #-- (*)
             }
           }
           #-- and get their context's weights.
           ?item :hasContext/:weight ?weight_
         }
         group by ?item
      }
    }
    
    -------------------------
    | item         | weight |
    =========================
    | :item        | 0.0    |
    | :anotherItem | 0.9    |
    -------------------------
    

    The problem, in my opinion, is that the lines marked with (*) are really doing the same thing. The other computation doesn't happen multiple times, which is good, but we're still checking twice for each item whether it's recommendable or not.
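
    For the follow-up in Update 3 (dropping every row of an item as soon as one of its rows is flagged), the test has to be made per item rather than per row. The following is a minimal sketch against the question's sample data, not the real rs: query; ?other is a variable name introduced here for illustration.

    ```sparql
    prefix : <http://example.org/rs#>

    select ?item ?context ?weight {
      ?item :hasContext ?context .
      optional { ?context :weight ?weight }
      #-- item-level test: ?item is already bound here, while ?other
      #-- ranges over all of that item's contexts, not just ?context
      filter not exists {
        ?item :hasContext ?other .
        ?other :doNotRecommend true .
      }
    }
    ```

    Because the filter asks about any context of ?item, a single flagged context removes all of that item's rows. With the sample data only the two :anotherItem rows should remain.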