
Filter for property path in SPARQL 1.1


Is there any way to filter a query like

select ?x ?y  where {?x <http://relationship.com/wasRevisionOf>+ ?y }

which produces the following output for the dataset given further down:

http://article.com/2-3 http://article.com/2-2
http://article.com/2-3 http://article.com/2-1
http://article.com/2-4 http://article.com/2-3
http://article.com/2-4 http://article.com/2-2
http://article.com/2-4 http://article.com/2-1
http://article.com/2-2 http://article.com/2-1
http://article.com/1-3 http://article.com/1-2
http://article.com/1-3 http://article.com/1-1
http://article.com/1-2 http://article.com/1-1

How can we filter the query so that it removes every result whose ?x value equals the ?y value of another result? With that filter, we would get

http://article.com/2-4 http://article.com/2-3
http://article.com/2-4 http://article.com/2-2
http://article.com/2-4 http://article.com/2-1

since the ?x value of the other results occurs as a ?y value in another result.

Here is the dataset:

<http://article.com/1-3> <http://relationship.com/wasGeneratedBy> <http://edit.com/comment1-2> .
<http://article.com/1-3> <http://relationship.com/wasRevisionOf> <http://article.com/1-2> .
<http://edit.com/comment1-2> <http://relationship.com/used> <http://article.com/1-2> .
<http://edit.com/comment1-2> <http://relationship.com/wasAssociatedWith> <http://editor.com/user1-1> .

<http://article.com/1-2> <http://relationship.com/wasGeneratedBy> <http://edit.com/comment1-1> .
<http://article.com/1-2> <http://relationship.com/wasRevisionOf> <http://article.com/1-1> .
<http://edit.com/comment1-1> <http://relationship.com/used> <http://article.com/1-1> .
<http://edit.com/comment1-1> <http://relationship.com/wasAssociatedWith> <http://editor.com/user1-1> .

<http://article.com/2-4> <http://relationship.com/wasGeneratedBy> <http://edit.com/comment2-3> .
<http://article.com/2-4> <http://relationship.com/wasRevisionOf> <http://article.com/2-3> .
<http://edit.com/comment2-3> <http://relationship.com/used> <http://article.com/2-3> .
<http://edit.com/comment2-3> <http://relationship.com/wasAssociatedWith> <http://editor.com/user2-3> .

<http://article.com/2-3> <http://relationship.com/wasGeneratedBy> <http://edit.com/comment2-2> .
<http://article.com/2-3> <http://relationship.com/wasRevisionOf> <http://article.com/2-2> .
<http://edit.com/comment2-2> <http://relationship.com/used> <http://article.com/2-2> .
<http://edit.com/comment2-2> <http://relationship.com/wasAssociatedWith> <http://editor.com/user2-2> .

<http://article.com/2-2> <http://relationship.com/wasGeneratedBy> <http://edit.com/comment2-1> .
<http://article.com/2-2> <http://relationship.com/wasRevisionOf> <http://article.com/2-1> .
<http://edit.com/comment2-1> <http://relationship.com/used> <http://article.com/2-1> .
<http://edit.com/comment2-1> <http://relationship.com/wasAssociatedWith> <http://editor.com/user2-1> .

Solution

  • select ?x ?y  where {?x <http://relationship.com/wasRevisionOf>+ ?y }
    

    How can we filter the query so that it removes every result whose ?x value equals the ?y value of another result?

    If the ?x value in one row is the ?y value in another, it means that ?x is the object of some triple with the wasRevisionOf property. You can simply filter those out:

    select ?x ?y  where {
      ?x <http://relationship.com/wasRevisionOf>+ ?y
      filter not exists {
        ?something <http://relationship.com/wasRevisionOf> ?x
      }
    }
    

    This ensures that each value of ?x is "the beginning" of a chain.
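
To see what the filter keeps, here is a minimal Python sketch that mimics the query on the wasRevisionOf triples from the dataset. It is only an illustration, not how a SPARQL engine evaluates the query. The names `REVISION_OF`, `revision_chain`, `chain_heads` and `filtered_results` are made up for this example, the article IRIs are abbreviated to their last path segment, and the sketch assumes each article has at most one direct predecessor and that the data contains no cycles.

```python
# Direct wasRevisionOf edges taken from the dataset, written as
# subject -> object. "2-4" stands for <http://article.com/2-4>, and so on.
# Assumption: each article has at most one direct predecessor.
REVISION_OF = {
    "1-3": "1-2",
    "1-2": "1-1",
    "2-4": "2-3",
    "2-3": "2-2",
    "2-2": "2-1",
}


def revision_chain(start, edges):
    """Yield every article reachable from `start` over one or more edges.

    This plays the role of the `+` property path. It assumes the data
    has no cycles; a cycle would make this loop run forever.
    """
    current = edges.get(start)
    while current is not None:
        yield current
        current = edges.get(current)


def chain_heads(edges):
    """Return the subjects that never occur as the object of an edge.

    This plays the role of the FILTER NOT EXISTS clause.
    """
    objects = set(edges.values())
    return sorted(s for s in edges if s not in objects)


def filtered_results(edges):
    """Pair every chain head with each article further down its chain."""
    return [(x, y) for x in chain_heads(edges) for y in revision_chain(x, edges)]


if __name__ == "__main__":
    for x, y in filtered_results(REVISION_OF):
        print(f"http://article.com/{x} http://article.com/{y}")
```

On this data the sketch keeps the three rows for 2-4 and also the two rows for 1-3, because 1-3 never occurs as the object of a wasRevisionOf triple either. If only the 2-4 rows are wanted, some additional condition on ?x would be needed.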