rdf sparql semantic-web

Getting the count of items in an rdf:List using SPARQL


Is this the correct/best SPARQL query to get the count of items in an rdf:List:

select (COUNT(?a) AS ?count) where {?a <http://www.w3.org/1999/02/22-rdf-syntax-ns#first> ?c}

Thanks in advance for your help.

regards, rahul


Solution

  • Interesting question: it seems simple enough, and yet the query is hard to express correctly. William Greenly's answer does not provide what you want, although his explanations are perfectly right and he rightly uses a property path. To write a query that properly answers your question, we must assume that all lists are well formed (every list node has exactly one rdf:first and one rdf:rest, and the list ends with rdf:nil).

    The problem with your query is that it counts all members of all lists in the dataset. You need something that restricts the rdf:first matches to the elements of a single list.
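
    To make the problem concrete, here is a small sketch of how two lists look once they are expanded into triples. The resources ex:pub1, ex:pub2, ex:alice, ex:bob and ex:carol and the blank node labels are made up for illustration; only ex:listOfAuthors is borrowed from the example further down.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex:  <http://www.example.org/#>
INSERT DATA {
  # Hypothetical data: ex:pub1 has the two-member list (ex:alice ex:bob)
  ex:pub1  ex:listOfAuthors  _:a1 .
  _:a1  rdf:first  ex:alice ;  rdf:rest  _:a2 .
  _:a2  rdf:first  ex:bob ;    rdf:rest  rdf:nil .

  # Hypothetical data: ex:pub2 has the one-member list (ex:carol)
  ex:pub2  ex:listOfAuthors  _:b1 .
  _:b1  rdf:first  ex:carol ;  rdf:rest  rdf:nil .
}
```

    On this data the query from the question matches all three rdf:first triples and returns 3, which is the size of neither list.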

    If you have a URI identifying the list you are interested in, you can do the following:

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  
    PREFIX ex:  <http://www.example.org/#>
    SELECT (COUNT(?member) AS ?count) 
    WHERE {
      ex:uriOfTheList  rdf:rest*/rdf:first  ?member
    }
    

    But often, lists are not identified by a URI. In that case, you may be able to single out a list through other properties. For instance, if you have an ex:listOfAuthors property, you can do:

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX ex:  <http://www.example.org/#>
    SELECT (COUNT(?member) AS ?count) 
    WHERE {
      ex:publication  ex:listOfAuthors  ?list .
      ?list  rdf:rest*/rdf:first  ?member .
    }
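
    If several publications each have an ex:listOfAuthors, the same pattern can report one count per publication by grouping. This is only a sketch: the variable name ?publication is my own, and it assumes every publication has exactly one author list (otherwise the sizes of its lists are added together).

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ex:  <http://www.example.org/#>
SELECT ?publication (COUNT(?member) AS ?count)
WHERE {
  # Find the head of each publication's author list
  ?publication  ex:listOfAuthors  ?list .
  # Walk zero or more rdf:rest links, then read the member at each node
  ?list  rdf:rest*/rdf:first  ?member .
}
GROUP BY ?publication
```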
    

    Note that if you simply do:

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    SELECT (COUNT(?member) AS ?count) 
    WHERE {
      ?list  rdf:rest*/rdf:first  ?member .
    }
    

    you'll add up the sizes of all lists and sublists, because every node of a list is itself the head of a shorter list: a three-member list alone contributes 3 + 2 + 1 = 6. Things get more complicated if the list has no URI, there is no known predicate pointing to it, and perhaps you want the count for every list, per list. There is one way that should work:

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    SELECT ?list (COUNT(?member) AS ?count)
    WHERE {
      ?thing  !rdf:rest  ?list .
      ?list  rdf:rest*/rdf:first  ?member .
    }
    GROUP BY ?list
    

    What this says is that we want to find something that connects to a list, but not through the predicate rdf:rest. In principle, only the start of a list is connected to some other entity by a predicate other than rdf:rest. Moreover, lists are normally connected to other entities in some way, as there would be little point in describing a list independently of anything else.
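
    A variant you could try, offered as a sketch rather than a tested recipe: instead of requiring an incoming non-rdf:rest link, define the start of a list as a node that has an rdf:first but is not the rdf:rest of any other node. This relies on SPARQL 1.1 FILTER NOT EXISTS; the variable names ?first and ?previous are my own. It also finds lists that nothing points to, and a list referenced from several places is still counted only once.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?list (COUNT(?member) AS ?count)
WHERE {
  # A list node: it has a first element (exactly one, if the list is well formed)
  ?list  rdf:first  ?first .
  # Keep only heads: no other node has this one as its rdf:rest
  FILTER NOT EXISTS { ?previous  rdf:rest  ?list }
  # Collect the members reachable from that head
  ?list  rdf:rest*/rdf:first  ?member .
}
GROUP BY ?list
```

    As with the other queries, this assumes well-formed lists. Empty lists (rdf:nil) have no rdf:first and therefore do not show up in the result.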