SPARQL Multi-Valued properties - Rendering Results


I am new to SPARQL and to graph database querying in general, so please excuse any ignorance. I am trying to write a basic output using some data stored within Fuseki, and I am struggling to understand the best practice for handling the duplication of rows caused by the cardinality that exists between the various concepts.

I will use a simple example to hopefully demonstrate my point.

Data Set

This is a representative sample of the types of data and relationships I am currently working with;

[Image: Data Set]

Based on this structure I have produced the following triples (N-Triples format);

<http://www.test.com/ontologies/Author/JohnGrisham> <http://www.test.com/ontologies/property#firstName> "John" .
<http://www.test.com/ontologies/Author/JohnGrisham> <http://www.test.com/ontologies/property#lastName> "Grisham" .
<http://www.test.com/ontologies/Author/JohnGrisham> <http://www.test.com/ontologies/property#hasWritten> <http://www.test.com/ontologies/Book/TheClient> .
<http://www.test.com/ontologies/Author/JohnGrisham> <http://www.test.com/ontologies/property#hasWritten> <http://www.test.com/ontologies/Book/TheFirm> .

<http://www.test.com/ontologies/Book/TheFirm> <http://www.test.com/ontologies/property#name> "The Firm" .
<http://www.test.com/ontologies/Book/TheFirm> <http://www.test.com/ontologies/property#soldBy> <http://www.test.com/ontologies/Retailer/Foyles> .
<http://www.test.com/ontologies/Book/TheFirm> <http://www.test.com/ontologies/property#soldBy> <http://www.test.com/ontologies/Retailer/Waterstones> .

<http://www.test.com/ontologies/Book/TheClient> <http://www.test.com/ontologies/property#name> "The Client" .
<http://www.test.com/ontologies/Book/TheClient> <http://www.test.com/ontologies/property#soldBy> <http://www.test.com/ontologies/Retailer/Amazon> .
<http://www.test.com/ontologies/Book/TheClient> <http://www.test.com/ontologies/property#soldBy> <http://www.test.com/ontologies/Retailer/Waterstones> .

<http://www.test.com/ontologies/Retailer/Amazon> <http://www.test.com/ontologies/property#name> "Amazon" .
<http://www.test.com/ontologies/Retailer/Waterstones> <http://www.test.com/ontologies/property#name> "Waterstones" .
<http://www.test.com/ontologies/Retailer/Foyles> <http://www.test.com/ontologies/property#name> "Foyles" .

Render Output Format

Now what I am trying to do is render a page where all authors are displayed, showing details of all their books and the retailers through which each book is sold. So something like this (pseudo code);

for-each:Author

   <h1>Author.firstName + Author.lastName</h1>

   for-each:Author.Book

     <h2>Book.Name</h2>

     Sold By:
     for-each:Book.Retailer

         <h2>Retailer.name</h2>

SPARQL

For the rendering to work, my thinking was that I would need the author's first name and last name, then the names of all their books and the names of the retailers those books are sold through. I therefore came up with the following SPARQL;

PREFIX p: <http://www.test.com/ontologies/property#>

SELECT ?authorfirstname
       ?authorlastname
       ?bookname
       ?retailername
WHERE {
    ?author p:firstName ?authorfirstname ;
            p:lastName ?authorlastname ;
            p:hasWritten ?book .
    OPTIONAL {
        ?book p:name ?bookname ;
              p:soldBy ?retailer .
        ?retailer p:name ?retailername .
    }
}

This provides the following results;

[Image: Results Triple Table]

Unfortunately, due to the duplication of rows, my basic rendering attempt cannot produce the expected output; in fact it renders a new "Author" section for every row returned from the query.

I guess what I'm trying to understand is how this type of rendering should be done.

  • Is it the renderer that is supposed to regroup the data back into the graph form it wants to traverse? (I honestly cannot see how this can be the case.)

  • Is the SPARQL invalid - is there a way to do what I want in the SPARQL language itself?

  • Am I just doing something completely wrong?

AMENDMENT - More Detailed Analysis on GROUP_CONCAT

When reviewing the options available to me I came across GROUP_CONCAT, but after a bit of playing with it I decided it probably wasn't going to give me what I wanted and probably wasn't the best route. The reasons for this are;

Data Size

Whilst the data set I am running my examples over in this post is small, spanning only 3 concepts and a very restricted set of data, the actual concepts and data I am running against in the real world are far larger. Concatenating results there will produce extremely long delimited strings, especially for free-format columns such as descriptions.

Loss of context

Whilst trying out GROUP_CONCAT I quickly realised that I could no longer tell how the various data elements across the GROUP_CONCAT columns related to each other. I can show that by using the book example above.

SPARQL

PREFIX p: <http://www.test.com/ontologies/property#>

select ?authorfirstname
       ?authorLastName
       (group_concat(distinct ?bookname; separator = ";") as ?booknames)
       (group_concat(distinct ?retailername; separator = ";") as ?retailernames)
where {
  ?author p:firstName ?authorfirstname ;
          p:lastName ?authorLastName ;
          p:hasWritten ?book .
  OPTIONAL {
    ?book p:name ?bookname ;
          p:soldBy ?retailer .
    ?retailer p:name ?retailername .
  }
}
group by ?authorfirstname ?authorLastName

This produced the following output;

firstname = "John"
lastname  = "Grisham"
booknames = "The Client;The Firm"
retailernames = "Amazon;Waterstones;Foyles"

As you can see this has produced one result row but you can no longer work out how the various data elements relate. Which Retailers are for which Book?
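
To make the loss of context concrete, here is a small JavaScript sketch that splits the concatenated columns of the row shown above. The object shape and property names are assumptions made for illustration (they mirror the output listing, not a specific Fuseki response format).

```javascript
// One result row, with values copied from the GROUP_CONCAT output above.
// The property names are hypothetical and chosen to match that listing.
const row = {
  firstname: "John",
  lastname: "Grisham",
  booknames: "The Client;The Firm",
  retailernames: "Amazon;Waterstones;Foyles",
};

// Splitting on the separator recovers two independent lists...
const books = row.booknames.split(";");
const retailers = row.retailernames.split(";");

// ...but nothing in the row links a retailer to a book. The lists do not
// even have the same length, so pairing them by position is not possible.
console.log(books);
console.log(retailers);
console.log(books.length === retailers.length); // false
```

Each list is individually correct, yet the book-to-retailer relationship that existed in the graph is gone from the result.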

Any help/guidance would be greatly appreciated.

Current Solution

Based on the recommended solution below I have used the concept of keys to bring the various data sets together; however, I have tweaked it slightly so that I am using a query per concept (e.g. author, book and retailer) and then using the keys to bring the results together in my renderer.

Author Results

     firstname  lastname  books
---------------------------------------------------------------------------
1    John       Grisham   ontologies/Book/TheClient|ontologies/Book/TheFirm

Book Results

     id                         name        retailers
--------------------------------------------------------------------------------------------------
1    ontologies/Book/TheClient  The Client  ontologies/Retailer/Waterstones|ontologies/Retailer/Amazon
2    ontologies/Book/TheFirm    The Firm    ontologies/Retailer/Waterstones|ontologies/Retailer/Foyles

Retailer Results

     id                               name
-------------------------------------------------
1    ontologies/Retailer/Amazon       Amazon
2    ontologies/Retailer/Waterstones  Waterstones
3    ontologies/Retailer/Foyles       Foyles

What I then do in my renderer is use the IDs to pull results from the various result sets...

 for-each author a : authors
     output(a.firstname)
     for-each book b : a.books.split("|")
         book = books.get(b) // look up the book result by its id (the id acts as a foreign key)
         output(book.name)
         for-each retailer r : book.retailers.split("|")
             retailer = retailers.get(r)
             output(retailer.name)

So effectively you are stitching together what you want from the various different result sets and presenting it.

This seems to be working OK for the moment.
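
The stitching described above could be sketched in JavaScript roughly as follows. This is a minimal illustration under assumptions: the three result sets are hard-coded in the shape of the tables above, the `render` and `splitKeys` helpers are hypothetical names, and the heading levels are arbitrary. Real code would also need to HTML-escape the values.

```javascript
// Hypothetical result sets, shaped like the author, book and retailer tables.
const authors = [
  { firstname: "John", lastname: "Grisham",
    books: "ontologies/Book/TheClient|ontologies/Book/TheFirm" },
];

// Books and retailers are indexed by id so each key lookup is a Map.get().
const books = new Map([
  ["ontologies/Book/TheClient", { name: "The Client",
    retailers: "ontologies/Retailer/Waterstones|ontologies/Retailer/Amazon" }],
  ["ontologies/Book/TheFirm", { name: "The Firm",
    retailers: "ontologies/Retailer/Waterstones|ontologies/Retailer/Foyles" }],
]);
const retailers = new Map([
  ["ontologies/Retailer/Amazon", { name: "Amazon" }],
  ["ontologies/Retailer/Waterstones", { name: "Waterstones" }],
  ["ontologies/Retailer/Foyles", { name: "Foyles" }],
]);

// Split a "|"-delimited key list; an empty or missing value yields no keys.
const splitKeys = (value) => (value ? value.split("|") : []);

function render(authors, books, retailers) {
  const lines = [];
  for (const a of authors) {
    lines.push(`<h1>${a.firstname} ${a.lastname}</h1>`);
    for (const bookId of splitKeys(a.books)) {
      const book = books.get(bookId);
      if (!book) continue; // skip keys with no matching book row
      lines.push(`  <h2>${book.name}</h2>`);
      for (const retailerId of splitKeys(book.retailers)) {
        const retailer = retailers.get(retailerId);
        if (retailer) lines.push(`    <h3>${retailer.name}</h3>`);
      }
    }
  }
  return lines.join("\n");
}

const html = render(authors, books, retailers);
console.log(html);
```

Keys are compared exactly, so the ids in one result set must match the ids in the others character for character, including case. The approach also assumes that "|" never occurs inside an id.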


Solution

  • I find it easier to construct objects out of the SPARQL results in code rather than trying to form a query that returns only a single row per the relevant resource.

    I would use the URI of the resources to identify which rows belong to which resource (author in this case), and then merge the result rows based on said URI.

    For JS applications I use the code here to construct objects out of SPARQL results.

    For complex values I use __ in the variable name to denote that an object should be constructed from the value. For example all values with variables prefixed with ?book__ would be turned into an object with the remainder of the variable's name as the name of the object's attribute, each object identified by ?book__id. So having values for ?book__id and ?book__name would result in an attribute book for the author, such that author.book = { id: '<book-uri>', name: 'book name'} (or a list of such objects if there are multiple books).

    For example in this case I would use the following query:

    PREFIX p: <http://www.test.com/ontologies/property#>
    
    SELECT ?id ?firstName ?lastName ?book__id ?book__name
           ?book__retailer
    WHERE {
        ?id p:firstName ?firstName ;
            p:lastName ?lastName ;
            p:hasWritten ?book__id .
        OPTIONAL {
            ?book__id p:name ?book__name ;
                      p:soldBy/p:name ?book__retailer .
        }
    }
    

    And in the application code I would construct Author objects that look like this (JavaScript notation):

    [{
        id: '<http://www.test.com/ontologies/Author/JohnGrisham>',
        firstName: 'John',
        lastName: 'Grisham',
        book: [
            {
                id: '<http://www.test.com/ontologies/Book/TheFirm>',
                name: 'The Firm',
                retailer: ['Foyles', 'Waterstones']
            },
            {
                id: '<http://www.test.com/ontologies/Book/TheClient>',
                name: 'The Client',
                retailer: ['Amazon', 'Waterstones']
            }
        ]
    }]
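
A minimal sketch of how such a merge might look is given below. It is not the code linked above; `buildObjects` and `mergeValue` are hypothetical names. It assumes the result rows have already been flattened into plain objects keyed by variable name, that the top-level resource is identified by `id`, and that only one level of `__` nesting is used. The sample rows are written by hand from the triples in the question.

```javascript
// Store a value under key: the first value stays a scalar, further distinct
// values turn the attribute into a list without duplicates.
function mergeValue(target, key, value) {
  const current = target[key];
  if (current === undefined) {
    target[key] = value;
  } else if (Array.isArray(current)) {
    if (!current.includes(value)) current.push(value);
  } else if (current !== value) {
    target[key] = [current, value];
  }
}

// rows: flat objects keyed by SPARQL variable name (without the "?").
function buildObjects(rows) {
  const byId = new Map(); // top-level resources keyed by their URI
  for (const row of rows) {
    if (!byId.has(row.id)) byId.set(row.id, { id: row.id });
    const resource = byId.get(row.id);

    const nested = {}; // per row, e.g. { book: { id, name, retailer } }
    for (const [name, value] of Object.entries(row)) {
      if (name === "id" || value === undefined) continue;
      const sep = name.indexOf("__");
      if (sep === -1) {
        mergeValue(resource, name, value); // plain attribute of the resource
      } else {
        const prefix = name.slice(0, sep);
        if (!nested[prefix]) nested[prefix] = {};
        nested[prefix][name.slice(sep + 2)] = value;
      }
    }

    for (const [prefix, attrs] of Object.entries(nested)) {
      if (attrs.id === undefined) continue; // no identifier for the nested object
      if (!resource[prefix]) resource[prefix] = [];
      let child = resource[prefix].find((c) => c.id === attrs.id);
      if (!child) {
        child = { id: attrs.id };
        resource[prefix].push(child);
      }
      for (const [attr, value] of Object.entries(attrs)) {
        if (attr !== "id") mergeValue(child, attr, value);
      }
    }
  }
  return [...byId.values()];
}

// Hand-written sample rows, one per (book, retailer) combination.
const base = { id: "http://www.test.com/ontologies/Author/JohnGrisham",
               firstName: "John", lastName: "Grisham" };
const firm = { book__id: "http://www.test.com/ontologies/Book/TheFirm", book__name: "The Firm" };
const client = { book__id: "http://www.test.com/ontologies/Book/TheClient", book__name: "The Client" };
const rows = [
  { ...base, ...firm, book__retailer: "Foyles" },
  { ...base, ...firm, book__retailer: "Waterstones" },
  { ...base, ...client, book__retailer: "Amazon" },
  { ...base, ...client, book__retailer: "Waterstones" },
];

const authors = buildObjects(rows);
console.log(JSON.stringify(authors, null, 2));
```

In this sketch nested objects are always collected into a list, while plain attributes become a list only once a second distinct value appears. A book with a single retailer would therefore carry a string rather than a one-element list, which the renderer has to allow for.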