sparql rdf

Get a variable number of output columns in SPARQL


Is there a way to get a variable number of columns for a given predicate? Essentially, I want to turn this:

title note
A.    1
A.    2
A.    3
B.    4
B.    5

into

title note1 note2 note3
A.    1     2     3
B.    4     5     null

For example, can I set the number of columns to the maximum number of notes per title in the query, or something like that? Thanks.


Solution

  • There are several ways you can approach this. One way is to change your query. In the general case it is not possible to write a SELECT query that does exactly what you want, because the columns of a SELECT result are fixed by the variables you project. However, if you happen to know in advance the maximum number of notes per title, you can sort of do this.

    Supposing your original query was something like this:

    SELECT ?title ?note
    WHERE { ?title :hasNote ?note }
    

    And supposing you know titles have at most 3 notes, you could probably (untested) do something like this:

    SELECT ?title ?note1 ?note2 ?note3
    WHERE { 
            ?title :hasNote ?note1 .
            OPTIONAL { ?title :hasNote ?note2 . FILTER (?note2 != ?note1) }
            OPTIONAL { ?title :hasNote ?note3 . FILTER (?note3 != ?note1 && ?note3 != ?note2) }
    }
    

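    As you can see, this is not a very nice solution: it doesn't scale and is probably very inefficient to process. Also, because the filters only require the notes to differ, it returns one row for every ordering of a title's notes rather than a single row per title.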

    The alternatives are various forms of post-processing. To make post-processing simpler, you could use an aggregate operator to at least get all notes for a single title in a single row:

    SELECT ?title (GROUP_CONCAT(?note) as ?notes) 
    WHERE { ?title :hasNote ?note }
    GROUP BY ?title
    

    Result:

    title notes
    A.    "1 2 3"
    B.    "4 5"
    

    You could then post-process the values of the ?notes variable to split them into the separate notes again.
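As a rough sketch of that post-processing step, here it is in Python. It assumes the result rows have already been read into plain (title, notes) tuples; the function name `split_notes` and the sample rows are made up for illustration and are not part of any SPARQL library. GROUP_CONCAT separates values with a single space by default. If your notes can contain spaces, you would probably want to pick another separator in the query, e.g. `GROUP_CONCAT(?note; separator="|")`, and pass the same one here.

```python
def split_notes(rows, separator=" "):
    """Turn (title, concatenated_notes) rows into a padded wide table.

    `rows` is assumed to be an iterable of (title, notes) string pairs,
    e.g. ("A.", "1 2 3"), as produced by the GROUP_CONCAT query.
    """
    # Split each concatenated string back into its individual notes.
    split_rows = [(title, notes.split(separator)) for title, notes in rows]

    # The number of note columns is the largest note count of any title.
    width = max((len(notes) for _, notes in split_rows), default=0)

    header = ["title"] + [f"note{i}" for i in range(1, width + 1)]

    # Pad shorter rows with None so that every row has the same length.
    table = [
        [title] + notes + [None] * (width - len(notes))
        for title, notes in split_rows
    ]
    return header, table


# Hypothetical sample data mirroring the result shown above.
rows = [("A.", "1 2 3"), ("B.", "4 5")]
header, table = split_notes(rows)
```

Here `header` ends up with as many `noteN` columns as the longest row needs, and titles with fewer notes get `None` in the remaining cells, which plays the role of the `null` in the question.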

    Another solution is to use a CONSTRUCT query instead of a SELECT query. That gives you back an RDF graph rather than a table, and you can work directly with that in your code. Tables are kinda weird in an RDF world if you think about it: you're querying a graph model, so why is the query result not a graph but a table?

    CONSTRUCT
    WHERE { ?title :hasNote ?note }
    

    ...and then process the result in whatever API you're using to do the queries.
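What that processing looks like depends on your API, but the idea can be sketched in plain Python. This assumes the triples from the CONSTRUCT result have been pulled out as simple (subject, predicate, object) tuples; a real RDF library would hand you its own term objects instead. The `pivot_triples` name, the string form of the predicate, and the sample data are all hypothetical.

```python
from collections import defaultdict

HAS_NOTE = ":hasNote"  # the predicate used in the queries above, as a plain string


def pivot_triples(triples, predicate=HAS_NOTE):
    """Group note triples by title and lay them out as a padded wide table."""
    notes_by_title = defaultdict(list)
    for subject, pred, obj in triples:
        if pred == predicate:
            notes_by_title[subject].append(obj)

    # The widest title decides how many note columns the table gets.
    width = max((len(notes) for notes in notes_by_title.values()), default=0)
    header = ["title"] + [f"note{i}" for i in range(1, width + 1)]

    table = []
    for title in sorted(notes_by_title):
        # A graph has no inherent order, so sort the notes for stable output.
        notes = sorted(notes_by_title[title])
        table.append([title] + notes + [None] * (width - len(notes)))
    return header, table


# Hypothetical stand-in for the triples returned by the CONSTRUCT query.
triples = [
    ("A.", ":hasNote", 1),
    ("A.", ":hasNote", 2),
    ("A.", ":hasNote", 3),
    ("B.", ":hasNote", 4),
    ("B.", ":hasNote", 5),
]
header, table = pivot_triples(triples)
```

Because the column count is computed from the data, this handles any number of notes per title without changing the query.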