Search code examples
sparqlrelationshiprdf

SPARQL: How can I group multiple fields?


I'm trying to use the GROUP BY and GROUP_CONCAT functions to group my results into single rows for each entity but I cannot make it work.

This is what I have: a Project has a Name and can have multiple Topics and Areas.

When I do this query:

SELECT ?proj ?name ?topic ?area
WHERE {
    ?proj hasName ?name .
    ?proj hasTopic ?topic .
    ?proj hasArea ?area .
}

I get this table of bindings:

proj name topic area
uri My Project building Europe
uri My Project building Asia
uri My Project park Europe
uri My Project park Asia

But what I would want is:

proj name topic area
uri My Project building, park Europe, Asia

I've tried using:

SELECT ?proj ?name
(GROUP_CONCAT(?topic; separator=", ") AS ?topics)
(GROUP_CONCAT(?area; separator=", ") AS ?areas)
WHERE {
    ?proj hasName ?name .
    ?proj hasTopic ?topic .
    ?proj hasArea ?area .
}
GROUP BY ?proj ?name

But I get the topics doubled: building, building, park, park.

What am I missing? How should I approach this?

Thanks a lot in advance


Solution

  • You are on the right path, but you just need to use the DISTINCT keyword. The following works for me.

    SELECT ?proj ?name
    (GROUP_CONCAT(DISTINCT ?topic; separator=", ") AS ?topics)
    (GROUP_CONCAT(DISTINCT ?area; separator=", ") AS ?areas)
    WHERE {
        ?proj hasName ?name .
        ?proj hasTopic ?topic .
        ?proj hasArea ?area .
    }
    GROUP BY ?proj ?name