Tags: sparql, semantic-web, dbpedia

How to preserve the language tag in group_concat in SPARQL


I am trying to find the labels for each URI in as many languages as possible. As an example, I run this query:

select ?film (group_concat(?filmLabel; separator="|") as ?labels) where {
  ?film a dbpedia-owl:Film ; rdfs:label ?filmLabel .
  FILTER (regex(str(?film), 'Titanic'))
}
group by ?film

The result looks like this Titanic

I get the labels in different languages, but group_concat strips the language tag, so I can't tell which label belongs to which language. I could use OPTIONAL for each language, but that would get very messy. How can I aggregate the results into this format: "Titanic_(1953_film)"@en|"Der Untergang der Titanic"@de|"Titanic (película de 1953)"@es so that I can recover each label's language when I post-process them?


Solution

  • Instead of applying group_concat to the film label alone, you can apply it to an expression built for each row. In particular, if you group_concat on concat('"',?filmLabel,'"@',lang(?filmLabel)), you'll get results in the format you wanted:

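    prefix dbpedia-owl: <http://dbpedia.org/ontology/>
    prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>

    select ?film
           # rebuild "label"@lang for each row, then join the rows with "|"
           (group_concat( concat('"',?filmLabel,'"@',lang(?filmLabel)); separator="|" ) as ?label)
    where {
      ?film a dbpedia-owl:Film ;
            rdfs:label ?filmLabel .
      # str() turns the IRI into a string, which regex() requires
      FILTER (regex(str(?film),'Titanic'))
    }
    group by ?film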
    

    SPARQL results

    film:  http://dbpedia.org/resource/Titanic_(1953_film)
    label: "泰坦尼克号 (1953年电影)"@zh|"Titanic (1953 film)"@en|"Der Untergang der Titanic"@de|"Titanic (película de 1953)"@es|"Titanic (film, 1953)"@fr|"Titanic (film 1953)"@it|"タイタニックの最期"@ja|"Titanic (1953)"@nl|"Titanic (film 1953)"@pl|"Titanic (1953)"@pt|"Титаник (фильм, 1953)"@ru
    
    film:  http://dbpedia.org/resource/Titanic:_Blood_and_Steel
    label: "Titanic – Blood and Steel"@de|"Titanic: Blood and Steel"@en|"Titanic: sangre y acero"@es|"Titanic : de sang et d'acier (série télévisée, 2012)"@fr|"Titanic - Nascita di una leggenda"@it|"Титаник: Кровь и сталь"@ru
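
For the post-processing step mentioned in the question, a minimal sketch in Python could turn the aggregated string back into (label, language) pairs. Python is an assumption here, since the thread does not name a language, and both the function name parse_labels and the regular expression are illustrative rather than part of any SPARQL library. A regular expression is used instead of a plain split on "|" because a label may itself contain the separator character.

```python
import re

# One item looks like "label"@lang. The lookahead requires the item to be
# followed by the "|" separator or by the end of the string, so a "|" inside
# a label does not end the item early.
ITEM = re.compile(r'"(.*?)"@([A-Za-z0-9-]*)(?=\||$)', re.DOTALL)


def parse_labels(concatenated):
    """Split a group_concat result into a list of (label, lang) tuples.

    Hypothetical helper: it assumes the string was built with
    concat('"', ?label, '"@', lang(?label)) and separator "|".
    Labels without a language tag come back with an empty lang.
    """
    return ITEM.findall(concatenated)


if __name__ == "__main__":
    row = '"Titanic (1953 film)"@en|"Der Untergang der Titanic"@de'
    for label, lang in parse_labels(row):
        print(lang, label)
```

A list of pairs is returned rather than a dictionary keyed by language, so that nothing is lost if a resource has more than one label in the same language.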