sparql rdf wikidata skos

Parsing returns from AltLabel in a SPARQL query


In a Wikidata SPARQL query such as the one below, I want to use a custom delimiter for the values returned in ?placeOfBirthAltLabel.

The problem is that some values under ?placeOfBirthAltLabel contain commas, e.g. the synonyms for "New York" include "New York, USA" as a single entry.

However, because the returned values are comma-delimited, this single entry gets parsed as two separate strings.

In other words, I need the result to be [New York, USA; NYC; NYC, USA] rather than [New York, USA, NYC, NYC, USA].

SELECT ?item ?itemLabel ?placeOfBirthLabel ?placeOfBirthAltLabel 
WHERE
{
  ?item wdt:P106 wd:Q10833314.
  OPTIONAL { ?item wdt:P19 ?placeOfBirth }

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

LIMIT 100

Thanks!


Solution

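  • You do not need to parse the alternative labels. It is the label service that concatenates them into a single comma-separated string, so just do not use the label service for alternative labels. Query skos:altLabel directly instead, which returns one row per alternative label: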

    SELECT ?item ?itemLabel ?placeLabel ?place_alt_label WHERE {
        ?item wdt:P106 wd:Q10833314.
        OPTIONAL {
            ?item wdt:P19 ?place .
            OPTIONAL {
                ?place skos:altLabel ?place_alt_label .
                FILTER (lang(?place_alt_label)='en')
            }
        }
        SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
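
If the per-label rows still need to be collected per item on the client side, a minimal Python sketch could look like the following. It assumes the results arrive in the standard SPARQL JSON results format (a list of bindings, each mapping a variable name to an object with a "value" key). The function name alt_labels_by_item and the example.org URIs are made up for illustration.

```python
def alt_labels_by_item(bindings):
    """Collect alternative labels per item from SPARQL JSON result bindings.

    Assumes each binding has an "item" entry and, optionally, a
    "place_alt_label" entry (it is missing when the OPTIONAL did not match).
    """
    grouped = {}
    for row in bindings:
        # Make sure every item appears in the output, even without labels.
        labels = grouped.setdefault(row["item"]["value"], [])
        alt = row.get("place_alt_label")
        if alt is not None:
            labels.append(alt["value"])
    return grouped


# Hypothetical rows, shaped like the "bindings" list of a JSON result.
rows = [
    {"item": {"type": "uri", "value": "http://example.org/entity/A"},
     "place_alt_label": {"type": "literal", "xml:lang": "en", "value": "New York, USA"}},
    {"item": {"type": "uri", "value": "http://example.org/entity/A"},
     "place_alt_label": {"type": "literal", "xml:lang": "en", "value": "NYC"}},
    {"item": {"type": "uri", "value": "http://example.org/entity/B"}},
]

labels = alt_labels_by_item(rows)
```

Because each label arrives in its own row, a comma inside a label such as "New York, USA" never needs to be distinguished from a delimiter.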
    


    If you still want a single delimited string to parse: the comma used by the label service is hardcoded, so use grouping and GROUP_CONCAT with a custom separator instead:

    SELECT ?item ?itemLabel ?placeLabel
        (GROUP_CONCAT(?place_alt_label; separator='; ') AS ?4) WHERE {
        ?item wdt:P106 wd:Q10833314.
        OPTIONAL {
            ?item wdt:P19 ?place .
            OPTIONAL {
                ?place skos:altLabel ?place_alt_label .
                FILTER (lang(?place_alt_label)='en')
            }
        }
        SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
    }
    GROUP BY ?item ?itemLabel ?placeLabel
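
On the client side, the concatenated value can then be split on the same separator. The following Python sketch assumes the '; ' separator used in the query above. The helper name split_alt_labels is hypothetical. Note that a label which itself contains "; " would still be split incorrectly, so it may be worth picking a separator that is unlikely to occur in labels.

```python
def split_alt_labels(joined, separator="; "):
    """Split a GROUP_CONCAT result back into individual labels.

    An empty string (no alternative labels were aggregated) yields an
    empty list rather than a list holding one empty string.
    """
    if not joined:
        return []
    return joined.split(separator)


# Sample value taken from the example in the question.
parts = split_alt_labels("New York, USA; NYC; NYC, USA")
```

Here parts keeps "New York, USA" and "NYC, USA" intact, since only "; " is treated as a delimiter.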
    



    Be careful with variables projected by the label service when grouping. For example,

    SELECT ?item ?itemLabel ?placeLabel   {...}
    GROUP BY ?item ?itemLabel ?placeLabel
    

    should work, whereas

    SELECT ?item ?itemLabel (SAMPLE(?placeLabel) AS ?3)   {...}
    GROUP BY ?item ?itemLabel
    

    shouldn't.