sparql dbpedia virtuoso

SPARQL with extended characters


I am trying to query DBpedia using SPARQL and am running into a problem with extended characters such as accented letters.

For example, this query for Albrecht_Dürer:

select ?abstract ?thumbnail where { 
        dbpedia:Albrecht_D%C3%BCrer dbpedia-owl:abstract ?abstract ;
                                 dbpedia-owl:thumbnail ?thumbnail .
        filter(langMatches(lang(?abstract),"en"))}

I read a post here suggesting the \u escape for Unicode, but that didn't work either, e.g. dbpedia:Albrecht_D\uC3BCrer
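As a side note on the two spellings tried above, here is a minimal Python sketch (standard library only) of how they relate. The bytes C3 BC in `%C3%BC` are the UTF-8 encoding of "ü", whereas a `\u` escape names the Unicode code point, which is U+00FC for "ü". Whether a given endpoint accepts either form inside a prefixed name is a separate question that this sketch does not settle.

```python
import unicodedata
from urllib.parse import quote

name = "Albrecht_Dürer"

# Percent-encoding is applied to the UTF-8 bytes of each non-ASCII character.
utf8_bytes = "ü".encode("utf-8")
percent_encoded = quote(name)

# A \u escape names the Unicode code point instead of the UTF-8 bytes.
code_point = ord("ü")

# Reading the UTF-8 bytes C3 BC as one code point yields a different character.
misread = chr(0xC3BC)

print(utf8_bytes)                  # b'\xc3\xbc'
print(percent_encoded)             # Albrecht_D%C3%BCrer
print(f"U+{code_point:04X}")       # U+00FC
print(unicodedata.name(misread))   # the name of a Hangul syllable, not "ü"
```

So `\uC3BC` does not denote "ü"; the escape that matches the percent-encoded form is `\u00FC`.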

In every case I get an error from the SPARQL compiler when I try it in Virtuoso. What am I doing wrong?

UPDATE:

I have managed to put the URL in as follows:

SELECT * WHERE
{
  <http://dbpedia.org/resource/Albrecht_Dürer> dbpedia-owl:abstract ?abstract ;
                                     dbpedia-owl:thumbnail ?thumbnail .
            filter(langMatches(lang(?abstract),"en"))
}

It now "succeeds" but doesn't return any results, even though I can see that Albrecht_Dürer has an abstract and a thumbnail when I look at the page specified.

The syntax is ok because I get results from this:

SELECT * WHERE
{
  <http://dbpedia.org/resource/Elvis_Presley> dbpedia-owl:abstract ?abstract ;
                                     dbpedia-owl:thumbnail ?thumbnail .
            filter(langMatches(lang(?abstract),"en"))
}

Anyone know why?


Solution

  • I believe you cannot put Unicode characters directly in the URI. What you could do instead is match the resource by its label with a regular-expression filter. Here is what I came up with:

    select distinct ?abstract ?thumbnail where
    {
      ?uri rdfs:label ?label .
      ?uri dbpedia-owl:abstract ?abstract ;
           dbpedia-owl:thumbnail ?thumbnail .
      FILTER (REGEX(STR(?label), "^Albrecht D\u00fcrer")) .
      FILTER (langMatches(lang(?abstract), "en")) .
    }
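The label-based query above can also be generated for other names. The following is a rough Python sketch, not part of the original answer: `sparql_unicode_escape` and `build_label_query` are hypothetical helper names, and the sketch assumes that the label contains no quotes, backslashes, or regex metacharacters and that the `rdfs:` and `dbpedia-owl:` prefixes are predefined by the endpoint, as in the queries above. It only builds the query text and does not contact any endpoint.

```python
def sparql_unicode_escape(text: str) -> str:
    """Replace each non-ASCII character with a \\uXXXX or \\UXXXXXXXX escape."""
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            parts.append(ch)              # plain ASCII stays as-is
        elif cp <= 0xFFFF:
            parts.append(f"\\u{cp:04x}")  # 4-digit form, e.g. \u00fc
        else:
            parts.append(f"\\U{cp:08x}")  # 8-digit form for higher code points
    return "".join(parts)


def build_label_query(label: str) -> str:
    """Build the regex-on-label query for one label (hypothetical helper)."""
    pattern = "^" + sparql_unicode_escape(label)
    return (
        "select distinct ?abstract ?thumbnail where {\n"
        "  ?uri rdfs:label ?label .\n"
        "  ?uri dbpedia-owl:abstract ?abstract ;\n"
        "       dbpedia-owl:thumbnail ?thumbnail .\n"
        f'  FILTER (REGEX(STR(?label), "{pattern}")) .\n'
        '  FILTER (langMatches(lang(?abstract), "en")) .\n'
        "}"
    )


query = build_label_query("Albrecht Dürer")
print(query)
```

Note that the label uses a space ("Albrecht Dürer") where the resource name uses an underscore, as in the answer's own regular expression.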