utf-8 sparql dbpedia

Two "identical" answers to a request (utf8 and non-utf8)


I am having problems with a live.dbpedia SPARQL request: it returns some entries twice, once as a UTF-8 URI and once as a non-UTF-8 URI. Here are the results.

Is this something that needs to be fixed in DBpedia itself (and if so, where should it be reported)?

Is there a way to keep only one version of each duplicated URI? (I do not want to drop a non-UTF-8 URI if it has no UTF-8 counterpart.)

P.S.: The actual request

select distinct ?name where {
   ?name <http://purl.org/dc/terms/subject> <http://dbpedia.org/resource/Category:Individual_graphs> .
} ORDER BY desc(?name) LIMIT 2

Solution

  • Even though several URIs can identify the same article, they all share the same article title. You can therefore extract the title (it's the value of the rdfs:label property), group by it, and sample one URI per group. Doing that, and using DBpedia's built-in namespace prefixes, I end up with this query:

    select distinct (sample(?name_) as ?name) where {
      ?name_ dcterms:subject category:Individual_graphs ;
            rdfs:label ?label
    }
    group by ?label
    order by desc(?name)
    

    SPARQL results
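
If you would rather deduplicate on the client side after fetching the results, the same "one URI per resource" idea can be sketched in Python. This is only a rough sketch under an assumption the question does not confirm: that the "non-utf8" URI is simply the percent-encoded form of the UTF-8 one. The function name `dedupe_uris` and the sample URIs are made up for illustration and are not taken from the actual result set.

```python
from urllib.parse import unquote


def dedupe_uris(uris):
    """Keep one URI per resource.

    Assumption: a "non-UTF-8" URI is the percent-encoded spelling of the
    UTF-8 one, so decoding both gives the same string.
    """
    chosen = {}
    for uri in uris:
        # Decoding turns %C3%B6 into ö and leaves raw UTF-8 untouched.
        key = unquote(uri)
        # Prefer the raw UTF-8 spelling when both forms are present;
        # a percent-encoded URI with no counterpart is kept as it is.
        if key not in chosen or uri == key:
            chosen[key] = uri
    return list(chosen.values())


# Illustrative input only, not the real query output.
sample = [
    "http://dbpedia.org/resource/Gr%C3%B6tzsch_graph",
    "http://dbpedia.org/resource/Grötzsch_graph",
    "http://dbpedia.org/resource/Petersen_graph",
    "http://dbpedia.org/resource/D%C3%BCrer_graph",
]
print(dedupe_uris(sample))
```

With this input, the Grötzsch entry collapses to its UTF-8 form, while the percent-encoded Dürer entry survives because it has no UTF-8 counterpart. Unlike the query above, this does not rely on rdfs:label, so it would miss duplicates whose URIs differ in some other way.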