dbpedia - are only English articles indexed?


I would like to query dbpedia for articles in different languages, e.g. Hungarian. Here is an example query, run against the public endpoint at http://dbpedia.org/sparql. It searches for articles whose name contains 'Budapest' (the capital of Hungary).

PREFIX dbprop: <http://dbpedia.org/property/>
PREFIX db: <http://dbpedia.org/resource/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?article ?url ?name WHERE {
  ?article foaf:isPrimaryTopicOf ?url .
  ?article foaf:name ?name .
  FILTER regex(?name, 'Budapest')
}
LIMIT 100

Note: the query takes a while to execute because of the regex matching.

There are Wikipedia articles with this name in both English and Hungarian; however, the query returns English articles only (all URLs are under the en.wikipedia.org domain).

Are articles in other languages indexed in dbpedia? If so, how can I modify the query to find the Hungarian articles too?


Solution

  • Yes, only English literals (including abstracts) are in the public endpoint. If you want to query abstracts in other languages:

    1. Prepare a triple store on your localhost (e.g. Virtuoso).
    2. Load the long-abstracts_hu.ttl.bz2 file (Hungarian dbpedia) into a graph of your choice. (Note: depending on the triple store, you might have to extract the .bz2 file or convert it to .gz first.)
    3. Run a federated query over the public dbpedia endpoint and your local store.

    If you run into trouble, feel free to ask for assistance.
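
The federated query from step 3 could look roughly like the sketch below, run against the local store. It rests on several assumptions that are not confirmed above: the graph IRI <http://localhost/dbpedia-hu> is a hypothetical name for the graph chosen in step 2, the abstracts are assumed to be stored under dbo:abstract with a "hu" language tag, the local file is assumed to use the same resource URIs as the public endpoint, and the local store must support the SPARQL 1.1 SERVICE keyword.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo:  <http://dbpedia.org/ontology/>

SELECT ?article ?url ?abstract WHERE {
  # Hungarian abstracts from the local graph.
  # The graph IRI is hypothetical: use whatever name you chose when loading the file.
  GRAPH <http://localhost/dbpedia-hu> {
    ?article dbo:abstract ?abstract .   # assumed property for long abstracts
    FILTER (lang(?abstract) = "hu")
    FILTER (CONTAINS(?abstract, "Budapest"))
  }
  # Wikipedia page URLs from the public dbpedia endpoint.
  # This join only works if both sides use the same resource URIs (an assumption).
  SERVICE <http://dbpedia.org/sparql> {
    ?article foaf:isPrimaryTopicOf ?url .
  }
}
LIMIT 100
```

Filtering inside the local graph first keeps the set of ?article bindings small before the remote SERVICE part is evaluated, although how the join is actually executed depends on the triple store.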