DBpedia: SPARQL query on a field starting with a literal


I am trying to find the DBpedia entry for each company in a list, but I can't figure out how to do approximate matches. For example, "Audi" is named "Audi AG" in DBpedia and "Novartis" is named "Novartis International AG" (foaf:name). How do I search for entries with rdf:type dbo:Company whose name is closest to the string I provide?

I'm using SPARQL as the query language, but I'm open to switching if another option has an advantage.

select ?company
where {
  ?company foaf:name "Novartis"@en.
  ?company a dbo:Company.
}
LIMIT 100

This returns no results, although http://dbpedia.org/page/Novartis should be found. Matching the beginning of the name might be good enough.


Solution

  • For DBpedia, the best option might be to use the bif:contains full-text search pseudo property:

    SELECT ?company {
      ?company a dbo:Company.
      ?company foaf:name ?name.
      ?name bif:contains "Novartis"@en.
    }
    

    This feature is specific to the Virtuoso database that powers the DBpedia SPARQL endpoint.

    If you want to stick to standard SPARQL, you can match only the beginning of the name:

    SELECT ?company {
      ?company a dbo:Company.
      ?company foaf:name ?name.
      FILTER strStarts(?name, "Novartis")
    }
    

    Unlike the full-text feature, this version cannot make use of a text index, so it is slower.

    If you want a more flexible match:

    SELECT ?company {
      ?company a dbo:Company.
      ?company foaf:name ?name.
      FILTER contains(lCase(?name), lCase("Novartis"))
    }
    

    This will find a case-insensitive match anywhere in the name.
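
Since the question starts from a list of company names, one option is to generate the case-insensitive query per name and then pick the closest result on the client side. The following is a minimal Python sketch, not a definitive implementation. Assumptions: the public endpoint at https://dbpedia.org/sparql accepting `query` and `format` parameters; the helper names (`escape_literal`, `build_query`, `fetch_candidates`, `closest`) are hypothetical; and `difflib.SequenceMatcher` is only one possible definition of "closest".

```python
import difflib
import json
import urllib.parse
import urllib.request

# Assumption: the public DBpedia SPARQL endpoint.
ENDPOINT = "https://dbpedia.org/sparql"


def escape_literal(text):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_query(name, limit=100):
    """Build the case-insensitive 'contains' query for one company name."""
    return (
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
        "SELECT ?company ?name {\n"
        "  ?company a dbo:Company.\n"
        "  ?company foaf:name ?name.\n"
        f'  FILTER contains(lCase(?name), lCase("{escape_literal(name)}"))\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def fetch_candidates(name):
    """Run the query and return (company URI, name) pairs. Needs network access."""
    params = urllib.parse.urlencode(
        {"query": build_query(name), "format": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(f"{ENDPOINT}?{params}", timeout=30) as resp:
        bindings = json.load(resp)["results"]["bindings"]
    return [(b["company"]["value"], b["name"]["value"]) for b in bindings]


def closest(name, candidates):
    """Return the (URI, name) pair whose name is most similar to `name`, or None."""
    if not candidates:
        return None

    def score(pair):
        # Similarity ratio in [0, 1]; higher means closer to the requested name.
        return difflib.SequenceMatcher(None, name.lower(), pair[1].lower()).ratio()

    return max(candidates, key=score)
```

For a list such as `["Audi", "Novartis"]`, calling `closest(n, fetch_candidates(n))` for each `n` would give one best guess per company. The results are still worth checking by hand, because a high similarity score does not guarantee that the entry is the intended company.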