sparql dbpedia

DBpedia SPARQL: return a certain number of relevant page URIs for an entity, EXCEPT URIs where the entity belongs to a given set of subclasses of owl:Thing


  1. I am looking for a SPARQL query that does the following:

For example, take the word Apple. Apple may refer to the organization Apple_Inc or, per the ontology, to something in the Species class of plants. owl:Thing has a subclass called Species, so I want to return the most relevant (maximum-hit) URIs for the keyword Apple that do not belong to the Species subclass. So among the returned URIs, http://dbpedia.org/page/Apple should not appear, and neither should ANY relevant link that falls under the Species subclass.

By maximum-hit/most relevant I mean the top returned results that match the query, like the MaxHits parameter of the PrefixSearch (i.e. Autocomplete) API.

For example, http://lookup.dbpedia.org/api/search/PrefixSearch?QueryClass=&MaxHits=2&QueryString=berl is a request that returns the top 2 URIs matching QueryString=berl.
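A minimal Python sketch of how such a request URL could be assembled, assuming only the base URL and the three parameter names shown in the example above. The function name `build_prefix_search_url` is made up for illustration, and the sketch only builds the URL; it does not check whether the service still answers at that address.

```python
from urllib.parse import urlencode

# Base URL copied from the example request in the question.
PREFIX_SEARCH_URL = "http://lookup.dbpedia.org/api/search/PrefixSearch"


def build_prefix_search_url(query_string, max_hits=2, query_class=""):
    """Return a PrefixSearch URL for a partial keyword (hypothetical helper)."""
    params = {
        "QueryClass": query_class,    # empty string means no class restriction
        "MaxHits": max_hits,          # how many top results to ask for
        "QueryString": query_string,  # the partial keyword, e.g. "berl"
    }
    # urlencode keeps the dict order and percent-encodes the values.
    return PREFIX_SEARCH_URL + "?" + urlencode(params)
```

Calling `build_prefix_search_url("berl", max_hits=2)` reproduces the URL quoted above, which can then be fetched with any HTTP client.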

I'm struggling even to explain the work I've done so far, because I don't understand the structure or how to formulate a proper query.

With respect to negation in SPARQL, I found a relevant portion of the documentation in the link here, but I do not know how to proceed from there, and I cannot understand why keywords like ?person are used. I can understand that ?person is used to select, well, PEOPLE names, but I would like to know how and where to find keywords like ?person and ?name that represent a specific entity.

    SELECT ?uri ?label
    WHERE {
      ?uri rdfs:label ?label .
      FILTER ( ?label = "car"@en )
    }

I would really appreciate it if someone could link me to the part of the documentation that clearly explains that ?uri is used to select a URI of the form www.dbpedia.org/page/SomeEntity, and what ?person, ?name, and ?label represent.
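For context, names that start with `?` are SPARQL variables. They are not keywords to look up; the query author picks them freely, and each one is bound to whatever matches its position in a triple pattern. Note also that DBpedia resources are identified as http://dbpedia.org/resource/SomeEntity, with /page/ being the HTML view of the same resource. The sketch below, in Python, shows how the chosen variable names reappear as keys in the standard SPARQL JSON results layout. The sample data is hand-written for illustration and is not real endpoint output; `column` is a hypothetical helper.

```python
# Hand-written sample in the W3C SPARQL JSON results layout.
# NOT real endpoint output: the URI and label are placeholders.
SAMPLE_RESULTS = {
    "head": {"vars": ["uri", "label"]},  # the variable names used in SELECT
    "results": {
        "bindings": [
            {
                "uri": {"type": "uri", "value": "http://dbpedia.org/resource/SomeEntity"},
                "label": {"type": "literal", "xml:lang": "en", "value": "car"},
            }
        ]
    },
}


def column(results, variable):
    """Collect the values bound to one variable (name given without '?')."""
    return [
        row[variable]["value"]
        for row in results["results"]["bindings"]
        if variable in row  # a variable can be unbound in some rows
    ]
```

Renaming `?uri` to `?thing` in the query would simply change the key from `"uri"` to `"thing"`; the matched data would be the same.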

I'm actually so lost. I will start eating this elephant one bite at a time. For now, I'll be very grateful if I get an answer to this.

  2. If you know of any way I can avoid learning and using SPARQL, that would work too! I know Python well enough, so using an API to pull this information is also fine by me. This question was posted by me.

Solution

  • Answer posted by @Stanislav-Kravin:

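    SELECT DISTINCT ?s
    WHERE
      { ?s a owl:Thing .
        ?s rdfs:label ?label .
        # keep English labels only
        FILTER ( LANGMATCHES ( LANG ( ?label ), 'en' ) )
        # Virtuoso full-text match on the label
        ?label bif:contains '"apple"' .
        # drop anything typed as dbo:Species or one of its subclasses
        FILTER NOT EXISTS { ?s rdf:type/rdfs:subClassOf* dbo:Species }
      }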
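Since the question also asks for a Python route and a cap on the number of results, here is a hedged sketch that wraps the query above using only the standard library. Several things are assumptions not stated in the answer: the endpoint address `https://dbpedia.org/sparql`, the `format` request parameter (a Virtuoso convenience; the `Accept` header is the standard mechanism), the added `LIMIT` clause, and the helper names `build_query` and `run_query`. The query relies on the endpoint predefining the `owl:`, `rdfs:`, `dbo:`, and `bif:` prefixes. Note that `LIMIT` without `ORDER BY` caps the result count but does not rank results by relevance.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

ENDPOINT = "https://dbpedia.org/sparql"  # assumption: public DBpedia endpoint

# The answer's query with placeholders; LIMIT is an addition for this sketch.
QUERY_TEMPLATE = """
SELECT DISTINCT ?s
WHERE {{
  ?s a owl:Thing .
  ?s rdfs:label ?label .
  FILTER ( LANGMATCHES ( LANG ( ?label ), 'en' ) )
  ?label bif:contains '"{keyword}"' .
  FILTER NOT EXISTS {{ ?s rdf:type/rdfs:subClassOf* {excluded_class} }}
}}
LIMIT {limit}
"""


def build_query(keyword, excluded_class="dbo:Species", limit=10):
    """Fill in the template (hypothetical helper)."""
    # Only plain alphanumeric keywords, so quotes cannot break the query.
    if not keyword.isalnum():
        raise ValueError("keyword must be alphanumeric")
    return QUERY_TEMPLATE.format(
        keyword=keyword, excluded_class=excluded_class, limit=int(limit)
    )


def run_query(query, endpoint=ENDPOINT, timeout=30):
    """Send the query over HTTP GET and return the URIs bound to ?s."""
    params = urlencode({"query": query, "format": "application/sparql-results+json"})
    request = Request(
        endpoint + "?" + params,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return [row["s"]["value"] for row in payload["results"]["bindings"]]
```

A call such as `run_query(build_query("apple", limit=5))` would need network access; what it returns depends on the live dataset.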