I am trying to fetch artist info from Wikipedia using the dbpedia gem (https://github.com/farbenmeer/dbpedia),
but I can't figure out how to tell the genre of a result item.
Basically, I want to modify the following function so that it finds the result that is an artist and returns its URL:
def self.get_slug(q)
  results = Dbpedia.search(q)
  result = # Do something to find out the result that is an artist
  uri = result.uri rescue ""
  return uri
end
As a last resort, I could scrape each result URL and decide whether it is an artist based on whether genre info is available.
You could use DBpedia's SPARQL endpoint rather than scraping every result.
Suppose you want a list of everything that has a genre. You could query:
SELECT DISTINCT ?thing WHERE {
  ?thing dbpedia-owl:genre ?genre
}
LIMIT 1000
But say you don't want everything; you're looking just for artists. An artist could be a musician, a painter, an actor, etc.
SELECT DISTINCT ?thing WHERE {
  ?thing dbpedia-owl:genre ?genre ;
         rdf:type dbpedia-owl:Artist
}
LIMIT 1000
Or maybe you just want musicians OR bands:
SELECT DISTINCT ?thing WHERE {
  {
    ?thing dbpedia-owl:genre ?genre ;
           rdf:type dbpedia-owl:Band
  }
  UNION
  {
    ?thing dbpedia-owl:genre ?genre ;
           a dbpedia-owl:MusicalArtist # `a` is a shortcut for `rdf:type`
  }
}
LIMIT 1000
Ultimately, you want musicians or bands that have "mega" in their names, e.g. Megadeth or Megan White, along with the URL of the resource.
SELECT DISTINCT ?thing ?url ?genre WHERE {
  ?thing foaf:name ?name ;
         foaf:isPrimaryTopicOf ?url .
  ?name bif:contains "'mega*'" .
  {
    ?thing dbpedia-owl:genre ?genre ;
           a dbpedia-owl:Band
  }
  UNION
  {
    ?thing dbpedia-owl:genre ?genre ;
           a dbpedia-owl:MusicalArtist
  }
  UNION
  {
    ?thing a <http://umbel.org/umbel/rc/MusicalPerformer>
  }
}
LIMIT 1000
Try these queries in DBpedia's SPARQL Query Editor.
The dbpedia gem you pointed to exposes the sparql-client in its API, so I think you will be able to run all of these queries using the #query method:
Dbpedia.sparql.query(query_string)
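To tie this back to the get_slug function from the question, here is a rough sketch of how the last query could be built for an arbitrary search term and run through that method. The helper name artist_query, the client parameter, and the limit keyword are made up for illustration. I am also assuming that the object returned by #query is enumerable and that each solution can be read with [:url], which is how sparql-client solutions normally behave; I have not checked this against the gem itself.

```ruby
# Hypothetical helper: builds the band/musician query for a name fragment.
def artist_query(term, limit: 10)
  # Keep only letters, digits, and spaces so the term is safe to embed
  # inside the quoted bif:contains pattern.
  safe = term.to_s.gsub(/[^[:alnum:] ]/, "").strip
  <<~SPARQL
    SELECT DISTINCT ?thing ?url WHERE {
      ?thing foaf:name ?name ;
             foaf:isPrimaryTopicOf ?url .
      ?name bif:contains "'#{safe}*'" .
      { ?thing a dbpedia-owl:Band }
      UNION
      { ?thing a dbpedia-owl:MusicalArtist }
    }
    LIMIT #{Integer(limit)}
  SPARQL
end

# client is anything that responds to #query(string) and returns an
# enumerable of solutions readable with [:url] (assumed, see above).
def get_slug(q, client)
  first = client.query(artist_query(q, limit: 1)).first
  first ? first[:url].to_s : ""
end
```

If the assumptions hold, you would call it as get_slug("Megadeth", Dbpedia.sparql). Passing the client in as an argument also lets you try the function with a stub object before hitting the real endpoint.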
Good luck!