I want to find the movies that share the highest number of types with the movie The Matrix.
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?movie_name (count(distinct ?atype) as ?numatype)
FROM <http://dbpedia.org/>
WHERE {
  ?movie rdf:type dbo:Film;
         rdf:type ?ftype.
  dbr:The_Matrix rdf:type ?ttype.
  ?atype a owl:class;
         owl:intersectionOf [?ftype ?ttype].
  ?movie rdfs:label ?movie_name.
  FILTER (LANG(?movie_name)="en").
}
GROUP BY ?movie_name
ORDER BY DESC(?numatype)
LIMIT 100
I defined ?ttype as the type of The Matrix and ?ftype as the type of ?movie.
When I run this query at http://dbpedia.org/sparql, there are no results.
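Your query has two problems: owl:class is not an OWL term (the class is owl:Class), and [?ftype ?ttype] denotes a blank node with property ?ftype and value ?ttype, not a list of the two types. Neither construct is needed here. The idea is to use a simple join on the types: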
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT (SAMPLE(?l) AS ?movie_name)
       (COUNT(DISTINCT ?ttype) AS ?numSharedTypes)
WHERE {
  VALUES ?s { dbr:The_Matrix }
  # all types of The Matrix
  ?s a ?ttype .
  # films that have at least one of those types
  ?movie a dbo:Film ;
         a ?ttype .
  FILTER(?movie != ?s)
  ?movie rdfs:label ?l .
  FILTER (LANGMATCHES(LANG(?l), 'en'))
}
GROUP BY ?movie
ORDER BY DESC(?numSharedTypes)
LIMIT 100
The join itself might be expensive, so you could get a timeout or, because of Virtuoso's anytime query feature, an incomplete result.
It looks like the query optimizer isn't smart enough here; the labels in particular make the performance worse. A few sub-SELECTs make the query much faster, although harder to read:
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?movie_name ?numSharedTypes
WHERE {
  ?movie rdfs:label ?l
  FILTER langMatches(lang(?l), "en")
  # strip a trailing "(film)" and shorten e.g. "(1999 film)" to "(1999)"
  BIND(replace(replace(str(?l), "\\(film\\)$", ""), "[^0-9]*\\sfilm\\)$", ")") AS ?movie_name)
  # count the shared types first, then fetch labels only for the top 100 films
  { SELECT ?movie (COUNT(?type) AS ?numSharedTypes)
    WHERE {
      ?movie rdf:type dbo:Film ;
             rdf:type ?type
      # the types of The Matrix
      { SELECT ?type
        WHERE { dbr:The_Matrix rdf:type ?type }
      }
      FILTER ( ?movie != dbr:The_Matrix )
    }
    GROUP BY ?movie
    ORDER BY DESC(?numSharedTypes) ASC(?movie)
    LIMIT 100
  }
}
ORDER BY DESC(?numSharedTypes) ASC(?movie_name)
Result:
+------------------------+----------------+
| movie_name | numSharedTypes |
+------------------------+----------------+
| The Matrix Reloaded | 36 |
| The Matrix Revolutions | 33 |
| The Matrix (franchise) | 30 |
| Demolition Man | 28 |
| Freejack | 28 |
| Conspiracy Theory | 27 |
| Deep Blue Sea (1999) | 27 |
| Fair Game (1995) | 27 |
| Judge Dredd | 27 |
| Revenge Quest | 27 |
| Screamers (1995) | 27 |
| Soldier (1998) | 27 |
| The Invasion | 27 |
| Timecop | 27 |
| Total Recall (1990) | 27 |
| V for Vendetta | 27 |
| Assassins | 26 |
| ... | ... |
+------------------------+----------------+
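To see which types are behind a given count, it can help to list the types that The Matrix shares with a single film. The following is only a minimal sketch: dbr:The_Matrix_Reloaded is picked as an example because it appears in the result above, and the query assumes the film's resource IRI follows the usual http://dbpedia.org/resource/ pattern. Substitute any other film IRI as needed.

```sparql
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbr: <http://dbpedia.org/resource/>

# List every type that both resources have in common.
# dbr:The_Matrix_Reloaded is an example IRI (assumption), not part of the original queries.
SELECT DISTINCT ?type
WHERE {
  dbr:The_Matrix          rdf:type ?type .
  dbr:The_Matrix_Reloaded rdf:type ?type .
}
ORDER BY ?type
```

The number of rows returned should correspond to numSharedTypes for that film, as long as the underlying data has not changed between the two queries.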