How do I count grouped entries in SPARQL, merging groups whose count is below a given threshold?
Consider, for example, the Nobel Prize data. I can count all family names with a query like this:
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name (COUNT(*) AS ?count) WHERE {
      ?id foaf:familyName ?name
    }
    GROUP BY ?name
    ORDER BY DESC(?count)
How do I modify the query so that it returns only the family names occurring at least 3 times and accumulates all the other names under "Other"?
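Just wrap your SELECT in another one. The inner query counts each family name; the outer query relabels every name with a count below 3 as "Other" and sums the counts again per label.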
Query
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?name_ (SUM(?count) AS ?count_) {
      {
        SELECT ?name (COUNT(*) AS ?count) {
          ?id foaf:familyName ?name
        } GROUP BY ?name
      }
      BIND (IF(?count > 2, ?name, "Other") AS ?name_)
    }
    GROUP BY ?name_
    ORDER BY DESC(IF(?name_ = "Other", -1, ?count_))
Results
    name_    count_
    -------  ------
    Smith    5
    Fischer  4
    Wilson   4
    Lee      3
    Lewis    3
    Müller   3
    Other    878
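If the cutoff needs to be easy to change, one option is to move it out of the IF expression into a VALUES clause. The following is only a sketch of that idea: the variable name ?min is an illustrative choice and not part of the query above, and `?count >= ?min` with a value of 3 is meant to be equivalent to the `?count > 2` test for integer counts.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?name_ (SUM(?count) AS ?count_) {
  # Hypothetical parameter: minimum number of occurrences a name needs
  # in order to keep its own row (assumed name, change the value as needed).
  VALUES ?min { 3 }
  {
    # Inner query: one row per family name with its count.
    SELECT ?name (COUNT(*) AS ?count) {
      ?id foaf:familyName ?name
    } GROUP BY ?name
  }
  # Names below the threshold are all relabelled "Other".
  BIND (IF(?count >= ?min, ?name, "Other") AS ?name_)
}
GROUP BY ?name_
# Sort by count, but force the "Other" row to the bottom.
ORDER BY DESC(IF(?name_ = "Other", -1, ?count_))
```

Keeping the threshold in one place means only the VALUES line has to be edited when a different cutoff is wanted. Note that a family name that is literally "Other" would be merged into the catch-all row with either version of the query.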