I have a query where I count with a grouping on years. For some of the years there is no items to count and thus no result. I would like to have SPARQL return a count of zero for such years.
I am using the Wikidata Query Service, https://query.wikidata.org, and my present solution is to make a series of values and a union with the actual query. It looks a bit clumsy to me. Is there a better way?
#defaultView:BarChart
select ?year ?number_of_pages ?work_label where {
{
select ?year (sample(?pages) as ?number_of_pages) ?work_label
where {
{
select * where {
values (?year ?pages ?work_label) {
("2000" "0"^^xsd:integer "_")
("2001" "0"^^xsd:integer "_")
("2002" "0"^^xsd:integer "_")
("2003" "0"^^xsd:integer "_")
("2004" "0"^^xsd:integer "_")
("2005" "0"^^xsd:integer "_")
("2006" "0"^^xsd:integer "_")
("2007" "0"^^xsd:integer "_")
("2008" "0"^^xsd:integer "_")
("2009" "0"^^xsd:integer "_")
("2010" "0"^^xsd:integer "_")
("2011" "0"^^xsd:integer "_")
("2012" "0"^^xsd:integer "_")
("2013" "0"^^xsd:integer "_")
("2014" "0"^^xsd:integer "_")
("2015" "0"^^xsd:integer "_")
("2016" "0"^^xsd:integer "_")
}
}
}
union {
?work wdt:P50 wd:Q18921408 .
?work wdt:P1104 ?pages .
?work wdt:P577 ?date .
?work rdfs:label ?long_work_label . filter(lang(?long_work_label) = 'en')
bind(substr(?long_work_label, 1, 20) as ?work_label)
bind(str(year(?date)) as ?year)
}
}
group by ?year ?work ?work_label
order by ?year
}
}
You don't need so many nested queries. Here is a simple solution:
#defaultView:BarChart
select ?year ?pages ?label
where {
# iterating over years
values ?year {
"2000" "2001" "2002" "2003" "2004" "2005"
"2006" "2007" "2008" "2009" "2010" "2011"
"2012" "2013" "2014" "2015" "2016"
}
# binding defaults
bind( 0 as ?default_pages)
bind("_" as ?default_label)
# if there is a work in the given year, ?work_pages and ?work_label will be bound
optional {
?work wdt:P50 wd:Q18921408;
wdt:P1104 ?work_pages;
wdt:P577 ?work_date.
bind(str(year(?work_date)) as ?year).
?work rdfs:label ?long_work_label.
filter(lang(?long_work_label) = 'en').
bind(substr(?long_work_label, 1, 20) as ?work_label)
}
# either take ?work_pages/label value or default and bind it as the result ?pages/label
bind(coalesce(?work_pages, ?default_pages) as ?pages)
bind(coalesce(?work_label, ?default_label) as ?label)
}
order by ?year
Here is results screenshot:
The key here is the combination of optional
+ bind
/coalesce
. The general pattern is
bind(... as ?default_foo)
optional {
# try to get value ?foo
}
bind(coalesce(?foo, ?default_foo) as ?result_foo)
coalesce
returns the first value that it can (that is bound/evaluates without error). So if the value you tried to get in optional { ... }
is not bound, the default one will be taken and bound as the result. A more verbose way to write it:
bind(if(bound(?foo), ?foo, ?default_foo) as ?result_foo)
But coalesce
is better because you can pass several values in it. In a more complicated query it can be useful: see this example.