I'm working on a project that stores two RDF Data Cubes:
graph : http://sda-research.ml/graph/climate
Dataset-climate
ds:obs5 a qb:Observation;
    qb:dataSet ds:dataset-climate;
    prop:city "Ha Noi"@en;
    prop:cityid "hanoi";
    prop:humidity 8.17E1;
    prop:rainfall 2.1668E3;
    prop:year "2016"^^xsd:int .
ds:obs6 a qb:Observation;
    qb:dataSet ds:dataset-climate;
    prop:city "Ha Noi"@en;
    prop:cityid "hanoi";
    prop:humidity 8.18E1;
    prop:rainfall 2.6402E3;
    prop:year "2017"^^xsd:int .
graph : http://sda-research.ml/graph/industry
Dataset-industry
ds:obs205 a qb:Observation;
    qb:dataSet ds:dataset-industry;
    prop:city "Hà Nội"@en;
    prop:cityid "hanoi";
    prop:industry 1.073E2;
    prop:year "2016"^^xsd:int .
ds:obs206 a qb:Observation;
    qb:dataSet ds:dataset-industry;
    prop:city "Hà Nội"@en;
    prop:cityid "hanoi";
    prop:industry 1.07E2;
    prop:year "2017"^^xsd:int .
Now I want to combine the two graphs so that the output contains the humidity and industry values for Hanoi in 2016-2017. On the GraphDB SPARQL endpoint, I used this query:
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX prop: <http://www.sda-research.ml/dc/prop/>
select ?city ?year ?temperature ?industry
where {
  {
    graph ?g {
      ?obs a qb:Observation.
      ?obs prop:cityid ?cityid filter regex(?cityid, 'hanoi').
      ?obs prop:city ?city.
      ?obs prop:year ?year filter(?year >= 2017 && ?year <= 2018).
      ?obs prop:temperature ?temperature.
    }
  }
  UNION
  {
    graph ?g {
      ?obs a qb:Observation.
      ?obs prop:cityid ?cityid filter regex(?cityid, 'hanoi').
      ?obs prop:city ?city.
      ?obs prop:year ?year filter(?year >= 2016 && ?year <= 2017).
      ?obs prop:industry ?industry.
    }
  }
}
Expected output:
city------year------humidity------industry---
Ha Noi-----2016-------8.17E1------ 1.073E2---
Ha Noi-----2017-------8.18E1-------1.07E2----
Actual output:
city------year------humidity------industry--
Ha Noi-----2016-------8.17E1--------null----
Ha Noi-----2017-------8.18E1--------null----
Ha Noi-----2016--------null--------1.073E2--
Ha Noi-----2017--------null--------1.07E2---
How can I remove the null values when using UNION, or is there another query that gives the expected result?
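There are several issues with your query before we get into the SPARQL itself: it selects ?temperature, although your data and expected output use prop:humidity, and its first branch filters on the years 2017-2018, although you are asking for 2016-2017.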
Now, in terms of SPARQL issues:
1. The query binds both ?cityid and ?city, but the value of ?city is spelt differently across the named graphs, namely "Hà Nội"@en and "Ha Noi"@en.
2. The query uses a variable, ?g, for the named graphs. This means that the first two of the four results are obtained by looking at the climate graph, and the last two by looking at the industry graph. When you have a specific graph in mind from which to extract data, you should specify it.
3. The query uses REGEX. Different triplestores implement query planning differently, but this is an expensive operation that may significantly worsen your performance. See below for how to deal with this by using the values keyword.

Now here is a slightly amended query that produces the results you're after:
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX prop: <http://www.sda-research.ml/dc/prop/>
select ?cityid ?year ?humidity ?industry
where {
  values ?cityid {'hanoi'}
  graph <http://sda-research.ml/graph/climate> {
    ?obs1 a qb:Observation.
    ?obs1 prop:cityid ?cityid.
    ?obs1 prop:year ?year filter(?year >= 2016 && ?year <= 2017).
    ?obs1 prop:humidity ?humidity.
  }
  graph <http://sda-research.ml/graph/industry> {
    ?obs2 a qb:Observation.
    ?obs2 prop:cityid ?cityid.
    ?obs2 prop:year ?year filter(?year >= 2016 && ?year <= 2017).
    ?obs2 prop:industry ?industry.
  }
}
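A note on why the amended query avoids the null rows: its two graph blocks are joined on the shared variables ?cityid and ?year, whereas each branch of a UNION yields its own rows and leaves the other branch's variables unbound. One consequence of the join is that a year present in only one graph drops out of the result entirely. If that matters for your data, a variant along the following lines could keep such rows by making the industry part optional. This is only a sketch under that assumption (your sample data has both values for every year), and the ORDER BY is my addition for readability:

```sparql
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX prop: <http://www.sda-research.ml/dc/prop/>

SELECT ?cityid ?year ?humidity ?industry
WHERE {
  # Exact match on the city id instead of a regex scan
  VALUES ?cityid { "hanoi" }

  # Climate observations drive the result: one row per city and year
  GRAPH <http://sda-research.ml/graph/climate> {
    ?obs1 a qb:Observation ;
          prop:cityid ?cityid ;
          prop:year ?year ;
          prop:humidity ?humidity .
    FILTER(?year >= 2016 && ?year <= 2017)
  }

  # Industry values are attached where they exist (matched on ?cityid and ?year);
  # otherwise the row is kept and ?industry stays unbound
  OPTIONAL {
    GRAPH <http://sda-research.ml/graph/industry> {
      ?obs2 a qb:Observation ;
            prop:cityid ?cityid ;
            prop:year ?year ;
            prop:industry ?industry .
    }
  }
}
ORDER BY ?year
```

On the sample data this should give the same two rows as the query above; ?industry would only come back unbound for a year that has no industry observation.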