So I am trying to do this:
"Generate a table of all the continents and the sum of the areas of all those lakes that contain at least one island for each continent. If a lake is in a country that is situated on several continents, the appropriate share of the lake area should be counted for each of those continents. "
using this XML database:https://www.dbis.informatik.uni-goettingen.de/Mondial/mondial.xml
I already have a SQL query that does this:
WITH LakesWithIslands AS (
SELECT DISTINCT IslandIn.lake, Lake.area, geo_lake.country
FROM ((IslandIn
INNER JOIN Lake ON IslandIn.lake = Lake.name)
INNER JOIN geo_lake ON IslandIn.lake = geo_lake.lake)
WHERE IslandIn.lake IS NOT NULL
)
SELECT B.continent, SUM(B.percentage * A.area / 100)
FROM LakesWithIslands as A
INNER JOIN encompasses as B ON A.country = B.country
GROUP BY B.continent;
But now I am trying to translate this SQL query into an Xquery.
XQuery code:
let $doc := doc("mondial.xml")/mondial
let $lakesWithIslands := $doc/island/data(@lake)
let $lakeData := (
for $l in $doc/lake
let $lakeId := $l/data(@id)
where $lakeId = $lakesWithIslands
return <lake id="{$lakeId}" country="{$l/data(@country)}" area="{$l/data(area)}" ></lake>
)
let $countryData :=(
for $c in $doc/country
let $countrycode := $c/data(@car_code)
let $encompassData := $c/encompassed
for $e in $encompassData
let $continent := $e/data(@continent)
let $percent := $e/data(@percentage)
return <country country="{$countrycode}" continent="{$continent}" percentage="{$percent}"></country>
)
for $l in $lakeData,
$c in $countryData
let $lCountry := $l/data(@country)
let $lArea := $l/data(@area)
let $cContinent := $c/data(@continent)
let $cCountry := $c/data(@country)
let $percent := $c/data(@percentage)
where $lCountry = $cCountry
group by $cContinent
return <item continent="{$cContinent}" area="{sum(($percent*$lArea) div 100)}"></item>
I am failing at the last step (last for-loop) and I am not sure what I am doing wrong. I just get an error that says "[XPTY0004] Item expected, sequence found: ("25", "25")".
I am pretty sure that I am doing the last for-loop completely wrong but I can't figure out how to use inner join, group by and sum together in Xquery.
Please help.
EDIT: I feel bad for posting so many question here regarding Xquery today but I am just having a real hard time wrapping my head around the practical parts of Xquery. Excuse me.
XQuery is "less declarative" than SQL in the sense that your code often has to follow the structure of the data more than in a typical SELECT
/FROM
/WHERE
SQL query. Joins can be performed naturally through nested FLWOR expressions, not only by gathering all data in pseudo-tables and then matching those.
For your concrete task it is easiest to iterate over all countries and the continents they are on, gather the relevant lakes and the part of their area for each country-continent combination and then group those by continent:
declare variable $lakes-with-islands :=
for $lake-id in distinct-values(doc('mondial.xml')//island/@lake)
return doc('mondial.xml')//lake[@id = $lake-id];
for $country in doc('mondial.xml')//country
let $lakes-in-country :=
$lakes-with-islands[contains-token(@country, $country/@car_code)]
for $encompassed in $country/encompassed
let $proportional-areas :=
for $lake in $lakes-in-country
return $lake/area * $encompassed/@percentage div 100
group by $continent := $encompassed/@continent
return <continent name="{$continent}">{sum($proportional-areas)}</continent>
This returns:
<continent name="europe">7875.5</continent>
<continent name="asia">24721.34</continent>
<continent name="africa">206610</continent>
<continent name="australia">242.66</continent>
<continent name="america">155036.7</continent>
This country lakes that are shared between multiple countries multiple times, but that seems to be intended by the problem description(?).
$lakes-with-islands
is declared as a static variable so that it is not touched by the group by
clause.