In the following query:
SELECT
d.date,
d.location,
d.population,
v.new_vaccinations,
SUM(CAST(v.new_vaccinations AS int)) OVER (PARTITION BY d.location) AS total_vaccinations
FROM
CovidDeaths d
JOIN
CovidVaccinations v
ON
d.date = v.date
AND
d.location = v.location
WHERE
d.continent IS NOT NULL
ORDER BY
d.location,
d.date
the window function is simply giving the total of each location as opposed to the running total according to the date. What am I doing wrong?
To get a running cumulative total, either RANGE
or ROWS
window must be specified or implied in the OVER
, and therefore you also need to add an ORDER BY
. Otherwise the window is the full partition.
Once you specify ORDER BY
, the default implied window is RANGE UNBOUNDED PRECEDING
, which gives different results for non-unique orderings from ROWS
.
Given that date
is probably not unique, you also want ROWS UNBOUNDED PRECEDING
.
SELECT
d.date,
d.location,
d.population,
v.new_vaccinations,
SUM(CAST(v.new_vaccinations AS int))
OVER (PARTITION BY d.location ORDER BY d.date ROWS UNBOUNDED PRECEDING) AS total_vaccinations
FROM
CovidDeaths d
JOIN
CovidVaccinations v
ON
d.date = v.date
AND
d.location = v.location
WHERE
d.continent IS NOT NULL
ORDER BY
d.location,
d.date