I have the following data, which I am trying to get the profit from the previous year:
WITH tbl (year, country, product, profit) AS (
VALUES
(2000, 'Finland', 'Computer' , 1500)
, (2000, 'Finland', 'Phone' , 100)
, (2001, 'Finland', 'Phone' , 10)
, (2000, 'India' , 'Calculator', 75)
, (2000, 'India' , 'Calculator', 75)
, (2000, 'India' , 'Computer' , 1200)
)
select country, year, profit
, lag(profit) over (partition by country order by year)
from tbl;
┌─────────┬──────┬────────┬──────────────────────────┐
│ country ┆ year ┆ profit ┆ sum_profit_previous_year │
╞═════════╪══════╪════════╪══════════════════════════╡
│ India ┆ 2000 ┆ 75 ┆ │
│ India ┆ 2000 ┆ 75 ┆ 75 │
│ India ┆ 2000 ┆ 1200 ┆ 75 │
│ Finland ┆ 2000 ┆ 1500 ┆ │
│ Finland ┆ 2000 ┆ 100 ┆ 1500 │
│ Finland ┆ 2001 ┆ 10 ┆ 100 │
└─────────┴──────┴────────┴──────────────────────────┘
However, this just seems to get the previous row, rather than what I want, which is to get the LAG
of the profit value for the previous year for that country. The expected result should be:
┌─────────┬──────┬────────┬──────────────────────────┐
│ country ┆ year ┆ profit ┆ sum_profit_previous_year |
╞═════════╪══════╪════════╪══════════════════════════╡
│ India ┆ 2000 ┆ 75 ┆ │
│ India ┆ 2000 ┆ 75 ┆ │
│ India ┆ 2000 ┆ 1200 ┆ │
│ Finland ┆ 2000 ┆ 1500 ┆ │
│ Finland ┆ 2000 ┆ 100 ┆ │
│ Finland ┆ 2001 ┆ 10 ┆ 1600 │
└─────────┴──────┴────────┴──────────────────────────┘
Since Finland-2001 is the only record that also has a record for that same country for the previous year. What would be the correct RANGE
clause here to accomplish this? (Either BigQuery or Postgres is fine for testing purposes).
If you indeed want all un-aggregated rows, and add the aggregated profit for the previous year ...
Here is a way with a plain window function and a custom window frame:
SELECT country, year, profit
, sum(profit) OVER (PARTITION BY country ORDER BY year
RANGE BETWEEN 1 PRECEDING AND 1 PRECEDING) AS sum_profit_previous_year
FROM tbl
ORDER BY country DESC, year; -- optional?