This is a sample of my data and what I want to calculate is in the rightmost column using the other columns:
DATE | Item | Brand | Cycle_length | Brand_Avg_Cycle_Length_3_Weeks |
---|---|---|---|---|
13/09/2023 | 123 | Apple | 6 | |
13/09/2023 | 500 | Apple | 5 | |
20/09/2023 | 123 | Apple | 6 | |
20/09/2023 | 500 | Apple | 5 | |
27/09/2023 | 123 | Apple | 6 | 5.333333 |
27/09/2023 | 500 | Apple | 4 | 5.333333 |
04/10/2023 | 123 | Apple | 6 | 5.167777 |
04/10/2023 | 500 | Apple | 4 | 5.167777 |
13/09/2023 | 325 | Samsung | 7 | |
13/09/2023 | 862 | Samsung | 3 | |
13/09/2023 | 455 | Samsung | 5 | |
20/09/2023 | 325 | Samsung | 7 | |
20/09/2023 | 862 | Samsung | 3 | |
27/09/2023 | 455 | Samsung | 5 | |
04/10/2023 | 325 | Samsung | 7 | 5.333333 |
27/09/2023 | 862 | Samsung | 4 | 5.333333 |
04/10/2023 | 455 | Samsung | 7 | 5.333333 |
11/10/2023 | 325 | Samsung | 7 | 5.666667 |
04/10/2023 | 862 | Samsung | 4 | 5.666667 |
11/10/2023 | 455 | Samsung | 7 | 5.666667 |
The rightmost column is taking all the items in a brand and looking back 3 weeks including current (so 3 rows for each item) and calculating the average of the cycle length column. This is what I'm unable to calculate in Snowflake.
The bold values in the Cycle_length column are being used to calculate the bold values in the Brand_avg_cycle_length_3_weeks column. This is calculated for each item separately.
I've tried a number of things including partition by (with preceding) function but no luck. The blocker is that preceding only looks back a certain number of rows whereas I want to look back: number of weeks multiplied by number of items in brand rows back.
SELECT report_date
,item_cd
,recent_cycle_length
,brand_name
,max(report_date) as latestdate
,avg(recent_cycle_length) over
(partition by brand_name
order by report_date
rows between 11 preceding and current row) AS BRAND_AVG_CYCLE_LENGTH_LAST_12_WEEKS
FROM table
group by recent_cycle_length, brand_name, report_Date,item_cd
;
Because of Snowflake doesn't support a RANGE BETWEEN INTERVAL
, the solution uses the trick from the KB "How to rewrite a RANGE BETWEEN INTERVAL
query in Snowflake".
drop table if exists "table";
create temporary table "table" (
report_date date,
Item number(5),
Brand_name varchar(30),
Cycle_length number(3)
);
insert into "table" values
('2023-09-13', 123, 'Apple', 6),
('2023-09-13', 500, 'Apple', 5),
('2023-09-20', 123, 'Apple', 6),
('2023-09-20', 500, 'Apple', 5),
('2023-09-27', 123, 'Apple', 6),
('2023-09-27', 500, 'Apple', 4),
('2023-10-04', 123, 'Apple', 6),
('2023-10-04', 500, 'Apple', 4),
('2023-09-13', 325, 'Samsung', 7),
('2023-09-13', 862, 'Samsung', 3),
('2023-09-13', 455, 'Samsung', 5),
('2023-09-20', 325, 'Samsung', 7),
('2023-09-20', 862, 'Samsung', 3),
('2023-09-27', 455, 'Samsung', 5),
('2023-10-04', 325, 'Samsung', 7),
('2023-09-27', 862, 'Samsung', 4),
('2023-10-04', 455, 'Samsung', 7),
('2023-10-11', 325, 'Samsung', 7),
('2023-10-04', 862, 'Samsung', 4),
('2023-10-11', 455, 'Samsung', 7);
select
a.*,
case datediff(week, min(b.report_date), max(b.report_date))
when 2 then avg(b.Cycle_length)
end as "3_weeks_avg"
from "table" as a
join "table" as b
on b.Brand_name = a.Brand_name and
b.report_date between a.report_date - interval '2 weeks' and
a.report_date
group by a.report_date, a.Item, a.Brand_name, a.Cycle_length
order by a.Brand_name, a.report_date;
Result:
+-------------+------+------------+--------------+-------------+
| REPORT_DATE | ITEM | BRAND_NAME | CYCLE_LENGTH | 3_weeks_avg |
+-------------+------+------------+--------------+-------------+
| 2023-09-13 | 500 | Apple | 5 | |
| 2023-09-13 | 123 | Apple | 6 | |
| 2023-09-20 | 123 | Apple | 6 | |
| 2023-09-20 | 500 | Apple | 5 | |
| 2023-09-27 | 123 | Apple | 6 | 5.333333 |
| 2023-09-27 | 500 | Apple | 4 | 5.333333 |
| 2023-10-04 | 123 | Apple | 6 | 5.166667 |
| 2023-10-04 | 500 | Apple | 4 | 5.166667 |
| 2023-09-13 | 862 | Samsung | 3 | |
| 2023-09-13 | 455 | Samsung | 5 | |
| 2023-09-13 | 325 | Samsung | 7 | |
| 2023-09-20 | 325 | Samsung | 7 | |
| 2023-09-20 | 862 | Samsung | 3 | |
| 2023-09-27 | 455 | Samsung | 5 | 4.857143 |
| 2023-09-27 | 862 | Samsung | 4 | 4.857143 |
| 2023-10-04 | 325 | Samsung | 7 | 5.285714 |
| 2023-10-04 | 455 | Samsung | 7 | 5.285714 |
| 2023-10-04 | 862 | Samsung | 4 | 5.285714 |
| 2023-10-11 | 325 | Samsung | 7 | 5.857143 |
| 2023-10-11 | 455 | Samsung | 7 | 5.857143 |
+-------------+------+------------+--------------+-------------+