In Snowflake, I have a table with the following columns:
Column | Description |
---|---|
ORIGINAL_EXPDATE | A date for the original expiry date. |
MO_COUNT | true/false whether to count Mondays. |
TU_COUNT | true/false whether to count Tuesdays. |
WE_COUNT | true/false whether to count Wednesdays. |
TH_COUNT | true/false whether to count Thursdays. |
FR_COUNT | true/false whether to count Fridays. |
SA_COUNT | true/false whether to count Saturdays. |
SU_COUNT | true/false whether to count Sundays. |
NO_RATES | No. of counted days to add. |
I want to calculate a column END_DATE
where I do the following:
To ORIGINAL_EXPDATE
, I want to add days according to the NO_RATES
, but I only want to consider days that correspond to *_COUNT
columns that are TRUE
.
For instance ORIGINAL_EXPDATE = 2024-02-05
(a Monday), NO_RATES = 2
, MO_COUNT = TRUE
, FR_COUNT = TRUE
, all other days FALSE
. Here, Friday, 9 February should count (because FR_COUNT = TRUE
) and Monday, 12 February should count (as MO_COUNT = TRUE
). After adding two days that actually count to ORIGINAL_EXPDATE
, END_DATE
should therefore be 2024-02-12.
Example code with test data:
with test_data as (
select *
from (values (1, '2024-02-05'::date, true, false, false, false, true, false, false, 2),--should return 2024-02-12 (count 9th and 12th)
(2, '2024-02-05'::date, true, true, true, true, true, true, false, 2),--should return 2024-02-07 (count, 6th and 7th)
(3, '2024-02-05'::date, false, false, false, false, false, true, false, 2))--should return 2024-02-17 (count 10th and 17th)
as test_data (id, original_expdate, mo_count, tu_count, we_count, th_count, fr_count, sa_count, su_count, no_rates)
),
test_data_with_weekdays as (
select *
, dayofweekiso(dateadd(day, 1, original_expdate)) as start_counting_day_of_week
, dateadd(week, no_rates, original_expdate) as latest_end_date
from test_data
)
select *
from test_data_with_weekdays
ID = 1
, only Mondays (MO_COUNT = TRUE
) and Fridays (FR_COUNT = TRUE
) should be counted, so that two counted days later should be the
12th of February.ID = 2
, as 2024-02-05 is a Monday, and TU_COUNT
and
WE_COUNT
are both true, both the 6th (Tuesday) and 7th (Wednesday) of
February should count, leading to END_DATE = 2024-02-07
.ID = 3
, only Saturdays count, so that two counted days later should be the 17th of
February (two Saturdays after ORIGINAL_EXPDATE
).I tried solutions with case_when
but they always add the entirety of NO_RATES
as soon as I hit the first counted weekday (for instance for ID = 3
, I would identify 10th February as first Saturday but then add the NO_RATES
in one go, getting to END_DATE = 2024-02-11
.
How could I get the intended result in Snowflake? Performance is not important, as long as it works.
There may be a simpler way, but this works.
Let's break it down.
First we generate a table of 365 rows which assumes we never need to to go past 365 calendar days to reach the limit of No Rates.
We cross join this table to the test data.
We generate a new column End Date by adding the index NDX from the generated rows to the Original ExpDate.
We get the Day Name of the End Date.
We get the corresponding Count for the Day Name and set the Counter to 1 if True or 0 if False.
We generate a running total over the Counter for each partition (row id in the test data) ordered by the End Date.
We take only rows where the running total = No Rates (qualify clause).
We take only the first row (where Counter = 1). If the next row has a Counter value of zero, the running total will still match No Rates. If the next row has a Counter value of 1, the running total will increment to the next value which will be greater than No Rates.
with test_data(id, original_expdate, mo_count, tu_count, we_count, th_count, fr_count, sa_count, su_count, no_rates) as (
select *
from values
(1, '2024-02-05'::date, true, false, false, false, true, false, false, 2),--should return 2024-02-12 (count 9th and 12th)
(2, '2024-02-05'::date, true, true, true, true, true, true, false, 2),--should return 2024-02-07 (count, 6th and 7th)
(3, '2024-02-05'::date, false, false, false, false, false, true, false, 2)--should return 2024-02-17 (count 10th and 17th)
),
t1(ndx) as
(select row_number() over(order by 0)
from table(generator(rowcount => 365))) -- assumes you never go past 365 days total
,t2 as(
select
test_data.id
, test_data.original_expdate
, DATEADD(day, t1.ndx, test_data.original_expdate) as END_DATE
, DAYNAME(END_DATE) as DNM
, no_rates
, CASE
WHEN DNM = 'Mon' THEN IFF(test_data.mo_count,1,0)
WHEN DNM = 'Tue' THEN IFF(test_data.tu_count,1,0)
WHEN DNM = 'Wed' THEN IFF(test_data.we_count,1,0)
WHEN DNM = 'Thu' THEN IFF(test_data.th_count,1,0)
WHEN DNM = 'Fri' THEN IFF(test_data.fr_count,1,0)
WHEN DNM = 'Sat' THEN IFF(test_data.sa_count,1,0)
WHEN DNM = 'Sun' THEN IFF(test_data.su_count,1,0)
END as CTR
from test_data
cross join t1
)
select
id
, original_expdate
, end_date
from t2
where ctr = 1
qualify sum(ctr) over (partition by id order by end_date) = no_rates
order by id;
ID | ORIGINAL_EXPDATE | END_DATE |
---|---|---|
1 | 2024-02-05 | 2024-02-12 |
2 | 2024-02-05 | 2024-02-07 |
3 | 2024-02-05 | 2024-02-17 |