Search code examples
sqlsnowflake-cloud-data-platform

Snowflake: Adding number of days to start date dynamically only counting particular weekdays


In Snowflake, I have a table with the following columns:

Column Description
ORIGINAL_EXPDATE A date for the original expiry date.
MO_COUNT true/false whether to count Mondays.
TU_COUNT true/false whether to count Tuesdays.
WE_COUNT true/false whether to count Wednesdays.
TH_COUNT true/false whether to count Thursdays.
FR_COUNT true/false whether to count Fridays.
SA_COUNT true/false whether to count Saturdays.
SU_COUNT true/false whether to count Sundays.
NO_RATES No. of counted days to add.

I want to calculate a column END_DATE where I do the following:

To ORIGINAL_EXPDATE, I want to add days according to the NO_RATES, but I only want to consider days that correspond to *_COUNT columns that are TRUE.

For instance ORIGINAL_EXPDATE = 2024-02-05 (a Monday), NO_RATES = 2, MO_COUNT = TRUE, FR_COUNT = TRUE, all other days FALSE. Here, Friday, 9 February should count (because FR_COUNT = TRUE) and Monday, 12 February should count (as MO_COUNT = TRUE). After adding two days that actually count to ORIGINAL_EXPDATE, END_DATE should therefore be 2024-02-12.

Example code with test data:

  with test_data as (
select *
  from (values (1, '2024-02-05'::date, true, false, false, false, true, false, false, 2),--should return 2024-02-12 (count 9th and 12th)
               (2, '2024-02-05'::date, true, true, true, true, true, true, false, 2),--should return 2024-02-07 (count, 6th and 7th)
               (3, '2024-02-05'::date, false, false, false, false, false, true, false, 2))--should return 2024-02-17 (count 10th and 17th)
    as test_data (id, original_expdate, mo_count, tu_count, we_count, th_count, fr_count, sa_count, su_count, no_rates)
),

       test_data_with_weekdays as (
select *
     , dayofweekiso(dateadd(day, 1, original_expdate)) as start_counting_day_of_week
     , dateadd(week, no_rates, original_expdate) as latest_end_date
  from test_data
)

select *
  from test_data_with_weekdays
  • For ID = 1, only Mondays (MO_COUNT = TRUE) and Fridays (FR_COUNT = TRUE) should be counted, so that two counted days later should be the 12th of February.
  • For ID = 2, as 2024-02-05 is a Monday, and TU_COUNT and WE_COUNT are both true, both the 6th (Tuesday) and 7th (Wednesday) of February should count, leading to END_DATE = 2024-02-07.
  • For ID = 3, only Saturdays count, so that two counted days later should be the 17th of February (two Saturdays after ORIGINAL_EXPDATE).

I tried solutions with case_when but they always add the entirety of NO_RATES as soon as I hit the first counted weekday (for instance for ID = 3, I would identify 10th February as first Saturday but then add the NO_RATES in one go, getting to END_DATE = 2024-02-11.

How could I get the intended result in Snowflake? Performance is not important, as long as it works.


Solution

  • There may be a simpler way, but this works.

    Let's break it down.

    1. First we generate a table of 365 rows which assumes we never need to to go past 365 calendar days to reach the limit of No Rates.

    2. We cross join this table to the test data.

    3. We generate a new column End Date by adding the index NDX from the generated rows to the Original ExpDate.

    4. We get the Day Name of the End Date.

    5. We get the corresponding Count for the Day Name and set the Counter to 1 if True or 0 if False.

    6. We generate a running total over the Counter for each partition (row id in the test data) ordered by the End Date.

    7. We take only rows where the running total = No Rates (qualify clause).

    8. We take only the first row (where Counter = 1). If the next row has a Counter value of zero, the running total will still match No Rates. If the next row has a Counter value of 1, the running total will increment to the next value which will be greater than No Rates.

    with test_data(id, original_expdate, mo_count, tu_count, we_count, th_count, fr_count, sa_count, su_count, no_rates) as (
        select *
        from values
            (1, '2024-02-05'::date, true, false, false, false, true, false, false, 2),--should return 2024-02-12 (count 9th and 12th)
            (2, '2024-02-05'::date, true, true, true, true, true, true, false, 2),--should return 2024-02-07 (count, 6th and 7th)
            (3, '2024-02-05'::date, false, false, false, false, false, true, false, 2)--should return 2024-02-17 (count 10th and 17th)
    ),
    t1(ndx) as
      (select row_number() over(order by 0)
       from table(generator(rowcount => 365))) -- assumes you never go past 365 days total
    ,t2 as(
        select
            test_data.id
            , test_data.original_expdate
            , DATEADD(day, t1.ndx, test_data.original_expdate) as END_DATE
            , DAYNAME(END_DATE) as DNM
            , no_rates
            , CASE
                WHEN DNM = 'Mon' THEN IFF(test_data.mo_count,1,0)
                WHEN DNM = 'Tue' THEN IFF(test_data.tu_count,1,0)
                WHEN DNM = 'Wed' THEN IFF(test_data.we_count,1,0)
                WHEN DNM = 'Thu' THEN IFF(test_data.th_count,1,0)
                WHEN DNM = 'Fri' THEN IFF(test_data.fr_count,1,0)
                WHEN DNM = 'Sat' THEN IFF(test_data.sa_count,1,0)
                WHEN DNM = 'Sun' THEN IFF(test_data.su_count,1,0)
            END as CTR
        from test_data
        cross join t1
       )
    select
        id
        , original_expdate
        , end_date
    from t2
    where ctr = 1
    qualify sum(ctr) over (partition by id order by end_date) = no_rates
    order by id;
    
    ID ORIGINAL_EXPDATE END_DATE
    1 2024-02-05 2024-02-12
    2 2024-02-05 2024-02-07
    3 2024-02-05 2024-02-17