Search code examples
sqlarraysdatetimesnowflake-cloud-data-platformdate-range

Create an array in Snowflake


I want to create an array constructed from two timestamps. After that, use this as a range to find the difference between the first element and the second element.

The goal is to find the time difference between timestamps within business hours (9 am to 17 pm).

What should be the right approach here? Can I create arrays and then find the time difference?

e.g. in the first record, since started_at is after 17pm, the first element of array must be "2021-05-19 09:00:00".

Example data:

WITH t1 AS (
SELECT 'A' AS id, '2021-05-18 18:30:00'::timestamp AS started_at, '2021-05-19 09:30:00'::timestamp AS ended_at UNION ALL
SELECT 'B' AS id, '2021-05-19 15:30:00'::timestamp AS started_at, '2021-05-20 13:00:00'::timestamp AS ended_at
    )
SELECT *
FROM t1

Expected result: enter image description here


Solution

  • The idea is to calculate if we need to extend range, generate rows using lateral flatten and calculate timestamps using case statements.

    Demo:

    WITH t1 AS (
    SELECT 'A' AS id, '2021-05-18 18:30:00'::timestamp AS started_at, '2021-05-19 09:30:00'::timestamp AS ended_at UNION ALL
    SELECT 'B' AS id, '2021-05-19 15:30:00'::timestamp AS started_at, '2021-05-20 13:00:00'::timestamp AS ended_at
        )
        
     select id, started_at, ended_at,
            array_construct(
             case when extend_before=1 and v.index=0 then started_at 
                  when extend_before=1  then (dateadd(day,1,started_at::date)::string||' 09:00:00')::timestamp
                  when extend_before=0 and started_at > (started_at::date::string||' 17:00:00')::timestamp then (dateadd(day,1,started_at::date)::string||' 09:00:00')::timestamp
                 else started_at
             end, --as calculated_started_at,
             case when extend_before=1 and v.index=0 then  (started_at::date::string||' 17:00:00')::timestamp 
                 else ended_at
             end --as calculated_ended_at
           ) as date_range,
           datediff(minute,date_range[0],date_range[1]) minutes_diff
     from
    (
    SELECT t1.*, case when started_at >= (started_at::date::string||' 17:00:00')::timestamp then 0 else 1 end as extend_before
    FROM t1
    ) as s, lateral flatten(input=> split(space(s.extend_before),' '), outer => true) v
    

    Result:

    ID  STARTED_AT              ENDED_AT                 DATE_RANGE                                             MINUTES_DIFF
    A   2021-05-18 18:30:00.000 2021-05-19 09:30:00.000 ["2021-05-19 09:00:00.000", "2021-05-19 09:30:00.000"]  30
    B   2021-05-19 15:30:00.000 2021-05-20 13:00:00.000 ["2021-05-19 15:30:00.000", "2021-05-19 17:00:00.000"]  90
    B   2021-05-19 15:30:00.000 2021-05-20 13:00:00.000 ["2021-05-20 09:00:00.000", "2021-05-20 13:00:00.000"]  240