Search code examples
snowflake-cloud-data-platformdbt

Convert time PST/PDT to timestamp using Snowflake


Background: I have the below table of data where I'm trying to concat the order_date and transaction_time columns to create a final timestamp column

Problem: There is a PST/PDT string in the transaction_time column. I am trying to convert my final timestamp column(VARCHAR) into a UTC timestamp

My attempted solution that didn't work:

select
  transaction_date
  , to_date(transaction_date, 'mon dd, yyyy')    as order_date
  , transaction_time
  , concat(transaction_date, ' ', transaction_time)   as timestamp
--  , to_timestamp_tz(concat(transaction_date, ' ', transaction_time), 'mon dd, yyyy hh:mm:ss am pdt')   as final_timestamp
from raw_db.schema_name.table_name

Please help?? Thank you!!

enter image description here


Solution

  • So PST and PDT are not valid iana timezone's which is what is expected by the Timestamp Formats, so you cannot use the inbuilt functions to handle that, but you can work around it.

    SELECT time
        ,try_to_timestamp(time, 'YYYY-MM-DD HH12:MI:SS AM PDT') as pdt_time
        ,try_to_timestamp(time, 'YYYY-MM-DD HH12:MI:SS AM PST') as pst_time
        ,dateadd('hour',7, pdt_time) as pdt_as_utc_time
        ,dateadd('hour',8, pst_time) as pst_as_utc_time
        ,coalesce(pdt_as_utc_time, pst_as_utc_time) as utc_time1
        ,iff(substr(time, -3) = 'PDT', pdt_as_utc_time, pst_as_utc_time ) as utc_time2
    FROM VALUES
        ('2020-10-28 7:25:44 AM PDT'),
    -- insert more rows here...
        ('2020-11-06 6:35:18 PM PST') 
        v(time);
    

    shows two ways to get a unified UTC time from the two.

    which could be shortened to:

    SELECT time
        ,coalesce(dateadd('hour',7, try_to_timestamp(time, 'YYYY-MM-DD HH12:MI:SS AM PDT')), dateadd('hour',8, try_to_timestamp(time, 'YYYY-MM-DD HH12:MI:SS AM PST'))) as utc_time1
        ,iff(substr(time, -3) = 'PDT',dateadd('hour',7, try_to_timestamp(time, 'YYYY-MM-DD HH12:MI:SS AM PDT')), dateadd('hour',8, try_to_timestamp(time, 'YYYY-MM-DD HH12:MI:SS AM PST')) ) as utc_time2
    FROM VALUES
        ('2020-10-28 7:25:44 AM PDT'),
        ('2020-11-06 6:35:18 PM PST') 
        v(time);   
    

    which gives:

    TIME                        UTC_TIME1             UTC_TIME2
    2020-10-28 7:25:44 AM PDT   2020-10-28 14:25:44   2020-10-28 14:25:44
    2020-11-06 6:35:18 PM PST   2020-11-07 02:35:18   2020-11-07 02:35:18
    

    As Per my comment if you have more TIMEZONE you need to support, lets say New Zealand's two timeszones ;-) then a CASE would be more suitable

    SELECT time
        ,substr(time, -4) as tz_str -- longer and NZxT is longer
        ,CASE
           WHEN tz_str = ' PDT' THEN dateadd('hour',7, try_to_timestamp_ntz(time, 'YYYY-MM-DD HH12:MI:SS AM PDT'))
           WHEN tz_str = ' PST' THEN dateadd('hour',8, try_to_timestamp_ntz(time, 'YYYY-MM-DD HH12:MI:SS AM PST'))
           WHEN tz_str = 'NZDT' THEN dateadd('hour',-13, try_to_timestamp_ntz(time, 'YYYY-MM-DD HH12:MI:SS AM NZDT'))
           WHEN tz_str = 'NZST' THEN dateadd('hour',-12, try_to_timestamp_ntz(time, 'YYYY-MM-DD HH12:MI:SS AM NZST'))
       END as utc_time
    FROM VALUES
        ('2020-10-28 7:25:44 AM PDT'),
        ('2020-11-06 6:35:18 PM PST'),
        ('2021-04-23 2:45:44 PM NZST'),
        ('2021-01-23 2:45:44 PM NZDT')
        v(time);
    

    OR you could use a regex to match up to the AM/PM part of the date time like in this SO Question/Answer, and have just one try_to_timestamp_ntz and just use the CASE to correct based of the suffix.