Search code examples
snowflake-cloud-data-platformutctimezone-offset

Force UTC time in Snowflake JOIN condition


I have a setup where we have

  1. A landing table with a column of type TIMESTAMP_LTZ (consumption_date). Includes the timezone of +02:00
  2. A view (landing_view) that reads from the landing table
  3. A view (raw_data) that reads from a table that has a field of type TIMESTAMP_NTZ (SOURCE_TIMESTAMP), but the value itself is in UTC time.

I have to join the data from landing_view to the data from raw_data using the consumption_date and SOURCE_TIMESTAMP.

SELECT l.ID, l.consumption_date, l.RUN_TIME, r.DISPLAY_NAME, r.source_timestamp, r.value_as_double
FROM "raw_data" r
JOIN "landing_view" l
ON r.SOURCE_TIMESTAMP >= DATEADD(second,120, convert_timezone('UTC',l.consumption_date))
  and r.SOURCE_TIMESTAMP < DATEADD(second,1000, convert_timezone('UTC',l.consumption_date))

My problem is that the convert_timezone command does not seem to affect the join clause at all, insted the join is made using the local time included in the LTZ type (+02:00).

If I use the convert_timezone is a select, if works just fine, but for the JOIN it does not.

Is there a way I can tell snowflake to use UTC in the join?


Solution

  • This will depend on what your TIMEZONE setting is. See example below.

    If TIMEZONE is UTC:

    alter session set TIMEZONE = 'UTC';
    select 
        -- 2AM UTC
        '2021-01-02 02:00:00'::timestamp_ntz as SOURCE_TIMESTAMP,
        -- 1AM UTC / 12PM Australia/Melbourne time / 1 hour before SOURCE_TIMESTAMP
        '2021-01-02 12:00:00 +1100'::timestamp_ltz as CONSUMPTION_DATE,
        -- so add one hour to CONSUMPTION_DATE should equal to SOURCE_TIMESTAMP
        SOURCE_TIMESTAMP = DATEADD(hour, 1, convert_timezone('UTC', CONSUMPTION_DATE)) as is_equal
    ;
    
    +-------------------------------+-------------------------------+----------+
    | SOURCE_TIMESTAMP              | CONSUMPTION_DATE              | IS_EQUAL |
    |-------------------------------+-------------------------------+----------|
    | 2021-01-02 02:00:00.000000000 | 2021-01-02 01:00:00.000 +0000 | True     |
    +-------------------------------+-------------------------------+----------+
    

    However, if you change your TIMEZONE setting to another timezone, the result will be different:

    alter session set TIMEZONE = 'Australia/Melbourne';
    select 
        -- 2AM UTC
        '2021-01-02 02:00:00'::timestamp_ntz as SOURCE_TIMESTAMP,
        -- 1AM UTC / 12PM Australia/Melbourne time / 1 hour before SOURCE_TIMESTAMP
        '2021-01-02 12:00:00 +1100'::timestamp_ltz as CONSUMPTION_DATE,
        -- so add one hour to CONSUMPTION_DATE should equal to SOURCE_TIMESTAMP
        SOURCE_TIMESTAMP = DATEADD(hour, 1, convert_timezone('UTC', CONSUMPTION_DATE)) as is_equal
    ;
    
    +-------------------------------+-------------------------------+----------+
    | SOURCE_TIMESTAMP              | CONSUMPTION_DATE              | IS_EQUAL |
    |-------------------------------+-------------------------------+----------|
    | 2021-01-02 02:00:00.000000000 | 2021-01-02 12:00:00.000 +1100 | False    |
    +-------------------------------+-------------------------------+----------+
    

    Since you SOURCE_TIMESTAMP stores UTC value, you should change your TIMEZONE setting to match it.

    By the way, having CONVERT_TIMEZONE in the DATEADD is redundant, as it only adds extra operation, but not having any effects. See below example:

    select 
        -- 1AM UTC / 9AM Australia/Perth time / 1 hour before SOURCE_TIMESTAMP
        '2021-01-02 09:00:00 +0800'::timestamp_ltz as CONSUMPTION_DATE,
        DATEADD(hour, 1, CONSUMPTION_DATE) as no_convert_tz,
        DATEADD(hour, 1, convert_timezone('UTC', CONSUMPTION_DATE)) as convert_tz,
        no_convert_tz = convert_tz
    ;
    
    +-------------------------------+-------------------------------+-------------------------------+----------------------------+
    | CONSUMPTION_DATE              | NO_CONVERT_TZ                 | CONVERT_TZ                    | NO_CONVERT_TZ = CONVERT_TZ |
    |-------------------------------+-------------------------------+-------------------------------+----------------------------|
    | 2021-01-02 12:00:00.000 +1100 | 2021-01-02 13:00:00.000 +1100 | 2021-01-02 02:00:00.000 +0000 | True                       |
    +-------------------------------+-------------------------------+-------------------------------+----------------------------+
    

    You can see that the last column returns True.