Search code examples
dbt

dbt-expectations column_type timestamptz test improperly fails


Using a sql file

SELECT 
    holiday::text
    , observed::date
    , NOW()::timestamptz updated_at
FROM {{ ref('seed_holidays') }}

Upon dbt run -s stage (where stage is the model/schema) I get as expected a timestamptz column in the corresponding table (stg_holidays) with value like '2023-07-23 14:37:29.060 +1000', and inspecting the data_type via DBeaver confirms it's type timestamptz.

Next, running dbt test -s stage, with a spec in the model stg_schema.yml:

- name: stg_holidays
    description: "Victorian State Holidays"
    columns:
      - name: holiday
        tests:
          - not_null
          - dbt_expectations.expect_column_values_to_be_of_type:
              column_type: text
      - name: observed
        tests:
          - not_null
          - dbt_expectations.expect_column_values_to_be_of_type:
              column_type: date
      - name: updated_at
        tests:
          - not_null
          - dbt_expectations.expect_column_values_to_be_of_type:
              column_type: timestamptz

I get an error:

Failure in test
dbt_expectations_expect_column_values_to_be_of_type_stg_holidays_updated_at__timestamptz
(models/stage/stg_schema.yml)
Got 1 result, configured to fail if != 0

compiled Code at target/compiled/[proj_name]/models/stage/stg_schema.yml/dbt_expectations_expect_column_[hash].sql

Running the test query at the path given is pretty useless:

with relation_columns as (

        
        select
            cast('HOLIDAY' as TEXT) as relation_column,
            cast('TEXT' as TEXT) as relation_column_type
        union all
        
        select
            cast('OBSERVED' as TEXT) as relation_column,
            cast('DATE' as TEXT) as relation_column_type
        union all
        
        select
            cast('UPDATED_AT' as TEXT) as relation_column,
            cast('TIMESTAMP WITH TIME ZONE' as TEXT) as relation_column_type
        
        
    ),
    test_data as (

        select
            *
        from
            relation_columns
        where
            relation_column = 'UPDATED_AT'
            and
            relation_column_type not in ('TIMESTAMPTZ')

    )
    select *
    from test_data

Obviously this returns | relation_column | relation_column_type | | --------------- | -------------------- | | UPDATED_AT | TIMESTAMP WITH TIME ZONE |

The same thing happens with just timestamp type.

Why do these timestamp and timestamp columns fail dbt-expectations tests?


Solution

  • The answer only occurred to me while documenting for the question, but I thought I'd post question and answer to save others the irritation.

    for timestamptz (under postgresql) the dbt-expectations test has to be

      - name: stg_holidays
        columns:
          ...
          - name: updated_at
            tests:
              - not_null
              - dbt_expectations.expect_column_values_to_be_of_type:
                  column_type: timestamp with time zone
    

    or column_type: timestamp without time zone for timestamp.

    This hadn't jumped out at me in combing the dbt-expectations docs, but it does solve the problem.