Search code examples
snowflake-cloud-data-platformdbt

Using multiple columns in a Unique_Key for incremental loading in DBT


For incremental models, the DBT documentation here says:

The unique_key should be supplied in your model definition as a string representing a simple column or a list of single quoted column names that can be used together, for example, ['col1', 'col2', …])

I've built an incremental model in DBT with this incremental definition

{{
  config(
    materialized='incremental',
    unique_key = ['Col1', 'Col2', 'Col3']
  )
}}

Which compiles into this merge statement in in Snowflake:

using DW_DEV.dbt_dgarrison_DATA_STAGING.MY_TABLE__dbt_tmp as DBT_INTERNAL_SOURCE
    on 
        DBT_INTERNAL_SOURCE.['Col1', 'Col2', 'Col3'] = DBT_INTERNAL_DEST.['Col1', 'Col2', 'Col3']
...

And this reasonably throws a SQL ERROR complaining about the brackets:

SQL compilation error: syntax error line 4 at position 32 unexpected '['. syntax error line 4 at position 45 unexpected ','. syntax error line 4 at position 98 unexpected '['. syntax error line 4 at position 111 unexpected ','.

I can't find any other good examples using multiple columns this way. (there are options involving concatenating columns, and I'm open to recommendations on the best approach to that, but I'm trying to figure out how to use the DBT recommended syntax)


Solution

  • As part of dbt-core 1.1.0, we can now pass a list to the unique_key statement in incremental models. See the original issue here.

    This means that you should be able to achieve your goal by updating dbt-core and your dbt-<adapter> version locally; or updating your dbt Cloud version accordingly, to 1.1.0, since given the error you get, it looks like unique_key is still looking for a single string instead of an array.