Search code examples
dbt

How do you write incremental models in DBT?


I have two large tables I'm trying to join & filter using dbt.

The SQL is very simple, roughly:

SELECT
   u.user_id, t.transaction_id
FROM users u
JOIN transactions t ON t.user_id = u.user_id
WHERE u.active = 1

Currently I'm using the "table" materialization, but this is fairly wasteful, as the tables underlying tables are 99.99% the same from run to run.

However, I don't understand from the DBT documentation how I could set this model to "incremental".

Any ideas?

PS. I'm running on SQL Server.


Solution

  • As @anders-swanson wrote in his comment, if transaction_id is definitely unique, you could set it as the unique_key and materialize your model as an incremental table.

    dbt's docs explain how to do this. Using your example, it might be:

    {{
        config(
            materialized='incremental',
            unique_key='transaction_id'
        )
    }}
    
    select
       u.user_id, t.transaction_id
    from users u
    join transactions t ON t.user_id = u.user_id
    where u.active = 1
    

    If transaction_id is not unique but transaction_id||user_id is, you could try creating a new column which concatenates these columns in an upstream dbt model, and then assigning that as the unique_key:

    {{
        config(
            materialized='incremental',
            unique_key='pkey'
        )
    }}
    
    select
       u.user_id, 
       t.transaction_id,
       u.user_id||t.transaction_id as pkey
    from users u
    join transactions t ON t.user_id = u.user_id
    where u.active = 1
    

    Otherwise, you'll have to pull in a column that is either a) unique, or b) has an ordered quality that could be used to apply an is_incremental() filter (like @viacheslav-nefedov wrote).