Search code examples
dbtmeltano

Using dbt and Meltano, how can I prevent multiple dbt job runs from conflicting with each other?


When running dbt jobs in Meltano, dbt run jobs may collide with each other if run out of a triggered context - for instance, when an on-demand job collides with a scheduled job or a CI-based job.

If dbt run operates on the same tables at the same time, this generally causes a crash and sometimes a data quality issue if the same insert is performed twice on a single target table.

Any way to prevent run collisions, using either Meltano functionality or native dbt functionality?


Solution

  • One way is to generate a lock in your target database. Here's an example for MSSQL.

    I chose a on-run-start: hook. This hook attempts to grab a lock for dbt that lasts for the duration of the DB session.

    dbt_project.yml

    on-run-start: 
      - "{{ application_lock() }}" #Be sure only one dbt project runs at a time
    

    Macro: application_lock.sql

    {% macro application_lock() %}
        
    DECLARE 
        @result int,
        @msg varchar(max);
    
    EXEC @result = sp_getapplock @resource = 'dbt', @lockMode = 'Exclusive', @locktimeout = '1200000', @lockowner = 'Session';
    IF @result not in (0, 1)
        BEGIN
        SET @msg = 'Result from sp_getapplock:'+ CAST(@result as varchar(max))+'. Lock could not be taken after waiting 20 minutes. Another DBT process probably has the applock, try again later.';
        THROW 51000, @msg, 1
        END
    
    {% endmacro %}