Search code examples
dbt

DBT run model only once


I've created a model to generate a calendar dimension which I only want to run when I explicitly specify to run it.

I tried to use incremental materialisation with nothing in is_incremental() block hoping dbt would do nothing if there was no query to satisfy the temporary view. Unfortunately this didn't work.

Any suggestion or thoughts for how I might achieve this greatly appreciated.

Regards,

Ashley


Solution

  • I've used a tag for this. Let's call this kind of thing a "static" model. In your model:

    {{ config(tags=['static']) }}
    

    and then in your production job:

    dbt run --exclude tag:static
    

    This doesn't quite achieve what you want, since you have to add the selector at the command line. But it's simple and self-documenting, which is nice.

    I think you should be able to hack the incremental materialization to do this. dbt will complain about empty models, but you should be able to return a query with zero records. It'll depend on your RDBMS if this is really much better/faster/cheaper than just running the model, since dbt will still execute a query with the complex merge logic.

    {{ config(materialized='incremental') }}
    
    {% if is_incremental() %}
    select * from {{ this }} limit 0
    {% else %}
    -- your model here, e.g.
    {{ dbt_utils.date_spine( ... ) }}
    {% endif %}
    

    Your last/best option is probably to create a custom materialization that checks for an existing relation and no-ops if it finds one. You could borrow most of the code from the incremental materialization to do this. (You would add this as a macro in your project). Haven't tested this, but to give you an idea:

    -- macros/static_materialization.sql
    {% materialization static, default -%}
    
      -- relations
      {%- set existing_relation = load_cached_relation(this) -%}
      {%- set target_relation = this.incorporate(type='table') -%}
      {%- set temp_relation = make_temp_relation(target_relation)-%}
      {%- set intermediate_relation = make_intermediate_relation(target_relation)-%}
      {%- set backup_relation_type = 'table' if existing_relation is none else existing_relation.type -%}
      {%- set backup_relation = make_backup_relation(target_relation, backup_relation_type) -%}
    
      -- configs
      {%- set unique_key = config.get('unique_key') -%}
      {%- set full_refresh_mode = (should_full_refresh()  or existing_relation.is_view) -%}
      {%- set on_schema_change = incremental_validate_on_schema_change(config.get('on_schema_change'), default='ignore') -%}
    
      -- the temp_ and backup_ relations should not already exist in the database; get_relation
      -- will return None in that case. Otherwise, we get a relation that we can drop
      -- later, before we try to use this name for the current operation. This has to happen before
      -- BEGIN, in a separate transaction
      {%- set preexisting_intermediate_relation = load_cached_relation(intermediate_relation)-%}
      {%- set preexisting_backup_relation = load_cached_relation(backup_relation) -%}
       -- grab current tables grants config for comparision later on
      {% set grant_config = config.get('grants') %}
      {{ drop_relation_if_exists(preexisting_intermediate_relation) }}
      {{ drop_relation_if_exists(preexisting_backup_relation) }}
    
      {{ run_hooks(pre_hooks, inside_transaction=False) }}
    
      -- `BEGIN` happens here:
      {{ run_hooks(pre_hooks, inside_transaction=True) }}
    
      {% set to_drop = [] %}
    
      {% if existing_relation is none %}
          {% set build_sql = get_create_table_as_sql(False, target_relation, sql) %}
      {% elif full_refresh_mode %}
          {% set build_sql = get_create_table_as_sql(False, intermediate_relation, sql) %}
          {% set need_swap = true %}
      {% else %}
        {# ----- only changed the code between these comments ----- #}
        {# NO-OP. An incremental materialization would do a merge here #}
        {% set build_sql = "select 1" %}
        {# ----- only changed the code between these comments ----- #}
    
      {% endif %}
    
      {% call statement("main") %}
          {{ build_sql }}
      {% endcall %}
    
      {% if need_swap %}
          {% do adapter.rename_relation(target_relation, backup_relation) %}
          {% do adapter.rename_relation(intermediate_relation, target_relation) %}
          {% do to_drop.append(backup_relation) %}
      {% endif %}
    
      {% set should_revoke = should_revoke(existing_relation, full_refresh_mode) %}
      {% do apply_grants(target_relation, grant_config, should_revoke=should_revoke) %}
    
      {% do persist_docs(target_relation, model) %}
    
      {% if existing_relation is none or existing_relation.is_view or should_full_refresh() %}
        {% do create_indexes(target_relation) %}
      {% endif %}
    
      {{ run_hooks(post_hooks, inside_transaction=True) }}
    
      -- `COMMIT` happens here
      {% do adapter.commit() %}
    
      {% for rel in to_drop %}
          {% do adapter.drop_relation(rel) %}
      {% endfor %}
    
      {{ run_hooks(post_hooks, inside_transaction=False) }}
    
      {{ return({'relations': [target_relation]}) }}
    
    {%- endmaterialization %}