Search code examples
sqljinja2amazon-redshiftdbt

Current table name retrieval in dbt project file


I'm trying to get the model names in the post-hook action to be able to store the latest model run times as into the audit table. I'm adding the post hook query in the dbt_project.yml file since I need the hook to be triggered for all the models.

models:
  +bind: false
  data_warehouse:
        data_mart:
          +post_hook:
            - "INSERT INTO {{ target.schema }}__audit._model_run (model_name, run_time) SELECT '{{ this }}', GETDATE()"
          schema: 'data_mart'
          materialized: view

The problem is that {{ this }} does not work, when I build the models it returns nothing for {{ this }} block:

INSERT INTO dev__audit._model_run (model_name, run_time) SELECT , GETDATE()

Tried to insert {{ this }} without the quotes as well, still returns nothing. Any ideas what could be a solution here?


Solution

  • EDIT- Turns out this is simpler than originally thought. Just move your query into a macro. References to this will work as long as you're not putting it directly into dbt_project.yml.

    models:
      +bind: false
      data_warehouse:
            data_mart:
              +post_hook:
                - "{{ log_results() }}"
              schema: 'data_mart'
              materialized: view
    

    Macro:

    {% macro log_results() %}
          INSERT INTO {{ target.schema }}__audit._model_run SELECT '{{ this }}', GETDATE();
    {% endmacro %}
    

    Leaving original answer to give more options


    this is not available inside the dbt_project.yml context. You can only really reference target, DBT built ins, and environment variables. (Ref: https://docs.getdbt.com/reference/dbt-jinja-functions/dbt-project-yml-context)

    You have two options:

    1) Model Post Hook

    The post-hook within the model has access to this. You can simply move your insert statement to the config within the models themselves.

    {{
      config({
        "materialized": 'table',
        "post-hook": "INSERT INTO {{ target.schema }}__audit._model_run (model_name, run_time) SELECT '{{ this }}', GETDATE();"
      })
    }}
    

    Of course, this can be tedious and repetitive.

    2) Use on-run-end

    Instead of writing your result out at the end of each model, you can write out all results at the end of the job run. on-run-end has access to a results variable that you can loop through for each node in the job and get the job name, execution time, plus several other details. See: https://docs.getdbt.com/reference/dbt-jinja-functions/on-run-end-context#results

    Inside dbt_project.yml:

    on-run-end:
      - "{{ log_results(results, schemas) }}"
    

    Then create a macro to write the results:

    {% macro log_results(results, schemas) %}
      {% if execute %}
        {% set schema = schemas[0] %}
        {% for res in results %}
          INSERT INTO {{ schema }}__audit._model_run SELECT '{{ res.node.name }}', {{ res.execution_time }};
        {% endfor %}
      {% endif %}
    {% endmacro %}
    

    Full list of what is in the results object: https://docs.getdbt.com/reference/dbt-classes#result-objects

    And here is where you can find what is available under results.node: https://docs.getdbt.com/reference/artifacts/manifest-json#resource-details