Search code examples
dbt

DBT: conditionally set schema config


I'm trying to determine how I can conditionally set schema config attributes. I've attempted this by a macro in both dbt_project.yml and also in schema.yml but both of these methods fail with:

00:23:19  Encountered an error:
Compilation Error
  Could not render {{get_location_root('lndv')}}: 'get_location_root' is undefined

The outcome I would like to achieve is conditionally setting location_root for Spark for various schemas. I want different locations for each environment. I thought the macro path was the best fit as this follows a pattern but it obviously doesn't work in dbt_project.yml or property files. I was using target.name to determine environment. It's in the same directory as other macros that are successfully rendering in models so the path is set correctly. I don't really want to resort to placing this config in each model if I can avoid it.

Does anyone have any thoughts on how I can solve this? Either getting the macro to work in dbt_project.yml / schema.yml or by some other method?

Regards,

Ashley


Solution

  • dbt only allows a small subset of jinja in .yml files. In particular, you can't use macros. But you can use simple conditionals. Jinja that appears in .yml files must be quoted:

    schema: "{{ 'prod_schema' if target.name == 'production' else 'dev_schema' }}"
    

    Another option for you is to override the built-in macro that generates schema names. There is a great write-up in the dbt docs on this topic.

    From the docs:

    If your dbt project includes a macro that is also named generate_schema_name, dbt will always use the macro in your dbt project instead of the default macro.

    Therefore, to change the way dbt generates a schema name, you should add a macro named generate_schema_name to your project, where you can then define your own logic.

    There is even an alternative "non-default" version of this macro that ships with dbt, called generate_schema_name_for_env, with the logic:

    In prod:

    • If a custom schema is provided, a model's schema name should match the custom schema, rather than being concatenated to the target schema.
    • If no custom schema is provided, a model's schema name should match the target schema.

    In other environments (e.g. dev or qa):

    • Build all models in the target schema, as in, ignore custom schema configurations.

    To use generate_schema_name_for_env, you create a new macro in your project with the following contents:

    -- put this in macros/generate_schema_name.sql
    
    {% macro generate_schema_name(custom_schema_name, node) -%}
        {{ generate_schema_name_for_env(custom_schema_name, node) }}
    {%- endmacro %}
    

    EDIT: In Spark, you can use a similar trick to set the "location" of the materialized model by overriding the location_clause macro (which is part of the dbt-spark adapter). Your macro should template to a string with the word "location" followed by a path wrapped in single quotes:

    {% macro location_clause() %}
      {%- set location_root = config.get('location_root', validator=validation.any[basestring]) -%}
      {%- set identifier = model['alias'] -%}
      {%- if location_root is not none and target.name == "production" %}
        location '{{ location_root }}/prod/{{ identifier }}'
      {%- elif location_root is not none %}
        location '{{ location_root }}/dev/{{ identifier }}'
      {%- endif %}
    {%- endmacro -%}