
dbt: parameterize the path of external tables based on the target environment


I'm using dbt + Databricks with external Delta tables, so I need to explicitly specify a location for the materialized='table' models.

I can't see anything in the docs about how to do this. Are there any established patterns?

Here's the relevant section from the dbt_project.yml:

models:
  +file_format: delta
  cdp:
    DataMart:
      +location_root: "abfss://<container>@<storage-account>.dfs.core.windows.net/DataMart/"
    MarketAnalytics:
      +location_root: "abfss://<container>@<storage-account>.dfs.core.windows.net/MarketAnalytics/"
      Staging:
        +location_root: "abfss://<container>@<storage-account>.dfs.core.windows.net/Staging/"
    Ref:
      +location_root: "abfss://<container>@<storage-account>.dfs.core.windows.net/Ref/"

I need to be able to pass a different prod path in the location_root when I run the build using the prod target. I don't want to have to manage multiple projects based on the environment.


Solution

  • This is a common pattern; you have two options for how to achieve this, but the best way is probably to override the macro that dbt uses to build the fully-qualified name/path of the database relation. For most adapters you would override the macro called generate_schema_name, but for Spark adapters, you need to override the macro called location_clause.

    See this answer for more details (there is an example target-aware location_clause macro at the end of it); a similar sketch is also included below, after the yml example.

    A simpler, but more repetitive, option is to use a little Jinja in your yml files. Note that you cannot call a macro from the Jinja inside yml files, but simple branching logic does work (you need to quote the whole value):

    models:
      +file_format: delta
      cdp:
        DataMart:
          +location_root: "abfss://<container>@<storage-account>.dfs.core.windows.net/DataMart/{{ '' if target.name == 'production' else 'dev/' }}"
        ...
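
    For reference, here is a minimal sketch of the macro-override approach, modelled on the location_clause macro that ships with dbt-spark (the exact name and signature can vary between adapter versions; newer dbt-databricks releases, for example, pass a relation argument). The 'production' target name and 'dev/' prefix are assumptions to adapt to your own layout. A macro with the same name in your project's macros/ directory takes precedence over the adapter's built-in one:

    -- macros/location_clause.sql
    {% macro location_clause() %}
      {#- Same logic as the adapter's built-in macro, plus a target-aware
          path prefix. 'production' and 'dev/' are assumed names. -#}
      {%- set location_root = config.get('location_root') -%}
      {%- set identifier = model['alias'] -%}
      {%- set env_prefix = '' if target.name == 'production' else 'dev/' -%}
      {%- if location_root is not none %}
        location '{{ location_root }}/{{ env_prefix }}{{ identifier }}'
      {%- endif %}
    {%- endmacro -%}

    With either approach, switching the path is then just a matter of picking the target at run time, e.g. dbt build --target production.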