Search code examples
amazon-redshiftdbt

Why am I getting a "column <x> of relation <y> does not exist" error when running a model in dbt but not when running in a SQL client?


Issue

We have an incremental model that's been running in our nightly production job for months (SQL below). Last week, after upgrading our production environment to v0.21.0 (from v0.19.0), the model began throwing this error:

Database Error in model my_model (path/to/my_model.sql)
  column "alias6" of relation "my_model" does not exist
  compiled SQL at target/run/path/to/my_model.sql
{{-
  config(
    materialized = 'incremental',
    dist = 'alias3',
    sort = 'dates_pk',
    unique_key = '__surrogate_key',
    )
-}}

with calculate_metrics as (

    select        
        field1 as dates_pk,
        field2 as alias2,
        {{ my_macro('field3') }} as alias3,
        field4 as alias4,
        field5,
        field6 as alias6,
        field7 as alias7,
        field8 as alias8,
        field9 as alias9,
        (field8::float / field6)::decimal(18, 6) as alias10,
        (field9::float / field7)::decimal(18, 6) as alias11,
        {{ dbt_utils.surrogate_key([
            'field1', 'field2', 'alias4', 'field5']) }} as __surrogate_key
    from {{ ref('upstream_model') }}
    {% if is_incremental() -%}
    where dates_pk >= coalesce((select max(dates_pk) from {{ this }}), '2000-01-01')
    {%- endif -%}

)

select * from calculate_metrics

What I've tried so far

  • prod env, v0.21.0. Since the error looked similar to what happens when the schema of an incremental model changes (which it didn't), I did dbt run -s my_model+ --full-refresh. Got the same error.
  • dev env, v0.21.0. Dropped my dev schemas, ran dbt clean && dbt deps, and alternately did dbt run and dbt build. Failed on the same model.
  • dev env, v0.21.0. Tried dbt build -s my_model --full-refresh; no luck.
  • I ran the compiled SQL from the target/ directory in Redshift from a SQL client. Did not get an error.
  • dev env, v0.19.0. Downgraded my dev env and ran the model lineage from scratch. Failed on the same model. This threw me; I really suspected the upgrade.
  • dev env, v0.19.0. Changed the model materialization to table and ran the model lineage from scratch in dev. Got the same error.
  • dev env, v0.19.0. Pared the code way back to see if I could do anything to get it to run. This, at least, threw a slightly different error.
{{-
  config(
    materialized = 'table',
    )
-}}

with calculate_metrics as (

    select        
        field1,
        field2,
        {{ my_macro('field3') }} as alias3,
        field4,
        field5,
        field6,
        field7,
        field8,
        field9
    from {{ ref('upstream_model') }}

)

select * from calculate_metrics

Database Error in model my_model (path/to/my_model.sql)
  column "dates_pk" of relation "my_model" does not exist
  compiled SQL at target/run/path/to/my_model.sql

So dbt seems to be running some db operation that's looking for aliases that were previously in the model, but aren't now. Not sure why that would be, with a table materialization.


Solution

  • Thanks to some sleuthing by the dbt support team, we discovered the cause of this issue.

    The YML docs file for the model contained a column name not found in the model (see sample YML below), and dbt's persist_docs feature had recently been enabled for the model.

    models:
      - name: my_model
        columns:
          ...
          # This should have been named alias6, but was not updated 
          # when the model changed at some point in the past; this
          # didn't cause an error until persist_docs attempted to
          # `comment` on the (non-existent) field in Redshift
          - name: field6
            description: Foo bar baz.
    

    A more specific error message would have helped get to the bottom of this quicker, so I logged this issue on the dbt-core Github.