Search code examples
amazon-web-servicesamazon-redshiftdbt

dbt redshift unload_table macro database issue


Tring to use the dbt-redshift unload_table macro. I am getting 22:51:02 Database Error in model unload_test (models\unload_test.sql) 22:51:02 zero-length delimited identifier at or near """" 22:51:02 LINE 1: /* {"app": "dbt", "dbt_version": "1.0.0", "profile_name": "d...

unload_test.sql follows:

{{
  config({
    "materialized":"table",
    "post-hook": [
      "{{ redshift.unload_table(test_ods,
                                user,
                                s3_path='s3:/cj-pilot-test',
                                iam_role='default',
                                header='True',
                                delimiter=',') }}"
    ]
  })
}}
select *
from test_ods.user

Any help or hints on how to debug or resolve this issue would be appreciated.

Running with dbt=1.0.0 and packages

packages:
  - package: dbt-labs/dbt_utils
    version: [">=0.8.0", "<0.9.0"]
    
  - package: dbt-labs/redshift
    version: 0.6.0

Solution

  • I've not used the unload_table macro before, but a glance, it looks like your arguments may not be properly formed.

    The first two arguments for the schema and table should be passed as string literals. So for your example, the test_ods schema should be passed with single quotes: 'test_ods'.

    A generic example:

    {{ redshift.unload_table('some_schema_name',
                                    'some_table_name',
                                    s3_path='s3://some-bucket-name',
                                    iam_role='arn_for_some_role',
                                    header=True,
                                    delimiter=',') }}
    

    The code inside of the double curly braces is interpreted as Python code, if test_ods is passed without the quotes, it will be interpreted as a reference to a python variable, which doesn't exist.

    Alternatively, if you want to pass the schema and table of your current model as arguments, you can reference those by passing this.schema, or this.table (without the quotes). this is an in-scope python object that contains attributes about the target model currently being executed.

    {{ redshift.unload_table(this.schema,
                                    this.table,
                                    s3_path='s3://some-bucket-name',
                                    iam_role='arn_for_some_role',
                                    header=True,
                                    delimiter=',') }}
    

    Some of your other arguments may also need some tweaking. For example the header argument is being passed as a string literal and technically should be a boolean True instead. Redshift may coerce this to a boolean behind the scenes, but just something to look at. Also, your S3 bucket address might need to have two forward slashes instead of one: s3://

    Here's a link to documentation that describes what data types the unload_table macro is expecting: https://github.com/dbt-labs/redshift/tree/0.2.3/#unload_table-source

    A good way to debug these type of issues is to look at the SQL that's being executed. It's usually more obvious which part of your code might be causing an issue when you see how it's being rendered. The logs are generally the best place to see the executed SQL. Here's a helpful page on where to look: https://docs.getdbt.com/faqs/checking-logs

    Hope this helps, and good luck resolving your issue.