snowflake-cloud-data-platform, dbt

dbt external tables not being executed


I have been trying to define an external table to be used in dbt with the dbt_external_tables package. I did not get an error when I executed the command

dbt run-operation stage_external_sources --args "select: external_s3_stage.tbl_s3_stage"

I just got the message "No external sources selected" on each run. I suspect my issue is related to the location of my yml file.

Is there another way to specify which file defines the external table passed in the parameter, or is there any other configuration I should add to my dbt_project.yml?

packages.yml

packages:
  - package: dbt-labs/dbt_external_tables
    version: 0.8.7

models/staging/ext_stage_source.yml

version: 2
sources:
  - name: external_s3_stage
    database: test_conn
    schema: public
    loader: S3
    tables:
      - name: tbl_s3_stage
        ext_full_refresh: true
        description: This is the first external table created as stage from dbt.
        external:
        location: "s3://bucket/data/"  
        file_format: >
          (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '"' SKIP_HEADER = 1 TRIM_SPACE = TRUE ESCAPE_UNENCLOSED_FIELD = NONE FIELD_DELIMITER = ',')
        columns:
          - name: id
            data_type: varchar 
          - name: name
            data_type: varchar 

My log message looks like this:

06:40:18  Running with dbt=1.6.3
06:40:18  Registered adapter: snowflake=1.6.3
06:40:18  Found 2 models, 4 tests, 1 source, 0 exposures, 0 metrics, 419 macros, 0 groups, 0 semantic models
06:40:18  No external sources selected

I also wanted to mention that I am able to execute dbt debug and dbt run with no issues. I am using dbt-core for this task.


Solution

    Now it works. I reviewed the already answered questions in the dbt Slack group, and I saw that my yml file required an extra level of indentation: the location and file_format keys must be nested under external:, not placed at the same level.
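
    Concretely, the indentation change looks like this (fragments taken from my file above):

    # Broken: location sits at the same level as external,
    # so external itself is empty and the table is skipped
    external:
    location: "s3://bucket/data/"

    # Fixed: the key is nested one level under external
    external:
      location: "s3://bucket/data/"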

    Also, based on the official docs, I learned that you need a stage already created and working in Snowflake, which means both the storage integration and the stage itself. To create them, please follow the official docs.
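
    As a minimal sketch of that Snowflake-side setup (the integration name, role ARN, and bucket are placeholders, not the real values from my project):

    -- Storage integration that lets Snowflake assume an AWS role to read the bucket
    CREATE STORAGE INTEGRATION my_s3_integration
      TYPE = EXTERNAL_STAGE
      STORAGE_PROVIDER = 'S3'
      ENABLED = TRUE
      STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/my-dbt-role'
      STORAGE_ALLOWED_LOCATIONS = ('s3://bucket/data/');

    -- Stage pointing at the bucket, referenced from the yml below as @MY_S3_STAGE1
    CREATE STAGE MY_S3_STAGE1
      STORAGE_INTEGRATION = my_s3_integration
      URL = 's3://bucket/data/'
      FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);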

    And finally, to execute it I used the following command: dbt run-operation stage_external_sources

    version: 2
    sources:
      - name: external_s3
        database: T_CONNECTION
        schema: T_SCHEMA
        loader: S3
        tables:
          - name: my_s3_stage01
            ext_full_refresh: true
            description: This is the first external table created as stage from dbt.
            columns:
              - name: id
                data_type: varchar 
              - name: email
                data_type: varchar 
            external:
              location: "@MY_S3_STAGE1"  
              file_format: '( type = csv skip_header = 1 )'
              
    

    Results

    [screenshot of the successful run output]

    P.S. In case you execute a full refresh and it is not refreshing the data, just SKIPPING the tables, you can add the following var to your dbt command: dbt run-operation stage_external_sources --vars "ext_full_refresh: true"