Search code examples
azure-blob-storageazure-synapsedbtazure-data-lake-gen2

How to load parquet files from Azure Data Lake Gen2/Azure Blob Storage to Dedicated pool using dbt?


I'm using dbt-synapse: https://github.com/dbt-msft/dbt-synapse I want to load my parquet file from Azure Data Lake Gen2/Azure Blob Storage to the dedicated pool in Azure Synapse. I tried to use

{{ config(materialized='table') }}
with my_table as (
    SELECT
        TOP 100 *
    FROM
        OPENROWSET(
            BULK 'myparquetfile_url',
            FORMAT = 'PARQUET'
        ) AS [result]
)

select * from my_table

but no luck.

It shows this error

('42000', "[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Parse error at line: 7, column: 9: Incorrect syntax near 'OPENROWSET'. (103010) (SQLExecDirectW)")

So, how can I load parquets file from blob storage to dedicated pool using dbt?


Solution

  • Unfortunately OPENROWSET isn't supported in Azure Synapse Analytics.

    There is a GitHub repository which you can use to load external tables using dbt. This you can use as an alternative.

    https://github.com/dbt-labs/dbt-external-tables

    Supported database:

    • Redshift (Spectrum)
    • Snowflake
    • BigQuery
    • Spark
    • Synapse
    • Azure SQL