Search code examples
azure-sql-databaseazure-blob-storage

SQL Server managed instance try to access parquet files on blob storage


I have a SQL Server Managed Instance in Azure and have a lot of parquet files on my Blob Storage. I tried to create an external table linked to these files, but I can't. After followed a tutorial on how-to create an external table, I had problems creating a file format on SQL. I used the sintax:

CREATE EXTERNAL FILE FORMAT MyParquetFormat
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS(
    FIELD_TERMINATOR = '|',
    STRING_DELIMITER = '"',
    ENCODING = 'UTF8'
    )
);

When running this command received a msg:

Msg 102, Level 15, State 1, Line 1
Incorrect syntax near 'EXTERNAL'.

I tried to find some Azure documentation about the connection between SQL Server Managed Instance and Blob Storage. I would like to know if this is possible.


Solution

  • Version can be similar for both Azure SQL database and Azure SQL managed Instance.

    The result of SELECT SERVERPROPERTY('EngineEdition') is 5.

    This confirms that you use Azure SQL database and not a managed instance.

    enter image description here

    Image Reference: SERVERPROPERTY (Transact-SQL) | Microsoft Learn

    • External file format is not supported in Azure SQL database.
    • To copy data from blob storage to Azure SQL database, you use either bulk insert T-SQL command or Open rowset.

    Step:1 Create Database scoped credentials

    If Azure blob storage is not public, create database scoped credentials. Otherwise, skip to next step.

    CREATE DATABASE SCOPED CREDENTIAL <credential-name>
    WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
    SECRET = '<sas-value>';
    

    Step:2 Create External data source

    CREATE EXTERNAL DATA SOURCE <datasource-name>
    WITH ( TYPE = BLOB_STORAGE,
           LOCATION = 'https://<storage-account>.blob.core.windows.net',
           CREDENTIAL= <credential-name>);
    

    If your blob is public, you can remove credential option in the above query.

    Step:3

    BULK INSERT product FROM '<path>' WITH 
    ( DATA_SOURCE = '<datasource-name>');
    

    or

    SELECT *
    FROM OPENROWSET(BULK '<path>', DATA_SOURCE = '<datasource-name>'
    ) as products;
    

    enter image description here

    Reference: Loading files from Azure Blob storage into Azure SQL Database | Microsoft Azure