Tags: azure, snowflake-cloud-data-platform, azure-data-factory, azure-data-lake-gen2

Copy Data from Azure Data Lake to Snowflake without a stage using Azure Data Factory


All the Azure Data Factory examples of copying data from Azure Data Lake Gen2 to Snowflake use a storage account as a stage. If the stage is not configured (as shown in the screenshot below), I get this error in Data Factory even when my source is a CSV file in Azure Data Lake: "Direct copying data to Snowflake is only supported when source dataset is DelimitedText, Parquet, JSON with Azure Blob Storage or Amazon S3 linked service, for other dataset or linked service, please enable staging".

[screenshot: Copy activity with staging not configured]

At the same time, the Snowflake documentation says that an external stage is optional. How can I copy data from Azure Data Lake to Snowflake using Data Factory's Copy Data activity without an external storage account as a stage? If staging storage is required to make it work, we shouldn't claim that copying data from Data Lake to Snowflake is supported; it works only when the Data Lake data is first copied to a storage blob and then to Snowflake.


Solution

  • Though Snowflake supports Blob Storage, Data Lake Storage Gen2, and General-purpose v1 & v2 storage accounts, Data Factory's direct copy into Snowflake is supported through Blob Storage only.

    The source linked service must be Azure Blob Storage with shared access signature (SAS) authentication. If you want to copy data directly from Azure Data Lake Storage Gen2 in one of the supported formats (DelimitedText, Parquet, or JSON), you can create an Azure Blob Storage linked service with SAS authentication against your ADLS Gen2 account, which avoids a staged copy to Snowflake.
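
    As a minimal sketch, such a linked-service definition could look like the JSON below; the account name and SAS token are placeholders, and the name BlobOverAdlsGen2 is purely illustrative. Note that the SAS URI uses the blob endpoint (blob.core.windows.net) of the ADLS Gen2 account, not the dfs endpoint:

        {
            "name": "BlobOverAdlsGen2",
            "properties": {
                "type": "AzureBlobStorage",
                "typeProperties": {
                    "sasUri": {
                        "type": "SecureString",
                        "value": "https://<account>.blob.core.windows.net/?<sas-token>"
                    }
                }
            }
        }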

    In the linked service, select Azure Blob Storage and provide the SAS URI details of the Azure Data Lake Gen2 source file.

    Blob Storage linked service with a Data Lake Gen2 file:

    [screenshots: linked service configuration steps]
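
    With the linked service in place, the Copy activity can write straight to a Snowflake sink with staging disabled. Below is a minimal pipeline sketch, assuming a DelimitedText source dataset named SourceCsv bound to the Blob linked service above and a Snowflake sink dataset named SnowflakeTable (both names are illustrative):

        {
            "name": "CopyAdlsToSnowflake",
            "type": "Copy",
            "inputs": [ { "referenceName": "SourceCsv", "type": "DatasetReference" } ],
            "outputs": [ { "referenceName": "SnowflakeTable", "type": "DatasetReference" } ],
            "typeProperties": {
                "source": {
                    "type": "DelimitedTextSource",
                    "storeSettings": { "type": "AzureBlobStorageReadSettings" },
                    "formatSettings": { "type": "DelimitedTextReadSettings" }
                },
                "sink": {
                    "type": "SnowflakeSink",
                    "importSettings": { "type": "SnowflakeImportCopyCommand" }
                },
                "enableStaging": false
            }
        }

    Under the hood, direct copy has the connector issue Snowflake's COPY INTO command against the blob location itself, which is why no separate staging account is needed for the supported formats.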