Tags: azure, etl, azure-synapse

Can SQL Server be used as a sink in Azure Synapse with schema drift?


I have a workload with a highly variable schema. I have successfully loaded it into ADLS in the CDM format and am able to access the data from my data flow and write back to a CSV.

When trying to write to a SQL database, I receive the following error:

[Screenshot: error message from the sink transformation]

I considered using an integration dataset, but that required me to hard-code the table name, which does not fit my use case.

Is using a SQL database as a sink supported in data flows? Or do I need to use a pipeline?


Solution

  • Using a SQL database as a sink is supported in data flows. The error message you received indicates that the sink transformation is unable to determine the schema of the target table. Since the target table uses Always Encrypted, also refer to this documentation: https://learn.microsoft.com/en-us/azure/data-factory/connector-azure-sql-database?tabs=data-factory#using-always-encrypted

    To resolve this issue, use the "Mapping" tab in the sink transformation to map the columns in your data flow to the columns in your target table. This allows the sink transformation to determine the schema of the target table and write the data to it. In my reproduction, the table EMP has column encryption enabled.

    [Screenshot: the EMP table in the SQL database]

    The column salary is encrypted.

    [Screenshot: the encrypted salary column]
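    For reference, this is roughly how such a table can be defined in T-SQL. The definition below is only a minimal sketch of what the EMP table might look like: the column names other than SALARY and the column encryption key name CEK_Auto1 are assumptions, and the key must already exist in the database.

    ```sql
    -- Minimal sketch of an EMP table whose SALARY column is protected by
    -- Always Encrypted (deterministic). Assumes a column encryption key
    -- named CEK_Auto1 already exists; the other names are illustrative.
    CREATE TABLE dbo.EMP
    (
        EmpID   INT           NOT NULL PRIMARY KEY,
        EmpName NVARCHAR(100) NOT NULL,
        Salary  DECIMAL(10, 2)
            ENCRYPTED WITH
            (
                COLUMN_ENCRYPTION_KEY = CEK_Auto1,
                ENCRYPTION_TYPE       = DETERMINISTIC,
                ALGORITHM             = 'AEAD_AES_256_CBC_HMAC_SHA_256'
            ) NOT NULL
    );
    ```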

    The following image shows the data flow with ADLS as the source and the SQL database as the sink:

    [Screenshot: data flow with ADLS source and SQL database sink]

    Check the settings on your source:

    [Screenshot: source settings]

    In the image above you can also see that the Allow schema drift check box is enabled. On the source side, check the mapping under the Projection tab:

    [Screenshot: source projection]

    Then check the sink settings:

    [Screenshot: sink settings]

    In the sink's Mapping tab you can disable Auto mapping and define the source-to-sink column mapping yourself:

    [Screenshot: sink mapping]

    Use the Data preview option to check whether the mapping is correct.

    Now create a new pipeline, add the data flow to it, and trigger or debug the pipeline to check the run results:

    [Screenshot: pipeline run results]

    When using the data flow with an encrypted column (in my case I encrypted the SALARY column as Deterministic), I noticed a data type mismatch when moving the data to the SQL database, so as per the requirement I changed the data type of the SALARY column.
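    If you are not sure which sink columns are encrypted or what their data types are, you can inspect the target table's metadata before running the data flow. This is a small sketch against the example EMP table above (the dbo.EMP name is an assumption):

    ```sql
    -- List column data types and Always Encrypted metadata on the sink table
    -- so data type mismatches can be spotted before the data flow runs.
    SELECT
        c.name                      AS column_name,
        t.name                      AS data_type,
        c.precision,
        c.scale,
        c.encryption_type_desc,     -- DETERMINISTIC, RANDOMIZED, or NULL
        c.encryption_algorithm_name
    FROM sys.columns AS c
    JOIN sys.types   AS t
        ON t.user_type_id = c.user_type_id
    WHERE c.object_id = OBJECT_ID(N'dbo.EMP');
    ```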

    If you still want to maintain the encrypted column(s), you can follow these steps (a T-SQL sketch of the table-creation step follows the list):

    • Decrypt the existing encrypted column values in the source database.

    • Export the decrypted column values from the source database.

    • Create the same table with the desired data type for the encrypted
      column in the target database.

    • Import the decrypted column values into the new table in the target
      database.

    • Encrypt the column values in the new table using the encryption
      function that you want to use in the target database.
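    As a sketch of the table-creation step above, the target table can be re-created with the desired data type for the encrypted column; DECIMAL(18, 2) below is only an illustrative choice, and the key name CEK_Auto1 is the same assumption as in the earlier sketch. The decrypt, export, import, and re-encrypt steps themselves must go through a client connection with Column Encryption Setting=Enabled, because Always Encrypted values are encrypted and decrypted by the client driver, not by T-SQL on the server.

    ```sql
    -- Sketch of the "create the same table with the desired data type" step.
    -- DECIMAL(18, 2) is only an example of a changed type for SALARY.
    CREATE TABLE dbo.EMP
    (
        EmpID   INT           NOT NULL PRIMARY KEY,
        EmpName NVARCHAR(100) NOT NULL,
        Salary  DECIMAL(18, 2)
            ENCRYPTED WITH (COLUMN_ENCRYPTION_KEY = CEK_Auto1,
                            ENCRYPTION_TYPE = DETERMINISTIC,
                            ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256') NOT NULL
    );
    ```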

    If you still have issues, you can try using a pipeline with a copy activity to write the data to the SQL database. Source:

    [Screenshot: copy activity source]

    Sink:

    [Screenshot: copy activity sink]

    Use the import schema option or create a new mapping.

    [Screenshot: copy activity mapping]

    Publish your changes and trigger or debug your pipeline to move the data to the SQL database.
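    As an optional check after the run, you can query the sink table from a connection that has Column Encryption Setting=Enabled (for example in SSMS), so the driver decrypts the encrypted column transparently. The table and column names below are the assumed ones from the earlier sketches.

    ```sql
    -- Quick sanity checks after the pipeline run. Requires a connection with
    -- Column Encryption Setting=Enabled to see Salary in plaintext; without
    -- it the column is returned as ciphertext (varbinary).
    SELECT TOP (10) EmpID, EmpName, Salary
    FROM dbo.EMP
    ORDER BY EmpID;

    SELECT COUNT(*) AS RowsLoaded
    FROM dbo.EMP;
    ```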