Tags: microsoft-dynamics, azure-data-lake, azure-data-lake-gen2, azure-synapse-analytics

How to update a data lake table schema to match the data source using Synapse Analytics


So there's an Azure Synapse Link created to ingest data from MS Dynamics to Dataverse (data lake). However, the schema in Dataverse didn't get the primary keys of the data source, and some data types are inconsistent. Now I'm trying to figure out the best approach to syncing the schemas of the two, including the keys and the indexes if possible. Basically, I just need to add the keys to the lake database and update some data types.

Can I do this update directly in the lake database using Synapse Analytics (live mode)? Will the update affect the old data?

Please note that another person set this all up and I'm trying to fix it.

I am not sure if this can also be done using Power Apps. I currently do not have admin access to the database, and there is no dev environment yet for me to play with.


Solution

    • If you make changes directly in storage, the definitions of your lake databases will become out of sync.

    Know more about Lake database

    • To add keys to the data lake database and update data types, you can create a custom table in the lake database (see the Spark SQL sketch after these steps). (screenshot)

    Columns and data types: (screenshot)

    Establish relationships: (screenshot)
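
    If you'd rather script the table definition than use the database designer, a minimal sketch from a Synapse Spark notebook is shown below. The database, table, column, and storage names are hypothetical; also note that keys and relationships defined on lake database tables are descriptive metadata only and are not enforced as constraints by Spark.

    ```python
    # Minimal sketch (hypothetical database/table/column/storage names):
    # create a lake database table with explicit data types from a Synapse Spark pool.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    spark.sql("""
        CREATE TABLE IF NOT EXISTS my_lake_db.account (
            accountid STRING,        -- primary key column from the Dynamics source
            name STRING,
            revenue DECIMAL(19, 4),  -- corrected data type
            createdon TIMESTAMP
        )
        USING PARQUET
        LOCATION 'abfss://container@mystorageaccount.dfs.core.windows.net/my_lake_db/account'
    """)
    ```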

    Map Data:

    • Map Data lets you map the table you selected, using some data that is already in the lake.

    Know more about Map Data in Azure Synapse Analytics


    Once the file is loaded, you create a new mapping name, and it opens the designer for the column mapping.
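
    Conceptually, the column mapping the designer produces is a rename-and-cast step. A rough PySpark equivalent (hypothetical source path and column names) is sketched below; it can be handy for checking the mapping against a sample of the data before running the generated pipeline.

    ```python
    # Rough PySpark equivalent of the column mapping (hypothetical names):
    # rename columns to the target schema and cast them to the corrected types.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    spark = SparkSession.builder.getOrCreate()

    source_df = spark.read.parquet(
        "abfss://container@mystorageaccount.dfs.core.windows.net/raw/account"
    )

    mapped_df = source_df.select(
        col("Id").alias("accountid"),                           # rename to the target column name
        col("Name").alias("name"),
        col("Revenue").cast("decimal(19,4)").alias("revenue"),  # cast to the corrected type
        col("CreatedOn").cast("timestamp").alias("createdon"),
    )
    ```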

    Create Pipeline: (screenshot)

    Create pipeline creates a folder with two objects in it: a parent pipeline for the whole data mapping, and a child pipeline, kicked off by the parent, that runs the mapping data flow.

    In the data flow, the data is pushed to a sink in the lake database.
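
    Continuing the sketches above (mapped_df from the column-mapping example, my_lake_db.account from the table-creation example), the sink step is roughly equivalent to the following; the generated mapping data flow performs this write for you, so this is only an approximation for illustration.

    ```python
    # Append the mapped rows into the existing lake database table by column position.
    mapped_df.write.insertInto("my_lake_db.account", overwrite=False)
    ```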