Tags: azure-synapse, azure-synapse-analytics, azure-synapse-pipeline

Dynamic mapping in Azure Synapse Copy Data Activity


We're currently facing an issue while ingesting data from SQL Server into Avro files. While most of the data types are correctly identified (e.g., strings as strings, integers as integers), our date fields are being mapped as strings instead of dates. For instance, a date like "2012-10-01 14:58:45.8830000" is being treated as a string.

To resolve this, we're exploring the option of dynamically mapping the schema. When we pull in tables individually, schema inference correctly identifies the date fields. However, since our ingestion process pulls in more than one table (it runs in a ForEach loop), we want to dynamically adjust the mapping to accommodate all 10 tables being pulled in.

We've attempted to set up the dynamic mapping as per our understanding, but it doesn't seem to be working as expected. Any insights or assistance on how to properly configure the dynamic schema mapping would be greatly appreciated. Thank you!

Our attempt at dynamic mapping

Example of correct mapping


Solution

  • Since you want to implement dynamic schema mapping, I have tried the approach below.

    Step 1: Create a mapping table [dbo].[tbl_mappings] with the following columns:

    sourceFile, sinkTableSchema, sinkTableName, jsonMapping
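
    As a sketch, the mapping table could be created with T-SQL along the following lines (the column types here are assumptions, not taken from the original post):

    CREATE TABLE [dbo].[tbl_mappings]
    (
        sourceFile      NVARCHAR(200) NOT NULL, -- source table or file name
        sinkTableSchema NVARCHAR(128) NOT NULL, -- e.g. 'dbo'
        sinkTableName   NVARCHAR(128) NOT NULL,
        jsonMapping     NVARCHAR(MAX) NOT NULL  -- TabularTranslator JSON for this source
    );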


    Below is an example of the JSON stored in the jsonMapping column:

    {
        "type": "TabularTranslator",
        "mappings": [
            {
                "source": {
                    "name": "empid",
                    "type": "String",
                    "physicalType": "String"
                },
                "sink": {
                    "name": "emp_id",
                    "type": "Int32",
                    "physicalType": "int"
                }
            },
            {
                "source": {
                    "name": "empname",
                    "type": "String",
                    "physicalType": "String"
                },
                "sink": {
                    "name": "emp_name",
                    "type": "String",
                    "physicalType": "varchar"
                }
            },
            {
                "source": {
                    "name": "gender",
                    "type": "String",
                    "physicalType": "String"
                },
                "sink": {
                    "name": "Gender",
                    "type": "String",
                    "physicalType": "varchar"
                }
            },
            {
                "source": {
                    "name": "depid",
                    "type": "String",
                    "physicalType": "String"
                },
                "sink": {
                    "name": "dep_id",
                    "type": "Int32",
                    "physicalType": "int"
                }
            }
        ]
    }
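
    Since the original issue concerns date columns, a mapping entry for a datetime field would look like the following (the column name hire_date is hypothetical). Declaring the type as DateTime is what tells the copy activity to treat the value as a date rather than a string:

    {
        "source": {
            "name": "hire_date",
            "type": "DateTime",
            "physicalType": "datetime2"
        },
        "sink": {
            "name": "hire_date",
            "type": "DateTime"
        }
    }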
    

    Step 2: Add a Lookup activity that reads [dbo].[tbl_mappings], followed by a ForEach activity that iterates over its output.

    Use the below expression in the ForEach activity's Items dynamic content:

    @activity('Lookup1').output.value

    On the Copy activity's Mapping tab inside the ForEach, add the following dynamic content:

    @item().jsonMapping
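
    For reference, the Lookup activity's output.value is an array with one object per row of [dbo].[tbl_mappings], so each iteration's @item() looks roughly like this (values are hypothetical):

    {
        "sourceFile": "employees",
        "sinkTableSchema": "dbo",
        "sinkTableName": "tbl_employees",
        "jsonMapping": "{\"type\":\"TabularTranslator\",\"mappings\":[ ... ]}"
    }

    Because jsonMapping is stored as text, the mapping dynamic content may need to convert it to an object with @json(item().jsonMapping) rather than passing it as a raw string.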
    

    As you mentioned, you are facing an issue where a date column from SQL Server is being treated as a string when ingesting into Avro files.

    To convert it, add a Derived Column transformation to a Data Flow in your pipeline.
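
    For illustration, assuming the string column is named order_date (a hypothetical name), a Derived Column expression along these lines converts it to a timestamp. Note that Data Flow's toTimestamp supports up to millisecond precision ('SSS'), so the seven fractional digits of a SQL Server datetime2 value may need to be truncated first:

    toTimestamp(substring(order_date, 1, 23), 'yyyy-MM-dd HH:mm:ss.SSS')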