Tags: azure, azure-sql-database, azure-data-factory

Getting NULLs when exporting datetime column to SQL Server in Azure data flow


I am getting the latest file's last modified date and storing it in the variable RefDateTime.

I need to populate this value for every row of my source file, so I connected the source and added the following expression in a derived column transformation:

toString(toTimestamp($LastWriteTime, 'yyyy-MM-dd\'T\'HH:mm:ss\'Z\''), 'yyyy-MM-dd HH:mm:ss')
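As a rough Python analogue of what this expression attempts (assuming `$LastWriteTime` arrives as an ISO-8601 string such as `2023-05-18T07:11:07Z` — the sample value used later in the data flow parameter; this sketch mirrors the format strings, not any ADF API):

```python
from datetime import datetime

# Hypothetical sample value for $LastWriteTime (ISO-8601 with literal 'T' and 'Z')
last_write_time = "2023-05-18T07:11:07Z"

# Analogue of toTimestamp($LastWriteTime, "yyyy-MM-dd'T'HH:mm:ss'Z'"):
# parse the string, treating 'T' and 'Z' as literal characters
ts = datetime.strptime(last_write_time, "%Y-%m-%dT%H:%M:%SZ")

# Analogue of toString(..., 'yyyy-MM-dd HH:mm:ss')
formatted = ts.strftime("%Y-%m-%d %H:%M:%S")
print(formatted)  # 2023-05-18 07:11:07
```

If the round-trip works here, the format strings themselves are plausibly not the problem, which is consistent with the sink data preview looking correct in the answer below.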

My sink is SQL Server. After doing all the necessary transformations, I push the data to my SQL Server table, where this particular column uses the DATETIME data type.

But after the pipeline runs, I get all NULLs instead of the latest modified datetime. Am I doing anything wrong?

Solution

    • I have tried passing the last modified date of the file from a Get Metadata activity to a data flow. In the sink data preview, the data appears correctly.

    • But after the data flow runs, the sink table contains NULLs.

    • The reason for this behavior is not clear, but using the formatDateTime function before passing the string to the data flow (with minor changes) allowed the data to be inserted properly.
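    The key change, shown in isolation, is formatting the timestamp in the pipeline expression before it reaches the data flow parameter. Note the surrounding single quotes, which pass the formatted value as a string literal to the parameter:

        '@{formatDateTime(activity('Get Metadata1').output.lastModified, 'yyyy-MM-dd HH:mm:ss')}'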

    • The following is the pipeline JSON for the demonstration.

    {
        "name": "p1",
        "properties": {
            "activities": [
                {
                    "name": "Get Metadata1",
                    "type": "GetMetadata",
                    "dependsOn": [],
                    "policy": {
                        "timeout": "0.12:00:00",
                        "retry": 0,
                        "retryIntervalInSeconds": 30,
                        "secureOutput": false,
                        "secureInput": false
                    },
                    "userProperties": [],
                    "typeProperties": {
                        "dataset": {
                            "referenceName": "DelimitedText1",
                            "type": "DatasetReference"
                        },
                        "fieldList": [
                            "lastModified"
                        ],
                        "storeSettings": {
                            "type": "AzureBlobFSReadSettings",
                            "enablePartitionDiscovery": false
                        },
                        "formatSettings": {
                            "type": "DelimitedTextReadSettings"
                        }
                    }
                },
                {
                    "name": "Data flow1",
                    "type": "ExecuteDataFlow",
                    "dependsOn": [
                        {
                            "activity": "Get Metadata1",
                            "dependencyConditions": [
                                "Succeeded"
                            ]
                        }
                    ],
                    "policy": {
                        "timeout": "0.12:00:00",
                        "retry": 0,
                        "retryIntervalInSeconds": 30,
                        "secureOutput": false,
                        "secureInput": false
                    },
                    "userProperties": [],
                    "typeProperties": {
                        "dataflow": {
                            "referenceName": "dataflow1",
                            "type": "DataFlowReference",
                            "parameters": {
                                "dttt": {
                                    "value": "'@{formatDateTime(activity('Get Metadata1').output.lastModified, 'yyyy-MM-dd HH:mm:ss')}'",
                                    "type": "Expression"
                                }
                            }
                        },
                        "compute": {
                            "coreCount": 8,
                            "computeType": "General"
                        },
                        "traceLevel": "Fine"
                    }
                },
                {
                    "name": "Script1",
                    "type": "Script",
                    "dependsOn": [
                        {
                            "activity": "Data flow1",
                            "dependencyConditions": [
                                "Succeeded"
                            ]
                        }
                    ],
                    "policy": {
                        "timeout": "0.12:00:00",
                        "retry": 0,
                        "retryIntervalInSeconds": 30,
                        "secureOutput": false,
                        "secureInput": false
                    },
                    "userProperties": [],
                    "linkedServiceName": {
                        "referenceName": "AzureSqlDatabase1",
                        "type": "LinkedServiceReference"
                    },
                    "typeProperties": {
                        "scripts": [
                            {
                                "type": "Query",
                                "text": "select * from t1"
                            }
                        ],
                        "scriptBlockExecutionTimeout": "02:00:00"
                    }
                }
            ],
            "variables": {
                "tp": {
                    "type": "String"
                }
            },
            "annotations": []
        }
    }
    


    • The following is the dataflow JSON that I used.
    {
        "name": "dataflow1",
        "properties": {
            "type": "MappingDataFlow",
            "typeProperties": {
                "sources": [
                    {
                        "dataset": {
                            "referenceName": "DelimitedText1",
                            "type": "DatasetReference"
                        },
                        "name": "source1"
                    }
                ],
                "sinks": [
                    {
                        "dataset": {
                            "referenceName": "AzureSqlTable1",
                            "type": "DatasetReference"
                        },
                        "name": "sink1"
                    }
                ],
                "transformations": [
                    {
                        "name": "derivedColumn1"
                    },
                    {
                        "name": "cast1"
                    }
                ],
                "scriptLines": [
                    "parameters{",
                    "     dttt as string ('2023-05-18T07:11:07Z')",
                    "}",
                    "source(output(",
                    "          id as short",
                    "     ),",
                    "     allowSchemaDrift: true,",
                    "     validateSchema: false,",
                    "     ignoreNoFilesFound: false) ~> source1",
                    "source1 derive(dt = $dttt) ~> derivedColumn1",
                    "derivedColumn1 cast(output(",
                    "          dt as timestamp 'yyyy-MM-dd HH:mm:ss'",
                    "     ),",
                    "     errors: true) ~> cast1",
                    "cast1 sink(allowSchemaDrift: true,",
                    "     validateSchema: false,",
                    "     input(",
                    "          id as integer,",
                    "          animal as string",
                    "     ),",
                    "     deletable:false,",
                    "     insertable:true,",
                    "     updateable:false,",
                    "     upsertable:false,",
                    "     format: 'table',",
                    "     skipDuplicateMapInputs: true,",
                    "     skipDuplicateMapOutputs: true,",
                    "     errorHandlingOption: 'stopOnFirstError') ~> sink1"
                ]
            }
        }
    }
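    In the script lines above, the parameter is carried into the data as a string by the derive transformation and then converted by a cast transformation whose timestamp format matches the string produced by formatDateTime:

        source1 derive(dt = $dttt) ~> derivedColumn1
        derivedColumn1 cast(output(
             dt as timestamp 'yyyy-MM-dd HH:mm:ss'
             ),
             errors: true) ~> cast1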
    

    • With these changes, the data is inserted successfully into the sink table.