azure-data-lake, u-sql, azure-data-factory

U-SQL Activity is not running on Azure Data Factory


I am using Azure Data Factory to transfer on-premises data to Azure Data Lake Store. After copying the data, I run a U-SQL script on the uploaded file to convert it to a new .csv file. The U-SQL job runs fine if I run it from Visual Studio or directly from Data Lake Analytics.

But when I add it as an activity in Azure Data Factory, the U-SQL script activity fails immediately after the data is copied. I have tried many approaches but have been unable to resolve the issue. It gives me the following error.

[Screenshot of the error message; image not available]

The JSON definition of my U-SQL activity is:

{
    "name": "Transform Data",
    "description": "This will transform work space data.",
    "type": "DataLakeAnalyticsU-SQL",
    "dependsOn": [
        {
            "activity": "Copy_workplace_groups_info_2018_03_19_09_32_csv",
            "dependencyConditions": [
                "Completed"
            ]
        }
    ],
    "policy": {
        "timeout": "7.00:00:00",
        "retry": 0,
        "retryIntervalInSeconds": 30,
        "secureOutput": false
    },
    "typeProperties": {
        "scriptPath": "Scripts/Script.usql",
        "scriptLinkedService": {
            "referenceName": "Destination_DataLakeStore_lc0",
            "type": "LinkedServiceReference"
        }
    },
    "linkedServiceName": {
        "referenceName": "AzureDataLakeAnalyticsForDF",
        "type": "LinkedServiceReference"
    }
}

The JSON of the entire pipeline is:

{
    "name": "CopyPipeline_d26",
    "properties": {
        "activities": [
            {
                "name": "Copy_workplace_groups_info_2018_03_19_09_32_csv",
                "type": "Copy",
                "policy": {
                    "timeout": "7.00:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false
                },
                "typeProperties": {
                    "source": {
                        "type": "FileSystemSource",
                        "recursive": false
                    },
                    "sink": {
                        "type": "AzureDataLakeStoreSink",
                        "copyBehavior": "MergeFiles"
                    },
                    "enableStaging": false,
                    "cloudDataMovementUnits": 0,
                    "enableSkipIncompatibleRow": true
                },
                "inputs": [
                    {
                        "referenceName": "workplace_groups_info_2018_03_19_09_32_csv_i_lc0",
                        "type": "DatasetReference"
                    }
                ],
                "outputs": [
                    {
                        "referenceName": "workplace_groups_info_2018_03_19_09_32_csv_o_lc0",
                        "type": "DatasetReference"
                    }
                ]
            },
            {
                "name": "Transform Data",
                "description": "This will transform work space data.",
                "type": "DataLakeAnalyticsU-SQL",
                "dependsOn": [
                    {
                        "activity": "Copy_workplace_groups_info_2018_03_19_09_32_csv",
                        "dependencyConditions": [
                            "Completed"
                        ]
                    }
                ],
                "policy": {
                    "timeout": "7.00:00:00",
                    "retry": 0,
                    "retryIntervalInSeconds": 30,
                    "secureOutput": false
                },
                "typeProperties": {
                    "scriptPath": "Scripts/Script.usql",
                    "scriptLinkedService": {
                        "referenceName": "Destination_DataLakeStore_lc0",
                        "type": "LinkedServiceReference"
                    }
                },
                "linkedServiceName": {
                    "referenceName": "AzureDataLakeAnalyticsForDF",
                    "type": "LinkedServiceReference"
                }
            }
        ],
        "parameters": {
            "windowStart": {
                "type": "String"
            },
            "windowEnd": {
                "type": "String"
            }
        }
    }
}

Solution

  • I resolved the issue by creating the runtime with an Azure Active Directory application (service principal). I followed these steps:

    1. Create a Web App in Azure Active Directory.
    2. Grant that Web App permission on Azure Data Lake as well.
    3. Create a key in that app and note it down. It will never be shown again.
    4. Note the Application ID of that Web App.
    5. Open Azure Data Lake Analytics and assign the Contributor role to the Web App created in Active Directory.
    6. When creating the runtime, use the Application ID as the Service Principal ID and the key as the Service Principal Key.

    It works fine. :)
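
For reference, the steps above map roughly onto a Data Lake Analytics linked service definition like the one below. This is only a sketch, assuming the Data Factory v2 JSON schema as I understand it. The name `AzureDataLakeAnalyticsForDF` is taken from the pipeline in the question; every value in angle brackets is a placeholder you must fill in yourself, not something from the original post.

```json
{
    "name": "AzureDataLakeAnalyticsForDF",
    "properties": {
        "type": "AzureDataLakeAnalytics",
        "typeProperties": {
            "accountName": "<Data Lake Analytics account name>",
            "servicePrincipalId": "<Application ID of the Web App>",
            "servicePrincipalKey": {
                "type": "SecureString",
                "value": "<key created in the Web App>"
            },
            "tenant": "<Azure Active Directory tenant ID>",
            "subscriptionId": "<subscription ID of the Data Lake Analytics account>",
            "resourceGroupName": "<resource group of the Data Lake Analytics account>"
        }
    }
}
```

The `servicePrincipalId` and `servicePrincipalKey` fields are where the Application ID and key from steps 3, 4 and 6 would go. If the same application also needs to read the script from the Data Lake Store (step 2), the store's linked service would presumably carry the same two values.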