
Script paths in an Azure Data Factory DataLakeAnalytics U-SQL pipeline


I'm trying to publish a Data Factory solution containing this ADF DataLakeAnalyticsU-SQL pipeline activity, following the Azure step-by-step doc (https://learn.microsoft.com/en-us/azure/data-factory/data-factory-usql-activity).

  {
    "type": "DataLakeAnalyticsU-SQL",
    "typeProperties": {
      "scriptPath": "\\scripts\\111_risk_index.usql",
      "scriptLinkedService": "PremiumAzureDataLakeStoreLinkedService",
      "degreeOfParallelism": 3,
      "priority": 100,
      "parameters": {
        "in": "/DF_INPUT/Consodata_Prelios_consegna_230617.txt",
        "out": "/DF_OUTPUT/111_Analytics.txt"
      }
    },
    "inputs": [
      {
        "name": "PremiumDataLakeStoreLocation"
      }
    ],
    "outputs": [
      {
        "name": "PremiumDataLakeStoreLocation"
      }
    ],

    "policy": {
      "timeout": "06:00:00",
      "concurrency": 1,
      "executionPriorityOrder": "NewestFirst",
      "retry": 1
    },
    "scheduler": {
      "frequency": "Minute",
      "interval": 15
    },
    "name": "ConsodataFilesProcessing",
    "linkedServiceName": "PremiumAzureDataLakeAnalyticsLinkedService"
  }

During publishing I got this error:

25/07/2017 18:51:59- Publishing Project 'Premium.DataFactory'....
25/07/2017 18:51:59- Validating 6 json files
25/07/2017 18:52:15- Publishing Project 'Premium.DataFactory' to Data Factory 'premium-df'
25/07/2017 18:52:15- Value cannot be null.
Parameter name: value

While trying to figure out what could be wrong with the project, I found that the issue lies in the activity's "typeProperties" shown above, specifically in the scriptPath and scriptLinkedService attributes. The doc says:

scriptPath: Path to folder that contains the U-SQL script. Name of the file is case-sensitive.
scriptLinkedService: Linked service that links the storage that contains the script to the data factory

If I publish the project without them (using a hard-coded script), it completes successfully. The problem is that I can't figure out what exactly to put into them. I tried several path combinations. The only thing I know is that the script file must be referenced locally in the solution as a dependency.


Solution

  • The script linked service needs to be Blob Storage, not Data Lake Storage.

    Ignore the publishing error; it's misleading.

    Add a linked service to an Azure Storage account to your solution and refer to it in the 'scriptLinkedService' attribute. Then, in the 'scriptPath' attribute, reference the blob container plus the path to the script.

    For example:

    "typeProperties": {
      "scriptPath": "datafactorysupportingfiles/CreateDimensions - Daily.usql",
      "scriptLinkedService": "BlobStore",
      "degreeOfParallelism": 2,
      "priority": 7
    },
    

    Hope this helps.

    P.S. Double-check the case of attribute names. A mismatch there can also throw unhelpful errors.
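
For reference, the Azure Storage linked service named in 'scriptLinkedService' might be defined roughly like this. It is only a sketch, assuming the ADF v1 "AzureStorage" linked service type; the name "BlobStore" is taken from the example above, and the account name and key are placeholders you would replace with your own.

```json
{
  "name": "BlobStore",
  "properties": {
    "type": "AzureStorage",
    "typeProperties": {
      "connectionString": "DefaultEndpointsProtocol=https;AccountName=<accountname>;AccountKey=<accountkey>"
    }
  }
}
```

With a linked service like this in place, 'scriptPath' is read relative to that storage account, so it starts with the blob container name (here "datafactorysupportingfiles") followed by the path to the .usql file.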