Azure Data Factory pipeline fails when run by trigger


I have an ADF pipeline that copies JSON files from blob storage to an on-premises file server. The pipeline is triggered by BlobCreated events. The blobs that fire the trigger are text files, each containing a list of the files the pipeline should copy.

It works fine when I start the pipeline by pressing "Debug" and provide the text file info by hand: [screenshot]

But if I start it from "Trigger" -> "Trigger now", or by uploading the txt file to blob storage to test a real event, it fails with the error:

The property 'userid' in the payload cannot be null or empty.

This is how I start it via "Trigger now": [screenshot]

I thought maybe my changes weren't published (because of that warning), but I ran our DevOps pipeline and it still didn't work afterwards. I suspect the issue is somewhere in there, but I have no idea where, or how to start debugging it. I would try the Publish button, but it's disabled; I may have to ask someone with the right permissions to enable it for testing.

Error in pipeline runs:

[screenshot]

Where is my configuration error? I'm so confused.

Linked Service JSON:

{
    "name": "LS_DFS_REDACTED",
    "properties": {
        "description": "some useful text here.",
        "annotations": [
            "KIPA Integration"
        ],
        "type": "FileServer",
        "typeProperties": {
            "host": "\\\\ad.redacted.com\\foo\\bar\\dev\\",
            "userId": "ASDF\\usr-dev-rw",
            "password": {
                "type": "AzureKeyVaultSecret",
                "store": {
                    "referenceName": "LS_kv",
                    "type": "LinkedServiceReference"
                },
                "secretName": "usr-dev-rw"
            }
        },
        "connectVia": {
            "referenceName": "IR-Integration-DataFactory-df",
            "type": "IntegrationRuntimeReference"
        }
    }
}

Trigger JSON:

{
    "name": "Redacted - Processed filelist created",
    "properties": {
        "description": "Triggered when a /kipa/adf/*_processedJsonFiles.txt blob is created, which contains list of files for pipeline to copy.",
        "annotations": [
            "KIPA Integration"
        ],
        "runtimeState": "Started",
        "pipelines": [
            {
                "pipelineReference": {
                    "referenceName": "STA-DFS_CopyProcessedJson",
                    "type": "PipelineReference"
                },
                "parameters": {
                    "triggerFolderPath": "@trigger().outputs.body.folderPath",
                    "triggerFileName": "@trigger().outputs.body.fileName"
                }
            }
        ],
        "type": "BlobEventsTrigger",
        "typeProperties": {
            "blobPathBeginsWith": "/kipa/blobs/adf",
            "blobPathEndsWith": "_processedJsonFiles.txt",
            "ignoreEmptyBlobs": true,
            "scope": "/subscriptions/{redacted}/resourceGroups/{redacted}/providers/Microsoft.Storage/storageAccounts/{redacted}",
            "events": [
                "Microsoft.Storage.BlobCreated"
            ]
        }
    }
}
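To sanity-check which blob paths this trigger would react to, the path filters can be approximated as below. This is a minimal sketch, not ADF's actual implementation: it assumes the event subject has the form `/blobServices/default/containers/<container>/blobs/<path>` and that `blobPathBeginsWith` is compared against the `/<container>/blobs/<path>` part. The function name `trigger_matches` is hypothetical.

```python
# Prefix that precedes the container name in a blob event subject (assumed form).
SUBJECT_PREFIX = "/blobServices/default/containers"


def trigger_matches(subject: str, begins_with: str, ends_with: str) -> bool:
    """Approximate the BlobEventsTrigger path filters for one event subject."""
    if not subject.startswith(SUBJECT_PREFIX + "/"):
        return False
    # Reduce the subject to the "/<container>/blobs/<path>" form used by the trigger.
    path = subject[len(SUBJECT_PREFIX):]
    return path.startswith(begins_with) and path.endswith(ends_with)


subject = (
    "/blobServices/default/containers/kipa/blobs/adf/"
    "20241218105953_processedJsonFiles.txt"
)
print(trigger_matches(subject, "/kipa/blobs/adf", "_processedJsonFiles.txt"))  # prints True
```

With the filter values from the trigger JSON above, the example blob matches, which is consistent with the trigger firing and the failure happening later, inside the pipeline run.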

Example input to the copy activity (identical for failed and successful runs, so the issue is presumably not in the input):

{
    "source": {
        "type": "BinarySource",
        "storeSettings": {
            "type": "AzureBlobStorageReadSettings",
            "fileListPath": "kipa/adf/20241218105953_processedJsonFiles.txt",
            "deleteFilesAfterCompletion": false
        },
        "formatSettings": {
            "type": "BinaryReadSettings"
        }
    },
    "sink": {
        "type": "BinarySink",
        "storeSettings": {
            "type": "FileServerWriteSettings",
            "copyBehavior": "PreserveHierarchy"
        }
    },
    "enableStaging": false,
    "skipErrorFile": {
        "dataInconsistency": true
    },
    "validateDataConsistency": true,
    "logSettings": {
        "enableCopyActivityLog": true,
        "copyActivityLogSettings": {
            "logLevel": "Warning",
            "enableReliableLogging": false
        },
        "logLocationSettings": {
            "linkedServiceName": {
                "referenceName": "LS_REDACTED_STA",
                "type": "LinkedServiceReference"
            },
            "path": "kipa/adf/STA-DFS_CopyProcessedJson"
        }
    }
}

UPDATE:

I switched from the Key Vault reference to a plain password, and now the error has changed to:

ErrorCode=AzureBlobCredentialMissing,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Please provide either connectionString or sasUri or serviceEndpoint to connect to Blob.,Source=Microsoft.DataTransfer.ClientLibrary,'

Progress! I seem to have provided all the overrides correctly in my DevOps pipeline but I will investigate further...
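Both error messages complain about a linked-service property that is empty in the deployed factory, so one way to narrow this down is to inspect the deployed linked-service definitions rather than the authoring copy. The sketch below assumes the deployed definition is available as JSON (for example, exported from the generated ARM template). The required-property lists are inferred only from the two error messages above, `AzureBlobStorage` is assumed to be the type name of the blob linked service, and `missing_properties` is a hypothetical helper.

```python
# Properties that must all be present, per linked service type (inferred from the errors).
REQUIRED_ALL = {"FileServer": ["host", "userId"]}
# Properties of which at least one must be present (inferred from AzureBlobCredentialMissing).
REQUIRED_ANY = {"AzureBlobStorage": ["connectionString", "sasUri", "serviceEndpoint"]}


def is_blank(value) -> bool:
    """True for None and for empty or whitespace-only strings."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def missing_properties(linked_service: dict) -> list:
    """List typeProperties that are empty in a linked service definition."""
    props = linked_service.get("properties", {})
    ls_type = props.get("type")
    type_props = props.get("typeProperties", {})
    problems = [k for k in REQUIRED_ALL.get(ls_type, []) if is_blank(type_props.get(k))]
    any_of = REQUIRED_ANY.get(ls_type, [])
    if any_of and all(is_blank(type_props.get(k)) for k in any_of):
        problems.append(" or ".join(any_of))
    return problems


# Hypothetical deployed definition in which the userId override ended up empty.
deployed = {
    "name": "LS_DFS_REDACTED",
    "properties": {
        "type": "FileServer",
        "typeProperties": {"host": "\\\\ad.redacted.com\\foo\\bar\\dev\\", "userId": ""},
    },
}
print(missing_properties(deployed))  # prints ['userId']
```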


Solution

  • The problem was indeed in the CI/CD (Azure DevOps) pipeline. I don't know exactly what it was, but after I edited the YAML to match the official docs and triple-checked my parameter overrides, it works. It could have been a silent error somewhere.
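
The exact cause was never identified, so the following is only a hedged sketch of how this class of silent override problem could be caught before deployment. It assumes the effective parameter values can be written out in the standard ARM parameter-file shape (`{"parameters": {name: {"value": ...}}}`). The parameter names in the example are illustrative, and `empty_parameters` is a hypothetical helper. Parameters supplied as a Key Vault `reference` are treated as populated.

```python
def empty_parameters(parameters_doc: dict) -> list:
    """Return names of ARM parameters whose value is missing, null, or blank."""
    empty = []
    for name, entry in parameters_doc.get("parameters", {}).items():
        if isinstance(entry, dict) and "reference" in entry:
            continue  # Key Vault reference: resolved at deployment time, not blank here
        value = entry.get("value") if isinstance(entry, dict) else None
        if value is None or (isinstance(value, str) and not value.strip()):
            empty.append(name)
    return sorted(empty)


# Illustrative parameter document; names and values are assumptions.
params = {
    "parameters": {
        "factoryName": {"value": "my-factory"},
        "LS_DFS_REDACTED_properties_typeProperties_userId": {"value": ""},
        "LS_DFS_REDACTED_password": {
            "reference": {"keyVault": {"id": "<vault-id>"}, "secretName": "usr-dev-rw"}
        },
    }
}
print(empty_parameters(params))
# prints ['LS_DFS_REDACTED_properties_typeProperties_userId']
```

Running a check like this as a step in the release pipeline would turn a blank override into a visible failure instead of a runtime error in the triggered pipeline.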