
Azure Data Factory webhook execution times out instead of relaying errors


I've attempted to set up a simple webhook execution in Azure Data Factory (v2), calling a simple (parameter-less) webhook for an Azure Automation Runbook I set up.

From the Azure Portal, I can see that the webhook is being executed and my runbook is being run, so far so good. The runbook is (currently) returning an error within 1 minute of execution - but that's fine, I also want to test failure scenarios.

Problem: Data Factory doesn't seem to be 'seeing' the error result and spins until the timeout (10 minutes) elapses. When I kick off a debug run of the pipeline, I get the same - a timeout and no error result.

Update: I've fixed the runbook and it's now completing successfully, but Data Factory is still timing out and is not seeing the success response either.

Here is a screenshot of the setup:

[Screenshot: webhook activity setup in Data Factory]

And here is the portal confirming that the webhook is being run by azure data factory, and is completing in under a minute:

[Screenshot: Azure Portal showing the runbook job started by the webhook]

WEBHOOKDATA JSON is:

{"WebhookName":"Start CAMS VM","RequestBody":"{\r\n \"callBackUri\": \"https://dpeastus.svc.datafactory.azure.com/dataplane/workflow/callback/f7c...df2?callbackUrl=AAEAFF...0927&shouldReportToMonitoring=True&activityType=WebHook\"\r\n}","RequestHeader":{"Connection":"Keep-Alive","Expect":"100-continue","Host":"eab...ddc.webhook.eus2.azure-automation.net","x-ms-request-id":"7b4...2eb"}}

So as far as I can tell, everything should be in place to pick up the result (success or failure). Hopefully someone who's done this before knows what I'm missing.

Thanks!


Solution

  • I had assumed that Azure would automatically notify the ADF "callBackUri" with the result once the runbook completed or errored out (since they take care of 99% of the scaffolding without requiring a line of code).

    It turns out that is not the case: anyone wishing to execute a runbook from ADF has to manually extract the callBackUri from the WebhookData input parameter and POST the result to it when done.

    I haven't nailed this down yet, since the Microsoft tutorial sites I've found have a bad habit of taking screenshots of the code that does this rather than providing the code itself:

    [Screenshot: code shown as an image in a Microsoft tutorial]

    I guess I'll come back and edit this once I have it figured out.


    EDIT: I ended up implementing this by leaving my original webhook untouched and creating a "wrapper"/helper/utility runbook that executes an arbitrary webhook and relays its status to ADF once it's complete.

    Here's the full code I ended up with, in case it helps someone else. It's meant to be generic:

    Setup / Helper Functions

    param
    (
        [Parameter (Mandatory = $false)]
        [object] $WebhookData
    )
    
    Import-Module -Name AzureRM.resources
    Import-Module -Name AzureRM.automation
    
    # Helper function for getting the current running Automation Account Job
    # Inspired heavily by: https://github.com/azureautomation/runbooks/blob/master/Utility/ARM/Find-WhoAmI
    <#
        Queries the automation accounts in the subscription to find the automation account, runbook and resource group that the job is running in.
        AUTHOR: Azure/OMS Automation Team
    #>
    Function Find-WhoAmI {
        [CmdletBinding()]
        Param()
        Begin { Write-Verbose ("Entering {0}." -f $MyInvocation.MyCommand) }
        Process {
            # Authenticate
            $ServicePrincipalConnection = Get-AutomationConnection -Name "AzureRunAsConnection"
            Add-AzureRmAccount `
                -ServicePrincipal `
                -TenantId $ServicePrincipalConnection.TenantId `
                -ApplicationId $ServicePrincipalConnection.ApplicationId `
                -CertificateThumbprint $ServicePrincipalConnection.CertificateThumbprint | Write-Verbose
            Select-AzureRmSubscription -SubscriptionId $ServicePrincipalConnection.SubscriptionID | Write-Verbose 
            # Search all accessible automation accounts for the current job
            $AutomationResource = Get-AzureRmResource -ResourceType Microsoft.Automation/AutomationAccounts
            $SelfId = $PSPrivateMetadata.JobId.Guid
            foreach ($Automation in $AutomationResource) {
                $Job = Get-AzureRmAutomationJob -ResourceGroupName $Automation.ResourceGroupName -AutomationAccountName $Automation.Name -Id $SelfId -ErrorAction SilentlyContinue
                if (!([string]::IsNullOrEmpty($Job))) {
                    return $Job
                }
            }
            # Only reached when none of the accessible automation accounts contains the job
            Write-Error "Could not find the current running job with id $SelfId"
        }
        End { Write-Verbose ("Exiting {0}." -f $MyInvocation.MyCommand) }
    }
    
    Function Get-TimeStamp {    
        return "[{0:yyyy-MM-dd} {0:HH:mm:ss}]" -f (Get-Date)    
    }
    

    My Code

    
    ### EXPECTED USAGE ###
    # 1. Set up a webhook invocation in Azure data factory with a link to this Runbook's webhook
    # 2. In ADF - ensure the body contains { "WrappedWebhook": "<your url here>" }
    #    This should be the URL for another webhook.
    # LIMITATIONS:
    # - Currently, relaying parameters and authentication credentials is not supported,
    #    so the wrapped webhook should require no additional authentication or parameters.
    # - Currently, the callback to Azure data factory does not support authentication,
    #    so ensure ADF is configured to require no authentication for its callback URL (the default behaviour)
    
    # If ADF executed this runbook via Webhook, it should have provided a WebhookData with a request body.
    if (-Not $WebhookData) {
        Write-Error "Runbook was not invoked with WebhookData. Args were: $args"
        exit 0
    }
    if (-Not $WebhookData.RequestBody) {
        Write-Error "WebhookData did not contain a ""RequestBody"" property. Data was: $WebhookData"
        exit 0
    }
    $parameters = (ConvertFrom-Json -InputObject $WebhookData.RequestBody)
    # And this data should contain a JSON body containing a 'callBackUri' property.
    if (-Not $parameters.callBackUri) {
        Write-Error 'WebhookData was missing the expected "callBackUri" property (which Azure Data Factory should provide automatically)'
        exit 0
    }
    $callbackuri = $parameters.callBackUri
    
    # Check for the "WrappedWebhook" parameter (which should be set up by the user in ADF).
    # Property lookup is case-insensitive, so any casing of the name in the request body works.
    $WrappedWebhook = $parameters.WrappedWebhook
    if (-Not $WrappedWebhook) {
        $ErrorMessage = 'WebhookData was missing the expected "WrappedWebhook" property (which the user should have added to the body via ADF)'
        Write-Error $ErrorMessage
    }
    else
    {
        # Now invoke the actual runbook desired
        Write-Output "$(Get-TimeStamp) Invoking Webhook Request at: $WrappedWebhook"
        try {
            $OutputMessage = Invoke-WebRequest -Uri $WrappedWebhook -UseBasicParsing -Method POST
        } catch {
            $ErrorMessage = ("An error occurred while executing the wrapped webhook $WrappedWebhook - " + $_.Exception.Message)
            Write-Error -Exception $_.Exception
        }
    
        # Only parse the response and poll the job if the wrapped webhook call succeeded
        if (-Not $ErrorMessage) {
            # Output should be something like: {"JobIds":["<JobId>"]}
            Write-Output "$(Get-TimeStamp) Response: $($OutputMessage.Content)"
            $JobList = (ConvertFrom-Json -InputObject $OutputMessage.Content).JobIds
            $JobId = $JobList[0]
            $OutputMessage = "JobId: $JobId"
    
            # Get details about the currently running job, and assume the webhook job is being run in the same resource group/account
            $Self = Find-WhoAmI
            Write-Output "Current Job '$($Self.JobId)' is running in Group '$($Self.ResourceGroupName)' and Automation Account '$($Self.AutomationAccountName)'"
            Write-Output "Checking for Job '$($JobId)' in same Group and Automation Account..."
    
            # Monitor the job status, wait for completion.
            # Check against a list of statuses that likely indicate an in-progress job
            $InProgressStatuses = ('New', 'Queued', 'Activating', 'Starting', 'Running', 'Stopping')
            # (from https://learn.microsoft.com/en-us/powershell/module/az.automation/get-azautomationjob?view=azps-4.1.0&viewFallbackFrom=azps-3.7.0)
            do {
                # 1 second between polling attempts so we don't get throttled
                Start-Sleep -Seconds 1
                try {
                    # -ErrorAction Stop turns lookup failures into exceptions so the catch block sees them
                    $Job = Get-AzureRmAutomationJob -Id $JobId -ResourceGroupName $Self.ResourceGroupName -AutomationAccountName $Self.AutomationAccountName -ErrorAction Stop
                } catch {
                    $ErrorMessage = ("An error occurred polling the job $JobId for completion - " + $_.Exception.Message)
                    Write-Error -Exception $_.Exception
                    # Stop polling; otherwise a stale 'Running' status could keep the loop going forever
                    break
                }
                Write-Output "$(Get-TimeStamp) Polled job $JobId - current status: $($Job.Status)"
            } while ($InProgressStatuses.Contains($Job.Status))
        }
    
        if (-Not $ErrorMessage) {
            # Get the job outputs to relay to Azure Data Factory
            $Outputs = Get-AzureRmAutomationJobOutput -Id $JobId -Stream "Any" -ResourceGroupName $Self.ResourceGroupName -AutomationAccountName $Self.AutomationAccountName
            Write-Output "$(Get-TimeStamp) Outputs from job: $($Outputs | ConvertTo-Json -Compress)"
            $OutputMessage = $Outputs.Summary
            Write-Output "Summary output message: $($OutputMessage)"
        }
    }
    
    # Now for the entire purpose of this runbook - relay the response to the callback uri.
    # Prepare the success or error response as per specifications at https://learn.microsoft.com/en-us/azure/data-factory/control-flow-webhook-activity#additional-notes
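    if ($ErrorMessage) {
        $OutputBody = @{
            output     = @{ message = "$ErrorMessage" }
            statusCode = 500
            error      = @{
                ErrorCode = "Error"
                Message   = "$ErrorMessage"
            }
        }
    } else {
        $OutputBody = @{
            output     = @{ message = "$OutputMessage" }
            statusCode = 200
        }
    }
    # ConvertTo-Json escapes quotes, backslashes and newlines in the messages,
    # which would produce invalid JSON if the text were interpolated into a string by hand.
    $OutputJson = $OutputBody | ConvertTo-Json -Depth 5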
    Write-Output "Prepared ADF callback body: $OutputJson"
    # Post the response to the callback URL provided
    $callbackResponse = Invoke-WebRequest -Uri $callbackuri -UseBasicParsing -Method POST -ContentType "application/json" -Body $OutputJson
    
    Write-Output "Response was relayed to $callbackuri"
    Write-Output ("ADF replied with the response: " + ($callbackResponse | ConvertTo-Json -Compress))
    

    At a high level, the steps I've taken are to:

    1. Execute the "main" Webhook - get back a "Job Id"
    2. Get the current running job's "context" (resource group and automation account info) so that I can poll the remote job.
    3. Poll the job until it is complete
    4. Put together either a "success" or "error" response message in the format that Azure Data Factory expects.
    5. Invoke the ADF callback.
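
    If you control the runbook's own code, the wrapper is not strictly needed: the runbook can read callBackUri from WebhookData and POST its own result when it finishes. Below is a minimal sketch of that idea, not a tested production implementation. The function names Get-AdfCallbackUri, New-AdfCallbackBody and Invoke-WithAdfCallback are made up for illustration; the body uses the same output/statusCode/error shape as the wrapper above.

    ```powershell
    # Hypothetical helper: pull ADF's callback URL out of the WebhookData request body.
    function Get-AdfCallbackUri {
        param([object] $WebhookData)
        if (-not $WebhookData -or -not $WebhookData.RequestBody) { return $null }
        return (ConvertFrom-Json -InputObject $WebhookData.RequestBody).callBackUri
    }

    # Hypothetical helper: build the JSON body for the ADF callback.
    # A non-empty $ErrorMessage produces a failure body (statusCode 500 plus an "error" object).
    function New-AdfCallbackBody {
        param([string] $Message, [string] $ErrorMessage)
        $body = @{ output = @{ message = $Message }; statusCode = 200 }
        if ($ErrorMessage) {
            $body.output.message = $ErrorMessage
            $body.statusCode = 500
            $body.error = @{ ErrorCode = 'Error'; Message = $ErrorMessage }
        }
        return ($body | ConvertTo-Json -Depth 5)
    }

    # Hypothetical helper: run the actual work, then report the outcome to ADF.
    # Only terminating errors (throw, -ErrorAction Stop) are caught and reported as failures.
    function Invoke-WithAdfCallback {
        param([object] $WebhookData, [scriptblock] $Work)
        $callbackUri = Get-AdfCallbackUri -WebhookData $WebhookData
        $message = $null
        $failure = $null
        try {
            $message = (& $Work | Out-String).Trim()
        } catch {
            $failure = $_.Exception.Message
        }
        $json = New-AdfCallbackBody -Message $message -ErrorMessage $failure
        if ($callbackUri) {
            # No callback URL means the runbook was not started by ADF, so there is nothing to notify
            Invoke-WebRequest -Uri $callbackUri -UseBasicParsing -Method POST -ContentType 'application/json' -Body $json | Out-Null
        }
        return $json
    }
    ```

    In a runbook that declares the usual $WebhookData parameter, the existing logic would go inside the script block, e.g. `Invoke-WithAdfCallback -WebhookData $WebhookData -Work { <existing runbook logic> }`. Cmdlets that only write non-terminating errors need -ErrorAction Stop (or an explicit throw) for the failure to reach ADF.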