I've attempted to set up a simple webhook execution in Azure Data Factory (v2), calling a simple (parameter-less) webhook for an Azure Automation Runbook I set up.
From the Azure Portal, I can see that the webhook is being executed and my runbook is being run, so far so good. The runbook is (currently) returning an error within 1 minute of execution - but that's fine, I also want to test failure scenarios.
Problem: Data Factory doesn't seem to be 'seeing' the error result and spins until the timeout (10 minutes) elapses. When I kick off a debug run of the pipeline, I get the same - a timeout and no error result.
Update: I've fixed the runbook and it's now completing successfully, but Data Factory is still timing out and is not seeing the success response either.
Here is a screenshot of the setup:
And here is the portal confirming that the webhook is being run by azure data factory, and is completing in under a minute:
WEBHOOKDATA JSON is:
{"WebhookName":"Start CAMS VM","RequestBody":"{\r\n \"callBackUri\": \"https://dpeastus.svc.datafactory.azure.com/dataplane/workflow/callback/f7c...df2?callbackUrl=AAEAFF...0927&shouldReportToMonitoring=True&activityType=WebHook\"\r\n}","RequestHeader":{"Connection":"Keep-Alive","Expect":"100-continue","Host":"eab...ddc.webhook.eus2.azure-automation.net","x-ms-request-id":"7b4...2eb"}}
So as far as I can tell, things should be in place to pick up on the result (success of failure). Hopefully someone who's done this before knows what I'm missing.
Thanks!
I had assumed that Azure would automatically notify the ADF "callBackUri" with the result once the runbook completed or errored out (since they take care of 99% of the scaffolding without requiring a line of code).
It turns out that is not the case, and anyone wishing to execute a runbook from ADF will have to manually extract the callBackUri from the Webhookdata input parameter, and POST the result to it when done.
I haven't nailed this down yet, since the Microsoft tutorial sites I've found have a bad habit of taking screenshots of the code that does this rather than providing the code itself:
I guess I'll come back and edit this once I have it figured out.
EDIT I ended up implementing this by leaving my original Webhook untouched, and creating a "wrapper"/helper/utility Runbook that will execute an arbitrary webhook, and relay its status to ADF once it's complete.
Here's the full code I ended up with, in case it helps someone else. It's meant to be generic:
Setup / Helper Functions
param
(
[Parameter (Mandatory = $false)]
[object] $WebhookData
)
Import-Module -Name AzureRM.resources
Import-Module -Name AzureRM.automation
# Helper function for getting the current running Automation Account Job
# Inspired heavily by: https://github.com/azureautomation/runbooks/blob/master/Utility/ARM/Find-WhoAmI
<#
Queries the automation accounts in the subscription to find the automation account, runbook and resource group that the job is running in.
AUTHOR: Azure/OMS Automation Team
#>
Function Find-WhoAmI {
[CmdletBinding()]
Param()
Begin { Write-Verbose ("Entering {0}." -f $MyInvocation.MyCommand) }
Process {
# Authenticate
$ServicePrincipalConnection = Get-AutomationConnection -Name "AzureRunAsConnection"
Add-AzureRmAccount `
-ServicePrincipal `
-TenantId $ServicePrincipalConnection.TenantId `
-ApplicationId $ServicePrincipalConnection.ApplicationId `
-CertificateThumbprint $ServicePrincipalConnection.CertificateThumbprint | Write-Verbose
Select-AzureRmSubscription -SubscriptionId $ServicePrincipalConnection.SubscriptionID | Write-Verbose
# Search all accessible automation accounts for the current job
$AutomationResource = Get-AzureRmResource -ResourceType Microsoft.Automation/AutomationAccounts
$SelfId = $PSPrivateMetadata.JobId.Guid
foreach ($Automation in $AutomationResource) {
$Job = Get-AzureRmAutomationJob -ResourceGroupName $Automation.ResourceGroupName -AutomationAccountName $Automation.Name -Id $SelfId -ErrorAction SilentlyContinue
if (!([string]::IsNullOrEmpty($Job))) {
return $Job
}
Write-Error "Could not find the current running job with id $SelfId"
}
}
End { Write-Verbose ("Exiting {0}." -f $MyInvocation.MyCommand) }
}
Function Get-TimeStamp {
return "[{0:yyyy-MM-dd} {0:HH:mm:ss}]" -f (Get-Date)
}
My Code
### EXPECTED USAGE ###
# 1. Set up a webhook invocation in Azure data factory with a link to this Runbook's webhook
# 2. In ADF - ensure the body contains { "WrappedWebhook": "<your url here>" }
# This should be the URL for another webhook.
# LIMITATIONS:
# - Currently, relaying parameters and authentication credentials is not supported,
# so the wrapped webhook should require no additional authentication or parameters.
# - Currently, the callback to Azure data factory does not support authentication,
# so ensure ADF is configured to require no authentication for its callback URL (the default behaviour)
# If ADF executed this runbook via Webhook, it should have provided a WebhookData with a request body.
if (-Not $WebhookData) {
Write-Error "Runbook was not invoked with WebhookData. Args were: $args"
exit 0
}
if (-Not $WebhookData.RequestBody) {
Write-Error "WebhookData did not contain a ""RequestBody"" property. Data was: $WebhookData"
exit 0
}
$parameters = (ConvertFrom-Json -InputObject $WebhookData.RequestBody)
# And this data should contain a JSON body containing a 'callBackUri' property.
if (-Not $parameters.callBackUri) {
Write-Error 'WebhookData was missing the expected "callBackUri" property (which Azure Data Factory should provide automatically)'
exit 0
}
$callbackuri = $parameters.callBackUri
# Check for the "WRAPPEDWEBHOOK" parameter (which should be set up by the user in ADF)
$WrappedWebhook = $parameters.WRAPPEDWEBHOOK
if (-Not $WrappedWebhook) {
$ErrorMessage = 'WebhookData was missing the expected "WRAPPEDWEBHOOK" peoperty (which the user should have added to the body via ADF)'
Write-Error $ErrorMessage
}
else
{
# Now invoke the actual runbook desired
Write-Output "$(Get-TimeStamp) Invoking Webhook Request at: $WrappedWebhook"
try {
$OutputMessage = Invoke-WebRequest -Uri $WrappedWebhook -UseBasicParsing -Method POST
} catch {
$ErrorMessage = ("An error occurred while executing the wrapped webhook $WrappedWebhook - " + $_.Exception.Message)
Write-Error -Exception $_.Exception
}
# Output should be something like: {"JobIds":["<JobId>"]}
Write-Output "$(Get-TimeStamp) Response: $OutputMessage"
$JobList = (ConvertFrom-Json -InputObject $OutputMessage).JobIds
$JobId = $JobList[0]
$OutputMessage = "JobId: $JobId"
# Get details about the currently running job, and assume the webhook job is being run in the same resourcegroup/account
$Self = Find-WhoAmI
Write-Output "Current Job '$($Self.JobId)' is running in Group '$($Self.ResourceGroupName)' and Automation Account '$($Self.AutomationAccountName)'"
Write-Output "Checking for Job '$($JobId)' in same Group and Automation Account..."
# Monitor the job status, wait for completion.
# Check against a list of statuses that likely indicate an in-progress job
$InProgressStatuses = ('New', 'Queued', 'Activating', 'Starting', 'Running', 'Stopping')
# (from https://learn.microsoft.com/en-us/powershell/module/az.automation/get-azautomationjob?view=azps-4.1.0&viewFallbackFrom=azps-3.7.0)
do {
# 1 second between polling attempts so we don't get throttled
Start-Sleep -Seconds 1
try {
$Job = Get-AzureRmAutomationJob -Id $JobId -ResourceGroupName $Self.ResourceGroupName -AutomationAccountName $Self.AutomationAccountName
} catch {
$ErrorMessage = ("An error occurred polling the job $JobId for completion - " + $_.Exception.Message)
Write-Error -Exception $_.Exception
}
Write-Output "$(Get-TimeStamp) Polled job $JobId - current status: $($Job.Status)"
} while ($InProgressStatuses.Contains($Job.Status))
# Get the job outputs to relay to Azure Data Factory
$Outputs = Get-AzureRmAutomationJobOutput -Id $JobId -Stream "Any" -ResourceGroupName $Self.ResourceGroupName -AutomationAccountName $Self.AutomationAccountName
Write-Output "$(Get-TimeStamp) Outputs from job: $($Outputs | ConvertTo-Json -Compress)"
$OutputMessage = $Outputs.Summary
Write-Output "Summary ouput message: $($OutputMessage)"
}
# Now for the entire purpose of this runbook - relay the response to the callback uri.
# Prepare the success or error response as per specifications at https://learn.microsoft.com/en-us/azure/data-factory/control-flow-webhook-activity#additional-notes
if ($ErrorMessage) {
$OutputJson = @"
{
"output": { "message": "$ErrorMessage" },
"statusCode": 500,
"error": {
"ErrorCode": "Error",
"Message": "$ErrorMessage"
}
}
"@
} else {
$OutputJson = @"
{
"output": { "message": "$OutputMessage" },
"statusCode": 200
}
"@
}
Write-Output "Prepared ADF callback body: $OutputJson"
# Post the response to the callback URL provided
$callbackResponse = Invoke-WebRequest -Uri $callbackuri -UseBasicParsing -Method POST -ContentType "application/json" -Body $OutputJson
Write-Output "Response was relayed to $callbackuri"
Write-Output ("ADF replied with the response: " + ($callbackResponse | ConvertTo-Json -Compress))
At a high-level, steps I've taken are to: