I'm interested in querying the pipeline consumption report that is available from the Data Factory monitor. Is there a table on Log Analytics or PowerShell cmdlet that would return this information? I checked the ADFv2 PowerShell module but couldn't find any. My goal is to aggregate the information available in this report to identify what are the most costly pipelines.
Thank you
Doing more research someone pointed me to a GitHub page where the product team posted a PowerShell script to find part of what I was looking for {1}. So I did some modifications to the script to have the output that I needed. With the output below I can extract the values from the MS calculator to get an estimated cost for each pipeline run. {2}
$startTime = "21/6/2021 7:00:00"
$endTime = "21/6/2021 10:00:00"
$adf = '<data factory name>'
$rg = '<resrouce group name>'
$outputObj = @()
$pipelineRuns = Get-AzDataFactoryV2PipelineRun -ResourceGroupName $rg -DataFactoryName $adf -LastUpdatedAfter $startTime -LastUpdatedBefore $endTime
# loop through all pipelines and child activities to return billable information
foreach ($pipelineRun in $pipelineRuns) {
$activtiyRuns = Get-AzDataFactoryV2ActivityRun -ResourceGroupName $rg -DataFactoryName $adf -pipelineRunId $pipelineRun.RunId -RunStartedAfter $startTime -RunStartedBefore $endTime
foreach ($activtiyRun in $activtiyRuns) {
if ($null -ne $activtiyRun.Output -and $null -ne $activtiyRun.Output.SelectToken("billingReference.billableDuration")) {
$obj = @()
$obj = $activtiyRun.Output.SelectToken("billingReference.billableDuration").ToString() | ConvertFrom-Json
$obj | Add-Member -MemberType NoteProperty -Name activityType -value $activtiyRun.Output.SelectToken("billingReference.activityType").ToString()
$obj | Add-Member -MemberType NoteProperty -Name pipelineName -value $pipelineRun.PipelineName
$obj | Add-Member -MemberType NoteProperty -Name activtiyRuns -value $activtiyRuns.Count
$outputObj += $obj
}
else {}
}
}
# output aggregated result set as table
$groupedObj = $outputObj | Group-Object -Property pipelineName, activityType, meterType
$groupedObj | ForEach-Object {
$value = $_.name -split ', '
New-Object psobject -Property @{
activityType = $value[1];
meterType = $value[2];
pipelineName = $value[0];
executionHours = [math]::Round(($_.Group | Measure-object -Property duration -sum).Sum, 4)
orchestrationActivityRuns = $groupedObj.group.activtiyRuns[0]
}
} | Sort-Object -Property meterType | Format-Table
Output sample:
Consumption report from the Data Factory monitor
reference: