Tags: azure, azure-monitoring, azure-vm-scale-set

Terminate Azure Agent of Scale Set after a period of time


I am currently trying to find a way to monitor agents in Azure that were scaled out by an Azure Virtual Machine Scale Set (VMSS). To avoid costs, I want to kill every agent that has been running a job for longer than 1 hour. Is there a solution from Azure that would help me in this case?

I was thinking about using alerts and Azure Functions, but I don't know which alert signal to use, since they all only consider metrics for the VMSS as a whole and not for a single agent.

Thank you


Solution

  • Terminate Azure Agent of Scale Set after a period of time

    You can use an Azure Automation runbook to check and kill the Agent Service every hour. To create a runbook and schedule the job, you can follow the steps in the linked Stack Overflow answer.

    Here is a PowerShell script that monitors the Azure Monitor Agent Service on each VMSS instance and terminates it if it has been running for longer than an hour.

        # Sign in with the managed identity (replace the value with your identity's client ID)
        az login --identity --username "ed65fffjfnjff fhc01dcf5"
        
        # Define the script that will run on each VMSS instance.
        # A single-quoted here-string keeps the $variables literal, so they are
        # evaluated on the instance and not in this runbook. The inner script uses
        # only single quotes to avoid quoting problems when it is passed to az.
        $scriptContent = @'
        # Get the Azure Monitor Agent process (first match if there are several)
        $process = Get-Process -Name AzureMonitorAgentService -ErrorAction SilentlyContinue | Select-Object -First 1
        
        # Check if the process exists
        if ($process) {
            # Get the process start time
            $processStartTime = $process.StartTime
            # Get the current time
            $currentTime = Get-Date
        
            # Calculate the elapsed time since the process started
            $elapsedTime = New-TimeSpan -Start $processStartTime -End $currentTime
        
            # Check if the elapsed time is greater than 1 hour
            if ($elapsedTime.TotalHours -gt 1) {
                # Stop the process
                Stop-Process -Name AzureMonitorAgentService -Force
        
                # Stop the service without prompting (run-command is non-interactive).
                # The service name is taken from the original script; adjust it if yours differs.
                Stop-Service -Name 'Azure Monitor Agent' -Force -Confirm:$false
        
                Write-Output 'Azure Monitor Agent Service has been stopped.'
            } else {
                Write-Output 'Azure Monitor Agent Service has not been running for more than 1 hour.'
                Write-Output ('Process Start Time: ' + $process.StartTime)
                Write-Output ('Current Time: ' + $currentTime)
                Write-Output ('Elapsed Time: ' + $elapsedTime)
            }
        } else {
            Write-Output 'Azure Monitor Agent process is not running.'
        }
        '@
        
        # Define the resource group and VMSS name
        $resourceGroup = "venkat-vmss_group"
        $vmssName = "venkat-vmss"
        
        # Get the list of instance IDs in the VMSS
        # (tsv output is one ID per line, so it must not be piped to ConvertFrom-Json)
        $instanceIds = az vmss list-instances --resource-group $resourceGroup --name $vmssName --query "[].instanceId" -o tsv
        
        # Loop through each instance ID and execute the script
        foreach ($instanceId in $instanceIds) {
            Write-Host "Running script on VMSS instance $instanceId ..."
            az vmss run-command invoke `
                --resource-group $resourceGroup `
                --name $vmssName `
                --instance-id $instanceId `
                --command-id RunPowerShellScript `
                --scripts "$scriptContent"
        }
        
        Write-Host "Script execution completed for all VMSS instances."
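
    The elapsed-time check at the heart of the script can also be isolated into a small function, which makes the one-hour threshold configurable and lets you try the logic without a VM. This is only a sketch: `Test-RuntimeExceeded` and its parameter names are hypothetical and not part of the original script or of any Azure module.

    ```powershell
    # Hypothetical helper: returns $true when something that started at $StartTime
    # has been running for longer than $MaxRuntime as of $Now.
    function Test-RuntimeExceeded {
        param(
            [Parameter(Mandatory)] [datetime] $StartTime,
            [datetime] $Now = (Get-Date),
            [timespan] $MaxRuntime = (New-TimeSpan -Hours 1)
        )
        # Subtracting two DateTime values yields a TimeSpan, which can be compared directly
        return (($Now - $StartTime) -gt $MaxRuntime)
    }

    # Usage with fixed timestamps so the result does not depend on the clock
    $start = [datetime]'2024-01-01T10:00:00'
    Test-RuntimeExceeded -StartTime $start -Now ($start.AddMinutes(90))   # True
    Test-RuntimeExceeded -StartTime $start -Now ($start.AddMinutes(30))   # False
    ```

    Inside the script that runs on the instance, the same function could be called as `Test-RuntimeExceeded -StartTime $process.StartTime`, relying on the defaults for the current time and the one-hour limit.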
    

    To schedule the Job and check the Agent service status hourly, you can follow the steps shown in the snapshot below.

    [Screenshot: schedule settings for the runbook]

    Output: [Screenshot: runbook job output]
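
    As an alternative to the portal steps, the hourly schedule could probably also be created from PowerShell with the Az.Automation cmdlets. The following is a minimal sketch, assuming the Az.Automation module is installed, you are already signed in with `Connect-AzAccount`, and the runbook has been created and published. Every name in it (resource group, Automation account, runbook, schedule) is a placeholder and does not come from the original answer.

    ```powershell
    # Placeholder names (assumptions): replace them with your own resources
    $resourceGroup     = "my-automation-rg"
    $automationAccount = "my-automation-account"
    $runbookName       = "Stop-LongRunningAgent"
    $scheduleName      = "Hourly-AgentCheck"

    # Create a schedule that recurs every hour.
    # The start time has to lie a few minutes in the future.
    New-AzAutomationSchedule `
        -ResourceGroupName $resourceGroup `
        -AutomationAccountName $automationAccount `
        -Name $scheduleName `
        -StartTime (Get-Date).AddMinutes(10) `
        -HourInterval 1

    # Link the existing runbook to that schedule
    Register-AzAutomationScheduledRunbook `
        -ResourceGroupName $resourceGroup `
        -AutomationAccountName $automationAccount `
        -RunbookName $runbookName `
        -ScheduleName $scheduleName
    ```

    Once the runbook is linked, each hourly run executes the script above against all instances of the scale set.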