azure-devops timeout

Why is my Azure DevOps Migration timing out after several hours?


I have a long-running migration (don't ask) that is run by an Azure DevOps Release pipeline. Specifically, it's an "Azure SQL Database deployment" task using the "SQL Script File" deployment type.

Despite having set every timeout to its maximum in the task's "Additional Invoke-Sqlcmd Parameters" setting, my migration is still timing out.

Specifically, I get:

We stopped hearing from agent Hosted Agent. Verify the agent machine is running and has a healthy network connection. Anything that terminates an agent process, starves it for CPU, or blocks its network access can cause this error.

So far it's timed out after:

  • 6:13:15
  • 6:13:18
  • 6:14:41
  • 6:10:19

So "after 6 and a bit hours". It's ~22,400 seconds, which doesn't seem like any obvious kind of number either :)

Why? And how do I fix it?


Solution

  • It turns out that Azure DevOps uses hosted agents to execute the tasks in a pipeline, and those agents have built-in time limits that are independent of whatever task they're running.

    https://learn.microsoft.com/en-us/azure/devops/pipelines/troubleshooting/troubleshooting?view=azure-devops#job-time-out

    A pipeline may run for a long time and then fail due to job time-out. Job timeout closely depends on the agent being used. Free Microsoft hosted agents have a max timeout of 60 minutes per job for a private repository and 360 minutes for a public repository. To increase the max timeout for a job, you can opt for any of the following.

    • Buy a Microsoft hosted agent which will give you 360 minutes for all jobs, irrespective of the repository used
    • Use a self-hosted agent to rule out any timeout issues due to the agent

    Learn more about job timeout.

    So I'm hitting the "360 minute" limit (presumably they give you a little extra on top, so that no-one complains?).

    The solution is to use a self-hosted agent (or to make my migration run in under 6 hours, of course).
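
As a sanity check on the "360 minute" reading, here is a minimal Python sketch that converts the four failure times from the question into seconds and compares them with 360 minutes (21,600 seconds). The names `to_seconds`, `overrun_seconds` and `LIMIT_SECONDS` are made up for this illustration; only the durations and the 360-minute figure come from the text above.

```python
# Durations observed before the agent was lost, copied from the question.
observed = ["6:13:15", "6:13:18", "6:14:41", "6:10:19"]

# 360 minutes is the limit quoted from the Microsoft troubleshooting page.
LIMIT_SECONDS = 360 * 60


def to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def overrun_seconds(hms: str, limit: int = LIMIT_SECONDS) -> int:
    """Seconds by which a run exceeded the limit (negative if it stayed under)."""
    return to_seconds(hms) - limit


if __name__ == "__main__":
    for run in observed:
        total = to_seconds(run)
        over = overrun_seconds(run)
        print(f"{run} = {total} s, {over} s ({over / 60:.1f} min) past 360 min")
```

Every run ends somewhere between about 10 and 15 minutes after the 360-minute mark, which fits the idea that the job hits the limit and the failure is only reported a little later. That last part is a guess; the documentation quoted above only states the 360-minute figure.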