azure-devops azure-devops-server-2020

Query builds failed because of timeout - Azure DevOps Server


In our dev environment we have lots of repos, lots of builds and lots of build servers, and most of the time things work just as they should. However, we are seeing an increase in builds that fail because of timeouts.

These timeouts are not happening because we are getting close to the limit, but because something gets stuck or blocked in the pipeline and the build stays on that step until the timeout kills it.

To better debug why that happens, we need to be able to query which builds fail because of this timeout, so that we can see, for instance, whether a particular build server or agent has this problem.

We cannot find anything in the API that would give us the timeout error, but we can see that the UI is able to deduce it somehow:

Screenshot of timeout

So far we have narrowed it down to querying all builds with completed status (through this API), but we get no completion reason, and build times are never exactly the same as the timeout of the build definition, so "guessing" it from the execution plan would also be a bit shaky.

How can we filter our builds down to only the builds that have timed out?


Solution

  • We can use the API below to get the details of a build.

    Note: do not add a timelineId to the URL; we want all of the information to be listed.

    GET https://dev.azure.com/{organization}/{project}/_apis/build/builds/{buildId}/timeline?api-version=6.1-preview.2
    

    If the build is canceled because of the timeout setting, the response contains the message: The job running on agent Hosted Agent ran longer than the maximum time of xxx minutes. For more information, see https://go.microsoft.com/fwlink/?linkid=2077134

    By the way, we can use the Builds - List API to filter for all failed builds. If a build is canceled because of the timeout setting, its result is failed rather than canceled.
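
The two calls above can be combined into a small script: list the failed builds, fetch each build's timeline, and keep the builds where a timeline issue carries the timeout message. This is only a minimal sketch, not a tested tool. It assumes that `base_url` is something like `https://dev.azure.com/{organization}/{project}` (or the matching collection/project URL on your own server), that a personal access token is sent as Basic auth with an empty user name, that `api-version=6.0` is accepted by your instance, and that the phrase "ran longer than the maximum time" from the message above is stable enough to match on. The function names are made up for illustration, and paging of the build list is not handled.

```python
import base64
import json
import urllib.request

# Assumption: every timeout issue contains this phrase (taken from the message quoted above).
TIMEOUT_MARKER = "ran longer than the maximum time"


def timeout_records(timeline):
    """Return (record name, worker name, message) for each timeline issue that looks like a timeout."""
    hits = []
    for record in (timeline or {}).get("records") or []:
        for issue in record.get("issues") or []:
            message = issue.get("message") or ""
            if TIMEOUT_MARKER in message:
                hits.append((record.get("name"), record.get("workerName"), message))
    return hits


def get_json(url, pat):
    """GET a URL with a personal access token and decode the JSON body (empty body gives {})."""
    token = base64.b64encode(f":{pat}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request) as response:
        body = response.read()
    return json.loads(body) if body else {}


def timed_out_builds(base_url, pat, api_version="6.0"):
    """Yield (build id, timeout hits) for failed builds whose timeline reports a timeout."""
    builds = get_json(
        f"{base_url}/_apis/build/builds"
        f"?statusFilter=completed&resultFilter=failed&api-version={api_version}",
        pat,
    )
    for build in builds.get("value", []):
        timeline = get_json(
            f"{base_url}/_apis/build/builds/{build['id']}/timeline?api-version={api_version}",
            pat,
        )
        hits = timeout_records(timeline)
        if hits:
            yield build["id"], hits
```

Looping over `timed_out_builds(base_url, pat)` and counting the worker names in the hits should then show whether one agent or build server accounts for most of the timeouts.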