Tags: azure-devops, pull-request, azure-devops-server-2019

Azure DevOps - Branch Policies are resulting in multiple builds running during Pull Requests


Our repository has folders; the code in one folder sometimes depends on code in other folders, but only in one direction. By way of explanation:

C depends on B

B depends on A

We have 3 builds required on our Pull Request policy for master:

We have a build (BuildC) that builds ONLY folder C

We have a build (BuildB) that builds B and C

We have a build (BuildA) that builds A, B, and C

The policy specifies:

Changes in folder C require BuildC

Changes in Folder B require BuildB

Changes in Folder A require BuildA

Desired effect: Depending on the case, I want the Pull Request to require ONLY ONE of the three builds. Here are the cases:

BuildA - Should run when there are changes in folder A (even if there are changes elsewhere)

BuildB - Should run when there are changes in B (and/or C) but NOT IN A. If there are changes in folder A, this build should NOT run

BuildC - Should run when the only changes are in folder C... if changes exist in folder A and/or B in addition to C... this build should not run.

What actually happens is that if you change something in folders A and C, two builds run: BuildA and BuildC... and if the changes in folder C depend on folder A, then BuildC fails. In any case, the BuildC run is a waste.

Is there a way to have Azure DevOps queue only one build... but the best one? So in our example case, BuildA would run but not BuildC... but if the changes were only in folder C, it would run BuildC?


Solution

  • There is no way to accomplish what you want using build triggers or policies. There is no "Don't build when there are changes in folder X". There are a few options, but they require a bit of rethinking:

    Option 1: Use jobs & conditions

    • Create a single Pipeline with a build stage and 4 jobs.
    • The first job uses a command-line tool to detect which projects need to be rebuilt and sets an output variable.
    • The other 3 jobs depend on the first job and have a condition on them so they only run when the output variable set in the first job has a certain value.

    That way you can take complete control over the build order of all 3 projects.
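    A minimal YAML sketch of this setup, assuming folders named A, B and C and a master target branch (the job, step and variable names are illustrative, not prescribed):

```yaml
jobs:
- job: Detect
  steps:
  - bash: |
      # In a PR build, diff the merge commit against the target branch.
      # You may need to fetch origin/master first on a shallow clone.
      CHANGED=$(git diff --name-only origin/master...HEAD)
      if   echo "$CHANGED" | grep -q '^A/'; then SCOPE=A
      elif echo "$CHANGED" | grep -q '^B/'; then SCOPE=B
      else SCOPE=C
      fi
      echo "##vso[task.setvariable variable=scope;isOutput=true]$SCOPE"
    name: detect

- job: BuildA
  dependsOn: Detect
  condition: eq(dependencies.Detect.outputs['detect.scope'], 'A')
  steps:
  - script: echo Building A, B and C

- job: BuildB
  dependsOn: Detect
  condition: eq(dependencies.Detect.outputs['detect.scope'], 'B')
  steps:
  - script: echo Building B and C

- job: BuildC
  dependsOn: Detect
  condition: eq(dependencies.Detect.outputs['detect.scope'], 'C')
  steps:
  - script: echo Building only C
```

    The detection logic mirrors the desired cases: A wins over B, and B wins over C, so exactly one of the three build jobs runs per pull request.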

    Option 2: Use an orchestration pipeline

    Option 3: Use Pipeline Artifacts

    Instead of building A+B+C in BuildC, download the results of A+B, then build only C. This requires uploading pipeline artifacts at the end of each job; each subsequent job then downloads those artifacts and does an incremental build, skipping the projects that are already built.

    You could even download the "last successful" results in case you want to skip building the code.
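    A sketch of that hand-off between two jobs, using the built-in pipeline-artifact tasks (the paths and artifact name are illustrative):

```yaml
- job: BuildB
  steps:
  - script: echo Building B against the prebuilt output of A
  - task: PublishPipelineArtifact@1
    inputs:
      targetPath: '$(Build.SourcesDirectory)/B/bin'
      artifact: 'B-output'

- job: BuildC
  dependsOn: BuildB
  steps:
  - task: DownloadPipelineArtifact@2
    inputs:
      artifact: 'B-output'
      path: '$(Build.SourcesDirectory)/B/bin'
  - script: echo Building only C against the downloaded output of B
```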

    Option 4: Use NuGet

    Instead of pipeline artifacts, use NuGet packages to publish the output from BuildA and consume it in BuildB. Or even: publish A in job A and consume it from job B in the same build definition.
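    Packing and pushing A's output to an Azure Artifacts feed might look roughly like this (the feed name and project path are illustrative):

```yaml
steps:
- script: dotnet pack A/A.csproj --output $(Build.ArtifactStagingDirectory)
- task: NuGetCommand@2
  inputs:
    command: push
    packagesToPush: '$(Build.ArtifactStagingDirectory)/*.nupkg'
    nuGetFeedType: internal
    publishVstsFeed: 'my-feed'   # assumed feed name
# BuildB then references A through a normal PackageReference and
# restores it from the feed instead of building A from source.
```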

    Option 5: Rely on incremental builds

    If you're running on a self-hosted agent, you can turn off the "Clean" option for your pipeline. In case the same agent has run your build before, it will simply reuse the build output of the previous run, provided none of the input files have changed (and you haven't made any incorrect MSBuild customizations). It will essentially skip building A if MSBuild can determine that A doesn't need to be rebuilt.
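    In a YAML pipeline, the rough equivalent of switching off "Clean" is keeping the working directory between runs (the solution name is illustrative):

```yaml
jobs:
- job: Build
  steps:
  - checkout: self
    clean: false   # don't wipe the working directory before fetching
  - script: msbuild MySolution.sln   # incremental by default
```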


    The advantage of a single build with multiple jobs is that you can specify the order of the jobs A, B, C and can control what happens in each job. The big disadvantage is that each job adds the overhead of fetching sources or downloading artifacts. You can optimize that a bit by setting precise wildcards for which pieces you want to publish and restore.

    If you don't need the sources in subsequent stages (and aren't using YAML pipelines), you can use my Don't Sync Sources task (even with Git) to skip the sync step, allowing you to take control over exactly what happens in each job.

    Many of these options rely on you figuring out which projects contain changed files since the last successful build. You can use the git or tfvc command-line utilities to tell you which files were changed. Creating the perfect script gets harder when you have build batching turned on: multiple changes trigger your build at once, so you can't just rely on the "latest changes". In that case you may need to use the REST API to ask Azure DevOps for all the commit IDs (or all changeset numbers) associated with the build, and then do the proper diff to calculate which projects contain changes.
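    As a sketch, the classification itself can be a small shell function over the changed-file list; the function name is illustrative, and the list would come from git diff:

```shell
#!/usr/bin/env bash
# Pick the single most encompassing build for a list of changed files.
# Folder and build names match the policy above; pick_build is illustrative.
pick_build() {
  local file build=""
  for file in "$@"; do
    case "$file" in
      A/*) echo BuildA; return ;;                    # A wins outright
      B/*) build=BuildB ;;                           # B beats C
      C/*) if [ -z "$build" ]; then build=BuildC; fi ;;
    esac
  done
  echo "$build"
}

# The changed-file list itself would come from, for example:
#   git diff --name-only origin/master...HEAD
pick_build "C/readme.md" "B/lib.cs"   # prints BuildB
```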

    Long-term, relying on a single build with multiple jobs or on NuGet packages is likely going to be easier to maintain.