Tags: azure, azure-devops, azure-resource-manager, azure-bicep, infrastructure-as-code

Guidance on long running Azure Deployments


We are currently developing a set of Bicep libraries as part of our overall infrastructure-as-code strategy. Deployments that take a few minutes to run are no issue, but we have to consider a slightly different approach for long-running tasks.

Take, for example, SQL Managed Instance, which can take 6+ hours to deploy into a virtual network, Application Gateway, which can take a couple of hours, and App Service Environment, which can take 4+ hours. Is there a recommended approach to segregate these long-running tasks in ARM or Bicep? Does Bicep have a "don't wait for result" option when a template is submitted?

One approach that we had considered is deploying the large infrastructure pieces in a DevOps pipeline that is only run when we need to make a change to the architecture, while the application-centric resources (App Services, App Config, DBs, etc.) are run via a different DevOps pipeline.

I can't see any guidance on the Microsoft docs website (ARM or Bicep), so I would appreciate some input.


Solution

  • The best practice with Bicep is to leverage its dependency graph. If your application/workload needs 10 resources, then let Bicep manage the dependencies; it will generally optimise the deployment sequence and share the right inputs/outputs between the connected services. You'd normally write this by calling a main.bicep which then references a bunch of other Bicep files (modules) to orchestrate the deployment.
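
    As a rough illustration of that layout, here is a minimal sketch of a main.bicep that calls modules. The module paths, module names, the subnetId parameter and the subnetId output are all hypothetical placeholders, not anything from your setup; each referenced file would need to exist and declare matching parameters and outputs.

    ```bicep
    // main.bicep (sketch): every module path, name, and output here is hypothetical.
    param location string = resourceGroup().location

    // Virtual network module; assumed to expose a 'subnetId' output.
    module network 'modules/network.bicep' = {
      name: 'network'
      params: {
        location: location
      }
    }

    // Reading network.outputs.subnetId creates an implicit dependency,
    // so 'network' is deployed before 'appGateway'.
    module appGateway 'modules/appgateway.bicep' = {
      name: 'appGateway'
      params: {
        location: location
        subnetId: network.outputs.subnetId
      }
    }

    // Nothing here references the other modules, so this one is free
    // to deploy in parallel with them.
    module storage 'modules/storage.bicep' = {
      name: 'storage'
      params: {
        location: location
      }
    }
    ```

    The point of the sketch is that ordering comes from the output references rather than from explicit dependsOn entries.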

    That said, when you're dealing with services that take a significant amount of time to deploy (as you've listed), using the standard method I've described means you're:

    1. Slowing down the "developer inner loop" by front-loading all of the services into one tightly coupled deployment.
    2. Adding more complexity to the Bicep module structure: you need to make sure that, for example, ASE creation doesn't wait for a connection string from your SQL MI deployment.

    "Bicep" itself doesn't care about the result; you can even pass --no-wait to the az deployment group create command so that the CLI returns without waiting for the deployment to finish. The ARM control plane, however, does care: a failed deployment can necessitate a redeploy.

    Given the services you talk about, I'd consider them infrastructure fundamentals and would put them in a separate pipeline stage with more conditionality around when it runs. You'd then be able to deploy the child resources into them in another stage. This is one of the benefits of so many of the Azure Resource Providers having first-class resource types for child resources (e.g. SQL Database as a child of SQL Server).
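
    To make the second stage concrete, here is a minimal sketch of a template that deploys a child resource into a parent created earlier. It uses the SQL Server / SQL Database pairing mentioned above; the parameter names, the default database name, the API version and the Basic SKU are assumptions chosen for illustration, and the server is assumed to already exist in the target resource group.

    ```bicep
    // app.bicep (sketch): parameter names and SKU are illustrative assumptions.
    // The SQL server is assumed to have been created by the infrastructure stage.
    param sqlServerName string
    param databaseName string = 'appdb'
    param location string = resourceGroup().location

    // 'existing' only references the server; this template does not
    // create or modify it.
    resource sqlServer 'Microsoft.Sql/servers@2021-11-01' existing = {
      name: sqlServerName
    }

    // Child resource deployed into the long-lived parent.
    resource database 'Microsoft.Sql/servers/databases@2021-11-01' = {
      parent: sqlServer
      name: databaseName
      location: location
      sku: {
        name: 'Basic'
        tier: 'Basic'
      }
    }
    ```

    Because the parent is only referenced, this stage should stay quick to run and never has to wait on the slow infrastructure deployment.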