Search code examples
azureazure-virtual-machineterraform-provider-azurecloud-init

Error handling in cloud init scripts within azurerm_linux_virtual_machine


I run a custom shell script when I deploy my virtual machine with terraform, which can throw errors.

My question is, how do you handle these errors, because regardless of the return code of the script, terraform always reports the deployment was successful, which leads to confusion when the VM does not what it’s supposed to do.

Here a snippet of the terraform file for context:

data "template_file" "setup_script" {
  count    = var.agent_count
  template = file("scripts/setup.sh")
  vars = {
    POOL_NAME        = var.pool_name
    AGENT            = "agent-${count.index}"
    ORGANIZATION_URL = var.organization_url
    TOKEN            = var.token
    TERRAFORM_VERSION = var.terraform_version
    VSTS_AGENT_VERSION = var.vsts_agent_version
  }
}

resource "azurerm_linux_virtual_machine" "vmachine" {
  count               = length(module.network.network_interfaces)
  name                = "agent-${count.index}"
  resource_group_name = azurerm_resource_group.deployment-agents.name
  location            = azurerm_resource_group.deployment-agents.location
  size                = "Standard_B1ms"
  admin_username      = "adminuser"

  network_interface_ids = [
    module.network.network_interfaces[count.index].id,
  ]

  admin_ssh_key {
    username   = "adminuser"
    public_key = var.ssh_public_key
  }

  os_disk {
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
  }

  source_image_reference {
    publisher = "Canonical"
    offer     = "UbuntuServer"
    sku       = "18.04-LTS"
    version   = "latest"
  }

  boot_diagnostics {
    storage_account_uri = azurerm_storage_account.boot.primary_blob_endpoint
  }
  custom_data = base64encode(data.template_file.setup_script.*.rendered[count.index])
}

And the setup.sh shell script:

# --- snip ----
apt-get install azure-cli
if [ $? -gt 0 ]; then
  echo "Cannot install azure cli!"
  exit 1
fi

# Test
exit 1

Thanks for the help.


Solution

  • Terraform only return the error about itself, not the script execute inside the VM. You can find the error message inside the VM.

    And to install the Azure CLI via the cloud-init with a shell script, you need to add #!/bin/bash at the beginning of the shell script, see the note:

    enter image description here

    And install the Azure CLI, I think there are more things you need to do than what you have tried, take a look at the steps that install the Azure CLI in Ubuntu. Or use the existing shell script here.