
Terraform Azure site recovery services pipeline timeout


I'm creating a Recovery Services vault with some replicated VMs for failover. However, the pipeline I'm running fails on the azurerm_site_recovery_replicated_vm resource. I believe this is to do with the timeout set for the apply. I've tried adding a timeouts block that sets read, create, delete and update to 3 hours, but it doesn't seem to make any difference. The pipeline fails after around 35 minutes.

Error:

Error: waiting for site recovery to replicate vm: making Read request on site recovery replicated vm Replication Protected Item (Subscription: "XXX"
│ Resource Group Name: "rg-dr-ukwest"
│ Vault Name: "rsv-dr-ukwest"
│ Replication Fabric Name: "primary-fabric"
│ Replication Protection Container Name: "primary-protection-container"
│ Replication Protected Item Name: "VM1") : authorizing request: clientCredentialsToken: received HTTP status 401 with response: {"error":"invalid_client","error_description":"AADSTS700024: Client assertion is not within its valid time range. Current time: 2024-11-20T10:41:43.2413879Z, assertion valid from 2024-11-20T10:00:53.0000000Z, expiry time of assertion 2024-11-20T10:10:53.0000000Z. Review the documentation at https://learn.microsoft.com/entra/identity-platform/certificate-credentials . Trace ID: dfcd5ea6-29e6-432c-b200-ffe4ab621200 Correlation ID: 389677a3-178d-4904-9f23-0612402706d4 Timestamp: 2024-11-20 10:41:43Z","error_codes":[700024],"timestamp":"2024-11-20 10:41:43Z","trace_id":"dfcd5ea6-29e6-432c-b200-ffe4ab621200","correlation_id":"389677a3-178d-4904-9f23-0612402706d4","error_uri":"https://login.microsoftonline.com/error?code=700024"}
│ 
│   with module.rsv.azurerm_site_recovery_replicated_vm.windowsvm-replication,
│   on modules/rsv/rsv.tf line 234, in resource "azurerm_site_recovery_replicated_vm" "windowsvm-replication":
│  234: resource "azurerm_site_recovery_replicated_vm" "windowsvm-replication" {

Just wanted to know if there's any way of getting around the timeout to stop the pipeline failing?


Solution

    In general, this blocker can have several causes. The first thing I would suggest is to split the azurerm_site_recovery_replicated_vm resources into smaller, sequential apply stages. This reduces the duration of each operation and makes it easier to see where things go wrong.
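
    One way to stage the apply is sketched below, assuming a boolean toggle so that a first pipeline stage creates the vault, fabrics, containers and policy, and a later stage enables replication. The variable names (enable_vm_replication, source_vm_id and the other IDs) are hypothetical and not from the original configuration; the resource names are taken from the error output. The managed_disk and network_interface blocks are omitted for brevity.

```hcl
# Hypothetical toggle: stage 1 applies with the default (false),
# stage 2 applies again with the value set to true.
variable "enable_vm_replication" {
  type    = bool
  default = false
}

# Hypothetical inputs; in a real module these would usually reference
# the vault, fabric, container and policy resources directly.
variable "source_vm_id" {
  type = string
}

variable "replication_policy_id" {
  type = string
}

variable "target_resource_group_id" {
  type = string
}

variable "target_recovery_fabric_id" {
  type = string
}

variable "target_protection_container_id" {
  type = string
}

resource "azurerm_site_recovery_replicated_vm" "windowsvm-replication" {
  # Created only when the toggle is true, so the long-running
  # replication step runs in its own apply.
  count = var.enable_vm_replication ? 1 : 0

  name                                      = "VM1"
  resource_group_name                       = "rg-dr-ukwest"
  recovery_vault_name                       = "rsv-dr-ukwest"
  source_recovery_fabric_name               = "primary-fabric"
  source_recovery_protection_container_name = "primary-protection-container"
  source_vm_id                              = var.source_vm_id
  recovery_replication_policy_id            = var.replication_policy_id
  target_resource_group_id                  = var.target_resource_group_id
  target_recovery_fabric_id                 = var.target_recovery_fabric_id
  target_recovery_protection_container_id   = var.target_protection_container_id

  timeouts {
    create = "3h"
  }
}
```

    The second stage could then run terraform apply -var="enable_vm_replication=true". Staging only shortens each apply; it does not by itself extend the lifetime of the credential the pipeline authenticates with.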

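    Going by the error description, the request failed with a 401 error, which generally indicates an authentication problem. The message (AADSTS700024) says the client assertion was outside its valid time range: it was valid from 10:00:53 to 10:10:53, but the failing request was made at 10:41:43, so the credential expired while the long-running replication was still in progress. As per the Microsoft documentation, the cause may be a client secret that is invalid because it has expired.

    Try creating a new service principal secret with a validity period long enough that the credential does not expire during long operations and interrupt the Terraform apply.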
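
    As a minimal sketch of that suggestion, assuming the pipeline authenticates the azurerm provider with a service principal and client secret, the provider block could look like this. The variable names are hypothetical; the values would come from the pipeline's secret store rather than from files in source control.

```hcl
# Hypothetical variables, supplied by the pipeline at run time.
variable "subscription_id" {
  type = string
}

variable "tenant_id" {
  type = string
}

variable "client_id" {
  type = string
}

variable "client_secret" {
  type      = string
  sensitive = true
}

provider "azurerm" {
  features {}

  # Service principal authentication with a client secret.
  subscription_id = var.subscription_id
  tenant_id       = var.tenant_id
  client_id       = var.client_id
  client_secret   = var.client_secret
}
```

    The same values can instead be passed through the ARM_SUBSCRIPTION_ID, ARM_TENANT_ID, ARM_CLIENT_ID and ARM_CLIENT_SECRET environment variables, which keeps the secret out of the Terraform configuration.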

    References:

    Client assertion is not within its valid time range error while trying to acquire token from Azure client App - Microsoft Q&A

    https://learn.microsoft.com/en-us/azure/site-recovery/azure-to-azure-troubleshoot-errors

    https://learn.microsoft.com/en-us/entra/identity-platform/configurable-token-lifetimes