I am having an issue setting up Virtual Nodes for an Azure Kubernetes Service (AKS) cluster using Terraform.
When I check the aci-connector-linux pod, I get the error below:
Events:
  Type     Reason   Age                     From     Message
  ----     ------   ----                    ----     -------
  Normal   Pulled   41m (x50 over 4h26m)    kubelet  Container image "mcr.microsoft.com/oss/virtual-kubelet/virtual-kubelet:1.4.1" already present on machine
  Warning  BackOff  68s (x1222 over 4h26m)  kubelet  Back-off restarting failed container
I've also granted the system-assigned identity of the AKS cluster the required contributor role, following the example here - https://github.com/terraform-providers/terraform-provider-azurerm/blob/master/examples/kubernetes/aci_connector_linux/main.tf but the pod is still stuck in CrashLoopBackOff status.
I finally fixed it.
The issue was caused by the outdated example for aci-connector-linux here - https://github.com/terraform-providers/terraform-provider-azurerm/blob/master/examples/kubernetes/aci_connector_linux/main.tf which assigns the role to the managed identity of the Azure Kubernetes cluster rather than to the managed identity that AKS creates for aci-connector-linux.
Here's how I fixed it:
Azure Kubernetes Service creates a node resource group that is separate from the resource group of the Kubernetes cluster. Within the node resource group, AKS creates a managed identity for aci-connector-linux. The name of the node resource group is usually MC_<KubernetesResourceGroupName>_<KubernetesServiceName>_<KubernetesResourceGroupLocation>, so if your KubernetesResourceGroupName is MyResourceGroup, the KubernetesServiceName is my-test-cluster, and the KubernetesResourceGroupLocation is westeurope, then the node resource group will be MC_MyResourceGroup_my-test-cluster_westeurope. You can view its resources in the Azure Portal under Resource Groups.
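The naming pattern above can be sketched as a small shell helper. The function names node_rg_name and lookup_node_rg are made up for illustration. The second helper assumes a logged-in Azure CLI and is only defined here, not called, since the default name does not apply when a custom node resource group name was set at cluster creation.

```shell
# Build the default node resource group name from its three parts:
# cluster resource group, cluster name, and location.
node_rg_name() {
  printf 'MC_%s_%s_%s\n' "$1" "$2" "$3"
}

# Ask Azure for the actual node resource group of a cluster.
# Requires a logged-in Azure CLI; defined here but not invoked.
lookup_node_rg() {
  az aks show --resource-group "$1" --name "$2" \
    --query nodeResourceGroup --output tsv
}

# Example with the values used in this answer.
node_rg_name MyResourceGroup my-test-cluster westeurope
```

Calling lookup_node_rg MyResourceGroup my-test-cluster against a real cluster should return the same value when the default naming is in use.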
Next, you can see the root cause of the issue in the logs of the aci-connector-linux pod, using the command:
kubectl logs aci-connector-linux-577bf54d75-qm9kl -n kube-system
You will see output like this:
time="2022-06-29T15:23:38Z" level=fatal msg="error initializing provider azure: error setting up network profile: error while looking up subnet: api call to https://management.azure.com/subscriptions/0237fb7-7530-43ba-96ae-927yhfad80d1/resourcegroups/MyResourceGroup/providers/Microsoft.Network/virtualNetworks/my-vnet/subnets/k8s-aci-node-pool-subnet?api-version=2018-08-01: got HTTP response status code 403 error code "AuthorizationFailed": The client '560df3e9b-9f64-4faf-aa7c-6tdg779f81c7' with object id '560df3e9b-9f64-4faf-aa7c-6tdg779f81c7' does not have authorization to perform action 'Microsoft.Network/virtualNetworks/subnets/read' over scope '/subscriptions/0237fb7-7530-43ba-96ae-927yhfad80d1/resourcegroups/MyResourceGroup/providers/Microsoft.Network/virtualNetworks/my-vnet/subnets/k8s-aci-node-pool-subnet' or the scope is invalid. If access was recently granted, please refresh your credentials."
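The two values needed for the fix, the object id and the scope, can be pulled out of such a log line with sed. This is a minimal sketch: the helper names extract_object_id and extract_scope are hypothetical, and the sample line is a shortened copy of the message above with the same IDs.

```shell
# Print the object id quoted after "with object id" in an
# AuthorizationFailed log line read from standard input.
extract_object_id() {
  sed -n "s/.*with object id '\([^']*\)'.*/\1/p"
}

# Print the resource ID quoted after "over scope" in the same kind of line.
extract_scope() {
  sed -n "s/.*over scope '\([^']*\)'.*/\1/p"
}

# Shortened sample of the error message shown above.
log_line="The client '560df3e9b-9f64-4faf-aa7c-6tdg779f81c7' with object id '560df3e9b-9f64-4faf-aa7c-6tdg779f81c7' does not have authorization to perform action 'Microsoft.Network/virtualNetworks/subnets/read' over scope '/subscriptions/0237fb7-7530-43ba-96ae-927yhfad80d1/resourcegroups/MyResourceGroup/providers/Microsoft.Network/virtualNetworks/my-vnet/subnets/k8s-aci-node-pool-subnet' or the scope is invalid."

printf '%s\n' "$log_line" | extract_object_id
printf '%s\n' "$log_line" | extract_scope
```

Against a live cluster, the same helpers could be fed from the pod logs, for example kubectl logs deployment/aci-connector-linux -n kube-system | extract_object_id, which avoids copying the generated pod name.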
You can fix this in Terraform using the code below:
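# Get the ID of the subnet used by the virtual node (ACI) pool.
# data.azurerm_resource_group.main is assumed to be defined elsewhere.
data "azurerm_subnet" "k8s_aci" {
  name                 = "k8s-aci-node-pool-uat-subnet"
  virtual_network_name = "sparkle-uat-vnet"
  resource_group_name  = data.azurerm_resource_group.main.name
}

# Look up the managed identity that AKS creates for aci-connector-linux.
# It only exists after the cluster has been created, hence depends_on.
data "azuread_service_principal" "aks_aci_identity" {
  display_name = "aciconnectorlinux-${var.kubernetes_cluster_name}"
  depends_on   = [module.kubernetes_service_uat]
}

# Assign the network contributor role on the subnet to the ACI identity.
# object_id is used instead of id, since the format of id differs
# between azuread provider versions.
module "role_assignment_aci_nodepool_subnet" {
  source                       = "../../../modules/azure/role-assignment"
  role_assignment_scope        = data.azurerm_subnet.k8s_aci.id
  role_definition_name         = var.role_definition_name.net-contrib
  role_assignment_principal_id = data.azuread_service_principal.aks_aci_identity.object_id
}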
You can also achieve this using the Azure CLI command below:
az role assignment create --assignee <Object (principal) ID> --role "Network Contributor" --scope <subnet-id>
Note: The Object (principal) ID is the object id shown in the error message from the pod logs.
For example:
az role assignment create --assignee 560df3e9b-9f64-4faf-aa7c-6tdg779f81c7 --role "Network Contributor" --scope /subscriptions/0237fb7-7530-43ba-96ae-927yhfad80d1/resourcegroups/MyResourceGroup/providers/Microsoft.Network/virtualNetworks/my-vnet/subnets/k8s-aci-node-pool-subnet
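If you prefer not to copy the scope from the error message, the subnet ID can be assembled from its parts. The sketch below uses hypothetical helper names (subnet_id, grant_network_contributor). The second helper wraps the same az role assignment create command with the --assignee-object-id and --assignee-principal-type flags, assuming a logged-in Azure CLI. It is defined but not called here.

```shell
# Build the ARM resource ID of a subnet from subscription ID,
# resource group, virtual network name, and subnet name.
subnet_id() {
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Network/virtualNetworks/%s/subnets/%s\n' \
    "$1" "$2" "$3" "$4"
}

# Grant Network Contributor on a scope to a principal given by object id.
# Requires a logged-in Azure CLI; defined here but not invoked.
grant_network_contributor() {
  az role assignment create \
    --assignee-object-id "$1" \
    --assignee-principal-type ServicePrincipal \
    --role "Network Contributor" \
    --scope "$2"
}

# Example with the values from the error message in this answer.
subnet_id 0237fb7-7530-43ba-96ae-927yhfad80d1 MyResourceGroup my-vnet k8s-aci-node-pool-subnet
```

The error message spells the segment as resourcegroups in lowercase. Azure resource IDs are case-insensitive, so both spellings refer to the same subnet.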
Resources:
Aci connector linux should export the identity associated to its addon
Azure Kubernetes Service Tutorial: How to Integrate AKS with Azure Container Instances