
AKS cluster with virtual node enabled vs. without virtual node enabled


I wanted to install Kubeflow on Azure, so I started by creating an Azure Kubernetes Service (AKS) cluster with a single node (a B4MS virtual machine). When creating the cluster, I did not enable the virtual node pool option. Afterwards, I ran the command "$ kubectl describe node aks-agentpool-3376354-00000" to check the node's specs. The number of allocatable Pods was 110, and I was able to install Kubeflow without any issues. Some time later, I wanted an AKS cluster with the virtual node pool enabled so that I could use GPUs for training. I deleted the old cluster and created a new one with the same B4MS virtual machine, this time with the virtual node pool option enabled. When I ran the same command to describe the node, the number of allocatable Pods was 30, and the Kubeflow installation failed because there were not enough Pods available to provision.

Can someone explain why the number of allocatable Pods changes depending on whether the virtual node option is enabled? How can I keep the number of allocatable Pods at 110 with the virtual node pool option enabled? Thank you in advance!


Solution

  • The virtual node pool requires the Advanced networking configuration of AKS, which brings in the Azure CNI network plugin.

    With Azure CNI, the default maximum number of Pods per node on AKS is 30.

    https://learn.microsoft.com/en-us/azure/aks/configure-azure-cni#maximum-pods-per-node

    This is the main reason you now get a maximum of 30 Pods per node, whereas your first cluster, created with basic networking (kubenet), defaulted to 110.

    You can set a higher value when you provision the cluster with the Azure CLI:

    https://learn.microsoft.com/en-us/cli/azure/ext/aks-preview/aks?view=azure-cli-latest#ext-aks-preview-az-aks-create

    --max-pods -m
    The maximum number of pods deployable to a node.
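
As a rough sketch of how that flag might be used, the script below creates a single-node cluster with Azure CNI and a 110-Pod limit. The resource group and cluster names are hypothetical placeholders, Standard_B4ms is assumed to be the size behind "B4MS", and the `RUN` dry-run switch is only a convenience of this example, not part of the Azure CLI. By default the command is printed instead of executed.

```shell
#!/bin/sh
# Placeholder values (hypothetical); replace them with your own.
RESOURCE_GROUP="myResourceGroup"
CLUSTER_NAME="myAKSCluster"
MAX_PODS=110

# Dry run by default: the az command is only printed.
# Invoke the script with RUN set to an empty string (RUN= sh script.sh) to execute it.
RUN="${RUN-echo}"

# --network-plugin azure selects Azure CNI (needed for virtual nodes);
# --max-pods raises the per-node limit from the Azure CNI default of 30.
$RUN az aks create \
  --resource-group "$RESOURCE_GROUP" \
  --name "$CLUSTER_NAME" \
  --node-count 1 \
  --node-vm-size Standard_B4ms \
  --network-plugin azure \
  --max-pods "$MAX_PODS" \
  --generate-ssh-keys
```

Because Azure CNI assigns every Pod an IP address from the subnet, the subnet needs enough free addresses for the higher limit. Once the cluster is up, the "kubectl describe node" command from the question should report the new allocatable Pod count.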