Tags: kubernetes, kubernetes-ingress, azure-aks, nginx-ingress

Routing external traffic through a load balancer to an ingress, or through an ingress only, on AKS?


I have an AKS cluster with its LoadBalancer configured (following https://learn.microsoft.com/en-us/azure/aks/internal-lb) so that it gets its IP from a PublicIP (all provisioned with Terraform) and targets the cluster's ingress, which is deployed with Helm.

resource "kubernetes_service" "server-loadbalacer" {
  metadata {
    name = "server-loadbalacer-svc"
    annotations = {
      "service.beta.kubernetes.io/azure-load-balancer-resource-group" = "fixit-resource-group"
    }
  }
  spec {
    type = "LoadBalancer"
    load_balancer_ip = var.public_ip_address
    selector = {
      name = "ingress-service"
    }
    port {
      name = "server-port"
      protocol = "TCP"
      port = 8080
    }
  }
}

Then with Helm I deploy a Node.js server listening on port 3000, a MongoDB replica set, and a Neo4j cluster. I set up a service for the server that receives on port 3000 and targets port 3000.
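
For context, a simplified sketch of the server Deployment: it labels its pods with app: fixit-server-pod so that the Service selector below can match them (the image name and replica count here are just placeholders):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: fixit-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: fixit-server-pod
  template:
    metadata:
      labels:
        app: fixit-server-pod # must match the Service selector below
    spec:
      containers:
        - name: fixit-server
          image: fixit/server:latest # placeholder image name
          ports:
            - containerPort: 3000 # the port the Node.js app listens on

The Service for the server is: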

apiVersion: v1
kind: Service
metadata:
  name: server-clusterip-service
spec:
  type: ClusterIP
  selector:
    app: fixit-server-pod
  ports:
    - name: server-clusterip-service
      protocol: TCP
      port: 3000 # service port
      targetPort: 3000 # port on which the app is listening

Then the Ingress routes traffic to the correct service, e.g. the server:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-service
  annotations:
    kubernetes.io/ingress.class: nginx
  labels:
    name: ingress-service
spec:
  rules:
  - host: fixit.westeurope.cloudapp.azure.com #dns from Azure PublicIP
    http:
      paths:
        - path: '/server/*'
          pathType: Prefix
          backend:
            service:
              name: server-clusterip-service
              port:
                number: 3000
        - path: '/neo4j/*'
          pathType: Prefix
          backend: 
            service:
              name: fixit-cluster
              port:
                number: 7687
                number: 7474
                number: 7473
        - path: '/neo4j-admin/*'
          pathType: Prefix
          backend:
            service: 
              name: fixit-cluster-admin
              port:
                number: 6362
                number: 7687
                number: 7474
                number: 7473
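
As an aside on the Neo4j backends above: as far as I understand, a backend's service.port in networking.k8s.io/v1 takes a single number, so the repeated number keys seem to collapse into the last one (the describe output further down indeed shows only fixit-cluster:7473 and fixit-cluster-admin:7473). Exposing several ports of the same service would then mean one path per port, roughly like this (the /neo4j-browser and /neo4j-bolt paths are just illustrative):

        - path: '/neo4j-browser'
          pathType: Prefix
          backend:
            service:
              name: fixit-cluster
              port:
                number: 7474 # Neo4j HTTP browser
        - path: '/neo4j-bolt'
          pathType: Prefix
          backend:
            service:
              name: fixit-cluster
              port:
                number: 7687 # Bolt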

I'm expecting to go to http://fixit.westeurope.cloudapp.azure.com:8080/server/api and see the server's response for the /api endpoint, but the request times out in the browser. The pods and services deployed on the cluster are:

vincenzocalia@vincenzos-MacBook-Air helm_charts % kubectl get pod
NAME                                           READY   STATUS    RESTARTS   AGE
fixit-cluster-0                                1/1     Running   0          27m
fixit-server-868f657b64-hvmxq                  1/1     Running   0          27m
mongo-rs-0                                     2/2     Running   0          27m
mongodb-kubernetes-operator-7c5666c957-sscsf   1/1     Running   0          4h35m
vincenzocalia@vincenzos-MacBook-Air helm_charts % kubectl get svc  
NAME                       TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)                               AGE
fixit-cluster              ClusterIP      10.0.230.247   <none>         7687/TCP,7474/TCP,7473/TCP            27m
fixit-cluster-admin        ClusterIP      10.0.132.24    <none>         6362/TCP,7687/TCP,7474/TCP,7473/TCP   27m
kubernetes                 ClusterIP      10.0.0.1       <none>         443/TCP                               4h44m
mongo-rs-svc               ClusterIP      None           <none>         27017/TCP                             27m
server-clusterip-service   ClusterIP      10.0.242.65    <none>         3000/TCP                              27m
server-loadbalacer-svc     LoadBalancer   10.0.149.160   52.174.18.27   8080:32660/TCP                        4h41m

The ingress is deployed as

vincenzocalia@vincenzos-MacBook-Air helm_charts % kubectl describe ingress ingress-service
Name:             ingress-service
Labels:           app.kubernetes.io/managed-by=Helm
                  name=ingress-service
Namespace:        default
Address:          
Ingress Class:    <none>
Default backend:  <default>
Rules:
  Host                                 Path  Backends
  ----                                 ----  --------
  fixit.westeurope.cloudapp.azure.com  
                                       /server/*        server-clusterip-service:3000 (<none>)
                                       /neo4j/*         fixit-cluster:7473 (<none>)
                                       /neo4j-admin/*   fixit-cluster-admin:7473 (<none>)
Annotations:                           kubernetes.io/ingress.class: nginx
                                       meta.helm.sh/release-name: fixit-cluster
                                       meta.helm.sh/release-namespace: default
Events:                                <none>

and the server service is

vincenzocalia@vincenzos-MacBook-Air helm_charts % kubectl describe svc server-clusterip-service
Name:              server-clusterip-service
Namespace:         default
Labels:            app.kubernetes.io/managed-by=Helm
Annotations:       meta.helm.sh/release-name: fixit-cluster
                   meta.helm.sh/release-namespace: default
Selector:          app=fixit-server-pod
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                10.0.160.206
IPs:               10.0.160.206
Port:              server-clusterip-service  3000/TCP
TargetPort:        3000/TCP
Endpoints:         10.244.0.15:3000
Session Affinity:  None
Events:            <none>

I tried setting the paths both with and without /*, but it doesn't connect in either case. Is this setup even the right way to route external traffic to the cluster, or should I just use the ingress? I see that this setup was given as the solution (1st answer) to the question Kubernetes Load balancer without Label Selector, and though it looks like we're in the same situation, I'm on AKS, and the Azure docs https://learn.microsoft.com/en-us/azure/aks/ingress-basic?tabs=azure-cli are making me doubt my current setup. Can you spot what I'm setting up wrongly, if this setup isn't nonsense? Many thanks for the help.

UPDATE

As mentioned here https://learnk8s.io/terraform-aks, the option http_application_routing_enabled = true at cluster creation installs these add-ons:

vincenzocalia@vincenzos-MacBook-Air helm_charts % kubectl get pods -n kube-system | grep addon
addon-http-application-routing-external-dns-5d48bdffc6-q98nx      1/1     Running   0          26m
addon-http-application-routing-nginx-ingress-controller-5bcrf87   1/1     Running   0          26m

So the Ingress should point to that controller in its annotations and not specify a host. I changed the Ingress to:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-service
  annotations:
    # kubernetes.io/ingress.class: nginx
    kubernetes.io/ingress.class: addon-http-application-routing

    # nginx.ingress.kubernetes.io/rewrite-target: /


  labels:
    name: ingress-service
spec:
  rules:
  # - host: fixit.westeurope.cloudapp.azure.com #server.com 
    - http:
       paths:
         - path: '/server/*' # service
        #  - path: '/server' # service doesn't get an IP address
         # - path: '/*'
         # - path: '/'
           pathType: Prefix
           backend:
             service:
               name: server-clusterip-service
               port:
                 number: 3000
        # - path: '/neo4j/*'
        #   pathType: Prefix
        #   backend: 
        #     service:
        #       name: fixit-cluster
        #       port:
        #         number: 7687
        #         number: 7474
        #         number: 7473
        # - path: '/neo4j-admin/*'
        #   pathType: Prefix
        #   backend:
        #     service: 
        #       name: fixit-cluster-admin
        #       port:
        #         number: 6362
        #         number: 7687
        #         number: 7474
        #         number: 7473

and its output is now

vincenzocalia@vincenzos-MacBook-Air helm_charts % kubectl get ingress
NAME              CLASS    HOSTS   ADDRESS          PORTS   AGE
ingress-service   <none>   *       108.143.71.248   80      7s
vincenzocalia@vincenzos-MacBook-Air helm_charts % kubectl describe ingress ingress-service 
Name:             ingress-service
Labels:           app.kubernetes.io/managed-by=Helm
                  name=ingress-service
Namespace:        default
Address:          108.143.71.248
Ingress Class:    <none>
Default backend:  <default>
Rules:
  Host        Path  Backends
  ----        ----  --------
  *           
              /server/*   server-clusterip-service:3000 (10.244.0.21:3000)
Annotations:  kubernetes.io/ingress.class: addon-http-application-routing
              meta.helm.sh/release-name: fixit-cluster
              meta.helm.sh/release-namespace: default
Events:
  Type    Reason  Age                From                      Message
  ----    ------  ----               ----                      -------
  Normal  Sync    20s (x2 over 27s)  nginx-ingress-controller  Scheduled for sync

Now, going to http://108.143.71.248/server/api in the browser shows an Nginx 404 page.


Solution

  • I finally found the problem: it was my setup. I was using the default ingress controller and load balancer that get created when you set the option http_application_routing_enabled = true at cluster creation, which the docs discourage for production (https://learn.microsoft.com/en-us/azure/aks/http-application-routing). So the proper implementation is to install an ingress controller (https://learn.microsoft.com/en-us/azure/aks/ingress-basic?tabs=azure-cli), which hooks up to the internal load balancer, so there is no need to create one yourself. The ingress controller will accept an IP address for the load balancer, but you have to create the PublicIP in the node resource group, because that is where it will look for it, not in the cluster resource group (see the difference between the two here: https://learn.microsoft.com/en-us/azure/aks/faq#why-are-two-resource-groups-created-with-aks).

    So the working configuration is now:

    main

    terraform {
      required_version = ">=1.1.0"
      required_providers {
        azurerm = {
          source = "hashicorp/azurerm"
           version = "~> 3.0.2"
        }
      }
    }
    
    provider "azurerm" {
      features {
        resource_group {
          prevent_deletion_if_contains_resources = false
        }
      }
      subscription_id   = var.azure_subscription_id
      tenant_id         = var.azure_subscription_tenant_id
      client_id         = var.service_principal_appid
      client_secret     = var.service_principal_password
    }
    
    
    provider "kubernetes" {
      host = "${module.cluster.host}"
      client_certificate = "${base64decode(module.cluster.client_certificate)}"
      client_key = "${base64decode(module.cluster.client_key)}"
      cluster_ca_certificate = "${base64decode(module.cluster.cluster_ca_certificate)}"
    }
    
    provider "helm" {
      kubernetes {
        host                   = "${module.cluster.host}"
        client_certificate     = "${base64decode(module.cluster.client_certificate)}"
        client_key             = "${base64decode(module.cluster.client_key)}"
        cluster_ca_certificate = "${base64decode(module.cluster.cluster_ca_certificate)}"
      }
    }
    
    
    
    module "cluster" {
      source = "./modules/cluster"
      location = var.location
      vm_size = var.vm_size
      resource_group_name = var.resource_group_name
      node_resource_group_name = var.node_resource_group_name
      kubernetes_version = var.kubernetes_version
      ssh_key = var.ssh_key
      sp_client_id = var.service_principal_appid
      sp_client_secret = var.service_principal_password
    }
    
    
    
    
    module "ingress-controller" {
      source = "./modules/ingress-controller"
      public_ip_address = module.cluster.public_ip_address
      depends_on = [
        module.cluster.public_ip_address
      ]
    }
    

    cluster

    resource "azurerm_resource_group" "resource_group" {
      name     = var.resource_group_name
      location = var.location
        tags = {
        Environment = "test"
        Team = "DevOps"
      }
    }
    resource "azurerm_kubernetes_cluster" "server_cluster" {
      name                = "server_cluster"
      ### choose the resource group to use for the cluster
      location            = azurerm_resource_group.resource_group.location
      resource_group_name = azurerm_resource_group.resource_group.name
      ### name of the cluster "node" resource group; if unset it will be named automatically
      node_resource_group = var.node_resource_group_name
      dns_prefix          = "fixit"
      kubernetes_version = var.kubernetes_version
      # sku_tier = "Paid"
    
      default_node_pool {
        name       = "default"
        node_count = 1
        min_count = 1
        max_count = 3
        vm_size = var.vm_size
    
        type = "VirtualMachineScaleSets"
        enable_auto_scaling = true
        enable_host_encryption = false
        # os_disk_size_gb = 30
      }
    
      service_principal {
        client_id = var.sp_client_id
        client_secret = var.sp_client_secret
      }
    
      tags = {
        Environment = "Production"
      }
    
      linux_profile {
        admin_username = "azureuser"
        ssh_key {
            key_data = var.ssh_key
        }
      }
      network_profile {
          network_plugin = "kubenet"
          load_balancer_sku = "basic" 
        
      }
      http_application_routing_enabled = false
      depends_on = [
        azurerm_resource_group.resource_group
      ]
    }
    
    resource "azurerm_public_ip" "public-ip" {
      name                = "fixit-public-ip"
      location            = var.location
      # resource_group_name = var.resource_group_name
      resource_group_name = var.node_resource_group_name
      allocation_method   = "Static"
      domain_name_label = "fixit"
      # sku = "Standard"
    
      depends_on = [
        azurerm_kubernetes_cluster.server_cluster
      ]
    }
    

    ingress controller

    resource "helm_release" "nginx" {
      name      = "ingress-nginx"
      repository = "ingress-nginx"
      chart     = "ingress-nginx/ingress-nginx"
      namespace = "default"
    
      set {
        name  = "controller.service.externalTrafficPolicy"
        value = "Local"
      }
    
      set {
        name = "controller.service.annotations.service.beta.kubernetes.io/azure-load-balancer-internal"
        value = "true"
      }
    
      set {
        name  = "controller.service.loadBalancerIP"
        value = var.public_ip_address
      }
    
      set {
        name  = "controller.service.annotations.service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path"
        value = "/healthz"
      }
    } 
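
    For readability, the four set blocks above express roughly this values.yaml for the ingress-nginx chart (when annotation keys are passed through set, the dots inside them usually have to be escaped as \. so Helm doesn't treat them as nested keys; <public ip address> stands for var.public_ip_address):

    controller:
      service:
        externalTrafficPolicy: Local
        loadBalancerIP: <public ip address>
        annotations:
          service.beta.kubernetes.io/azure-load-balancer-internal: "true"
          service.beta.kubernetes.io/azure-load-balancer-health-probe-request-path: /healthz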
    

    ingress service

    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: ingress-service
      # namespace: default
      annotations:
        nginx.ingress.kubernetes.io/ssl-redirect: "false"
        nginx.ingress.kubernetes.io/use-regex: "true"
        nginx.ingress.kubernetes.io/rewrite-target: /$2$3$4
    spec:
      ingressClassName: nginx
      rules:
        # - host: fixit.westeurope.cloudapp.azure.com #dns from Azure PublicIP
    
    
    ### Node.js server
      - http:
          paths:
          - path: /(/|$)(.*)
            pathType: Prefix
            backend:
              service:
                name: server-clusterip-service
                port:
                  number: 80 
    
      - http:
          paths:
          - path: /server(/|$)(.*)
            pathType: Prefix
            backend:
              service:
                name: server-clusterip-service
                port:
                  number: 80
    ...
    
    other services omitted
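
    To make the regex and rewrite-target combination concrete, this is how a request to the server path gets rewritten (an illustration, not extra configuration):

    # request path:   /server/api
    # regex path:     /server(/|$)(.*)  ->  $1 = "/", $2 = "api"
    # rewrite-target: /$2$3$4           ->  /api  ($3 and $4 are empty for this rule)
    # so the Node.js app receives the request as /api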
    

    Hope this can help getting the setup right. Cheers.