apache-spark kubernetes calico kubernetes-networkpolicy google-spark-operator

Calico's network policy can't select kubernetes.default service


I'm using Google's spark-operator together with some Calico network policies to protect the namespaces.

The Spark driver pods need to be able to communicate with the kubernetes service in the default namespace in order to talk to the api-server.
Instead, this is what I get:

Operation: [get]  for kind: [Pod]  with name: [xx]  in namespace: [xx]  failed.

The problem is:
Applying any kind of network policy blocks communication toward the default namespace. Connectivity to the pods there can be restored, but selecting the kubernetes.default service is still impossible because it is a special service (it has no selectors)... And so you can't communicate with it!


I tried opening communication to all pods in the default and kube-system namespaces. That works for every service except kubernetes.default, which is still unreachable!

apiVersion: projectcalico.org/v3
kind: NetworkPolicy
metadata:
  name: mynetpol
spec:
  selector: all()
  types:
    - Egress
  egress:

    # Allows comm to kube-system namespace
    - action: Allow
      destination:
        selector: all()
        namespaceSelector: ns == 'kube-system'
    - action: Allow
      source:
        selector: all()
        namespaceSelector: ns == 'kube-system'

    # Allows comm to default namespace
    - action: Allow
      destination:
        selector: all()
        namespaceSelector: ns == 'default'
    - action: Allow
      source:
        selector: all()
        namespaceSelector: ns == 'default'

For some reason, curling kubernetes.default.svc.cluster.local:443 times out even though all communication is wide open.


Solution

  • So... In the end...

    Network policies select pods, so they don't work on services that don't target pods. That is the case for this particular kubernetes service sitting quietly in the default namespace: it's a special service, with no selector, that always points to the api-server.


    The solution is to retrieve the api-server's real IP and allow egress to it.

    To find this IP, you can use this command:

    kubectl get endpoints --namespace default kubernetes
    

    Courtesy of @Dave McNeill

    Then you can allow this IP, along with the port listed next to it in the ENDPOINTS column, in your network policy.

    • If you are using the default netpol API, check Dave's answer https://stackoverflow.com/a/56494510/5512455

    • If you are using the Calico policies, which I encourage you to do because the Kube ones suck, here is the working YAML:

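    apiVersion: projectcalico.org/v3
    kind: NetworkPolicy
    metadata:
      name: allow-egress-api-server
    spec:
      selector: all()
      types:
        - Egress
      egress:

        # Allow api-server
        - action: Allow
          protocol: TCP
          destination:
            nets:
            # IP reported by `kubectl get endpoints --namespace default kubernetes`
            - <Your api-server IP>/32
            ports:
            # Use the endpoint port from that same output (6443 here), not the service port 443
            - 6443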
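
    For readers on the default Kubernetes NetworkPolicy API, the same idea can be sketched with an ipBlock rule. This is only a minimal illustration and not a copy of the linked answer: the namespace my-spark-namespace and the address 192.0.2.10 are placeholders I made up, and the port assumes the api-server endpoint listens on 6443 as in the Calico example above.

    ```yaml
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-egress-api-server
      namespace: my-spark-namespace   # hypothetical namespace, replace with your own
    spec:
      podSelector: {}                 # empty selector = every pod in this namespace
      policyTypes:
        - Egress
      egress:
        # Allow api-server
        - to:
            - ipBlock:
                cidr: 192.0.2.10/32   # placeholder, replace with your api-server endpoint IP
          ports:
            - protocol: TCP
              port: 6443              # endpoint port from `kubectl get endpoints`, assumed 6443
    ```

    If the endpoints command lists several api-server addresses, each one would presumably need its own ipBlock entry under `to`.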