google-kubernetes-engine prometheus prometheus-operator kube-proxy kube-prometheus-stack

Getting KubeControllerManager, KubeProxy, and KubeScheduler down alerts in kube-prometheus-stack installed on GKE


I just installed the latest kube-prometheus-stack (kube-prometheus-stack-37.2.0) with default settings in my GKE cluster:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus prometheus-community/kube-prometheus-stack --namespace monitoring

I started getting three alerts (KubeControllerManager, KubeProxy, and KubeScheduler down). After some research I found that I need to change kube-proxy's metricsBindAddress to 0.0.0.0:10249 in its ConfigMap, but I can't find a kube-proxy or kube-proxy-config ConfigMap in the kube-system namespace. I'm not sure how to fix this issue.


Solution

  • I couldn't find any answer on how to fix Prometheus not being able to connect to KubeControllerManager, KubeProxy, and KubeScheduler in GKE, so I had to disable monitoring of these components in Prometheus for now. Here is how I did it, in case it helps someone else.

    Create a YAML file (kube-prometheus-stack-overrides.yaml) with the following content:

    ## Component scraping the kube controller manager
    ##
    kubeControllerManager:
      enabled: false
    ## Component scraping kube proxy
    ##
    kubeProxy:
      enabled: false
      service:
        enabled: true
        port: 10249
        targetPort: 10249
    ## Component scraping kube scheduler
    ##
    kubeScheduler:
      enabled: false
    

    So basically, this file tells Prometheus not to monitor these components. This is probably not a good idea, but until we find a better solution, it will do.

    Then run the following command:

    helm upgrade -f kube-prometheus-stack-overrides.yaml [release name] prometheus-community/kube-prometheus-stack -n [namespace where stack is installed]
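
For reference, here is a rough sketch of both steps as one script. It assumes the release name (prometheus) and namespace (monitoring) from the install command in the question, so adjust them for your setup. It also writes a shorter overrides file with only the three enabled flags, on the assumption that the service sub-block under kubeProxy just restates the chart defaults. The RELEASE, NAMESPACE, OVERRIDES, and APPLY variables are illustrative names, not part of Helm. By default the script only prints the upgrade command.

```shell
#!/bin/sh
set -eu

# Assumed values, taken from the question's install command; adjust as needed.
RELEASE="${RELEASE:-prometheus}"
NAMESPACE="${NAMESPACE:-monitoring}"
OVERRIDES="${OVERRIDES:-kube-prometheus-stack-overrides.yaml}"

# Write a minimal overrides file with only the three flags being changed.
# Assumption: the kubeProxy service sub-block in the answer matches chart defaults.
cat > "$OVERRIDES" <<'EOF'
kubeControllerManager:
  enabled: false
kubeProxy:
  enabled: false
kubeScheduler:
  enabled: false
EOF

# Sanity check: count the disabled components before touching the cluster.
disabled=$(grep -c 'enabled: false' "$OVERRIDES")
echo "$disabled components disabled in $OVERRIDES"

# The upgrade command from the answer, with the placeholders filled in.
# Optionally add --version 37.2.0 to stay on the chart version from the question.
upgrade_cmd="helm upgrade -f $OVERRIDES $RELEASE prometheus-community/kube-prometheus-stack -n $NAMESPACE"

# Print the command by default; set APPLY=1 to run it against the cluster.
if [ "${APPLY:-0}" = "1" ]; then
  $upgrade_cmd
else
  echo "$upgrade_cmd"
fi
```

Run it once as-is to check the generated file and the printed command, then again with APPLY=1 to perform the upgrade.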