
Prometheus (node_exporter) issue when upgrading GKE from 1.15 to 1.16


I have been running Prometheus and Grafana on Kubernetes in Google GKE for many months. For example, in Grafana I monitor container_cpu_usage_seconds_total.

But since I upgraded my GKE nodes from 1.15 to 1.16, I have lost all container_* metrics.

To test this, I created a new cluster on version 1.15. I installed Prometheus from the Google Marketplace and upgraded GKE step by step until the issue appeared. Again, the container_* monitoring stopped at version 1.16.

Here you can see container_cpu_usage_seconds_total: it stopped when I upgraded the nodes. There are 3 nodes.

Am I the only one with this issue? Has anyone found a solution?

Thanks for your help :)

Valentin


Solution

  • I found what was going wrong. With Docker or Kubernetes, node-exporter does not send pod metrics (container_*); cAdvisor must be installed for that. (In the Google Marketplace deployment, cAdvisor is installed in the node-exporter image.) Since Kubernetes 1.16, that cAdvisor configuration is wrong, so you need to edit the configuration to solve the issue.

    All the details are in this post: Prometheus not receiving metrics from cadvisor in GKE
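
    To confirm whether the container_* series are really missing (and whether they come back after editing the configuration), you can ask Prometheus directly instead of relying on the Grafana graph. Below is a minimal sketch in Python that uses the standard /api/v1/query endpoint of the Prometheus HTTP API. The base URL (for example a local port-forward to http://localhost:9090), the helper names and the grouping label "instance" are assumptions for illustration; the labels available on your series depend on your scrape configuration.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    query_string = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def series_per_label(payload: dict, label: str = "instance") -> dict:
    """Turn the result of a `count by (<label>) (...)` query into a dict.

    `payload` is the decoded JSON body returned by /api/v1/query.
    An empty dict means Prometheus currently has no matching series.
    """
    if payload.get("status") != "success":
        raise ValueError("query failed: " + str(payload.get("error", "unknown error")))
    counts = {}
    for item in payload["data"]["result"]:
        key = item["metric"].get(label, "<none>")
        # Each instant-vector sample is [timestamp, "value-as-string"].
        counts[key] = int(float(item["value"][1]))
    return counts


def check_container_metrics(base_url: str, label: str = "instance") -> dict:
    """Count container_cpu_usage_seconds_total series per label value."""
    promql = "count by (%s) (container_cpu_usage_seconds_total)" % label
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=10) as response:
        return series_per_label(json.load(response), label)
```

    Calling check_container_metrics("http://localhost:9090") returns one entry per node that still reports the metric. An empty result after the upgrade matches the symptom described in the question.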