Tags: prometheus, grafana, amazon-eks, prometheus-operator, prometheus-node-exporter

Master Prometheus is not able to scrape container metrics from an EKS cluster in AWS


I have an AWS account with two EKS clusters, say EKS_A and EKS_B. EKS_A is in us-east-1 and EKS_B is in us-west-1, both in the same AWS account. On each of these EKS clusters, I have a Prometheus namespace running the following pods:

pod/kube-state-metrics
pod/prometheus-alertmanager
pod/prometheus-node-exporter
pod/prometheus-pushgateway
pod/prometheus-server

daemonset.apps/prometheus-node-exporter 

deployment.apps/kube-state-metrics
deployment.apps/prometheus-pushgateway 

Each of these EKS clusters exposes its metrics through its own endpoint, and both endpoints are consumed by a master Prometheus (which has a web UI to show the metrics) set up in a separate Kubernetes cluster that is not part of AWS.
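A setup like this is typically done with Prometheus federation: the master Prometheus scrapes the `/federate` endpoint of each cluster-local Prometheus. The sketch below shows what such a federation job might look like on the master; the target hostnames are placeholders, not details from the original post:

```yaml
# Hypothetical federation job on the master Prometheus.
# The target hostnames are placeholders for the two EKS endpoints.
scrape_configs:
  - job_name: 'federate-eks'
    scrape_interval: 30s
    honor_labels: true            # keep the job/instance labels from the downstream Prometheus
    metrics_path: '/federate'
    params:
      'match[]':
        - '{job=~".+"}'           # pull series for all jobs from the downstream Prometheus
    static_configs:
      - targets:
          - 'prometheus.eks-a.example.com'   # EKS_A in us-east-1
          - 'prometheus.eks-b.example.com'   # EKS_B in us-west-1
```

If the `match[]` selector is narrower (e.g. matching specific job names), any series whose labels do not match it will silently not appear on the master, which is relevant to the problem described below.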

Now the problem is: the master Prometheus is able to show all the metrics scraped from the EKS_A cluster in us-east-1, but it is not able to show the container-related metrics from the EKS_B cluster in us-west-1.

This means the container metrics below are available in the master Prometheus for the EKS_A cluster, but they are missing for the EKS_B cluster:

container_cpu_cfs_periods_total
container_cpu_cfs_throttled_periods_total
container_cpu_cfs_throttled_seconds_total
container_cpu_load_average_10s
container_cpu_system_seconds_total
container_cpu_usage_seconds_total
container_cpu_user_seconds_total
container_file_descriptors
container_fs_inodes_free
container_fs_inodes_total
container_fs_io_current
container_fs_io_time_seconds_total
container_fs_io_time_weighted_seconds_total
container_fs_limit_bytes
container_fs_read_seconds_total

Please note that the master Prometheus UI is able to show all the metrics from the EKS_B cluster except the container_* metrics listed above.

Any idea why this could be happening and how to resolve it?

Thank you


Solution

  • cAdvisor monitors resource usage and analyzes the performance of containers; the container_* metrics come from it. In the Prometheus config file, instead of using the job name cadvisor, I had used Kubernetes-cadvisor, which caused this issue. After changing Kubernetes-cadvisor to cadvisor, the issue was resolved.
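For illustration, the relevant scrape job on the EKS_B Prometheus might look like the sketch below (field values are typical for scraping cAdvisor through the kubelet and are assumptions, not taken from the original post). The fix was the `job_name` line:

```yaml
# Sketch of the cAdvisor scrape job in the EKS cluster's Prometheus config.
# The job label produced here must match whatever the master/federation
# side selects on, so the name matters.
scrape_configs:
  - job_name: 'cadvisor'          # was 'Kubernetes-cadvisor', which broke the metrics on the master
    scheme: https
    metrics_path: /metrics/cadvisor
    kubernetes_sd_configs:
      - role: node                # one target per node, via the kubelet
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
```

Since the scrape job's name becomes the `job` label on every series it produces, a mismatched name means the container_* series carry an unexpected label value and fall outside any selector that expects `job="cadvisor"`, which is why only those metrics disappeared while everything else federated fine.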