I have Prometheus set up via Helm from Terraform, and it is configured to connect to my Kubernetes cluster. When I open Prometheus, I am not sure which metric to choose from the list to view the CPU/memory usage of running pods/jobs. Here are all the running pods, listed with this command (test1 is the Kubernetes namespace):
kubectl -n test1 get pods
In Prometheus, I see many metrics related to CPU, but I am not sure which one to choose:
I tried one, but its namespace is prometheus and it comes from prometheus-node-exporter, and I don't see my cluster or my namespace test1 anywhere here.
Could you please help me? Thank you very much in advance.
UPDATE SCREENSHOT
I need to concentrate on one specific namespace. Normally I find it with this command:
kubectl get pods --all-namespaces | grep hermatwin
The first line of the output shows namespace = jobs, so I think this is the namespace I need.
No result when I set the calendar to last Friday:
UPDATE SCREENSHOT April 20
I tried to select a 2-day range starting last Saturday, 17 April, but I don't see any result:
And if I remove the namespace="jobs" condition, I don't see any result either:
I reran the job (simulation jobs) just now and executed the Prometheus query while the job was still running, but I still don't get any result :-( Here you can see that my jobs were running.
When I use a simple filter, just container_cpu_usage_seconds_total, I can see namespace="jobs".
node_cpu_seconds_total is a metric from node-exporter, the exporter that provides machine statistics; its metrics are prefixed with node_. You need metrics from cAdvisor, which produces container-related metrics prefixed with container_:
container_cpu_usage_seconds_total
container_cpu_load_average_10s
container_memory_usage_bytes
container_memory_rss
Here are some basic queries to get you started. Expect that they may need tweaking (your label names may differ):
sum(irate(container_cpu_usage_seconds_total{container!="POD", container=~".+"}[2m])) by (pod)
sum(container_memory_usage_bytes{container!="POD", container=~".+"}) by (pod)
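To focus on a single namespace such as jobs, the same queries can carry a namespace label matcher. The following is only a sketch: the helper names pod_query and query_url are hypothetical, the Prometheus address http://localhost:9090 is an assumption, and the namespace label may be named differently in your setup. It builds the query strings and the URL for an instant query against the Prometheus HTTP API without sending any request.

```python
from urllib.parse import urlencode


# Hypothetical helper: the name and parameters are illustrative only.
def pod_query(metric, namespace, rate_window=None):
    """Build a per-pod PromQL expression restricted to one namespace."""
    selector = (
        f'{metric}{{namespace="{namespace}", '
        f'container!="POD", container=~".+"}}'
    )
    if rate_window:
        # Counters such as *_seconds_total need a rate function over a window.
        selector = f"irate({selector}[{rate_window}])"
    return f"sum({selector}) by (pod)"


# Hypothetical helper: builds the URL of an instant query (/api/v1/query).
def query_url(base_url, promql):
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"


cpu = pod_query("container_cpu_usage_seconds_total", "jobs", rate_window="2m")
mem = pod_query("container_memory_usage_bytes", "jobs")

print(cpu)
print(mem)
# The address below is an assumption; use your own Prometheus endpoint.
print(query_url("http://localhost:9090", mem))
```

The printed expressions can be pasted into the Prometheus expression browser as they are. If they return nothing, running the bare metric name first and checking which labels it actually carries is usually the quickest way to find the right matcher.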
Beware that pods with host network mode (not isolated) show the traffic rate for the whole node. The * 8 converts bytes to bits for convenience (MBit/s, GBit/s, etc.).
# incoming
sum(irate(container_network_receive_bytes_total[2m])) by (pod) * 8
# outgoing
sum(irate(container_network_transmit_bytes_total[2m])) by (pod) * 8
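For intuition about what irate(...[2m]) and the * 8 do in the queries above, here is a simplified model, not Prometheus's actual code: irate looks at the last two samples of a counter inside the window and divides the increase by the time between them. The function names and the sample values are made up for illustration.

```python
def instant_rate(samples):
    """Per-second rate from the last two (timestamp_seconds, value) samples."""
    if len(samples) < 2:
        return None  # fewer than two samples in the window: no result
    (t_prev, v_prev), (t_last, v_last) = samples[-2], samples[-1]
    # A counter that went down is treated as having been reset to zero.
    increase = v_last if v_last < v_prev else v_last - v_prev
    return increase / (t_last - t_prev)


def bytes_to_mbit(bytes_per_second):
    """Convert bytes/s to Mbit/s: multiply by 8, then divide by 10^6."""
    return bytes_per_second * 8 / 1_000_000


# Made-up samples of container_network_receive_bytes_total, 30 s apart.
samples = [(0, 1_000_000), (30, 4_000_000), (60, 11_500_000)]

rate = instant_rate(samples)  # (11_500_000 - 4_000_000) / 30 bytes/s
print(rate, bytes_to_mbit(rate))
```

This also shows why a window that contains fewer than two samples produces an empty result: there is nothing to take a difference of.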