Understanding the CPU Busy Prometheus query

I am new to Grafana and Prometheus. I have read a lot of documentation, and now I'm trying to work backwards by reviewing some existing queries and making sure I understand them.

I have downloaded the Node Exporter Full dashboard (https://grafana.com/grafana/dashboards/1860). I have been reviewing the CPU Busy query and I'm a bit confused. I am quoting it below, spaced out so we can see the nested sections better:

In this query, job is node-exporter while instance is the IP and port of the server. This is my base understanding of the query: node_cpu_seconds_total is a counter of the number of seconds the CPU took at a given sample.

1. Line 5: Get CPU seconds at a given instant, broken down by the individual CPU cores
2. Line 4: Add up all CPU seconds across all cores
3. Line 3: Why is there an additional count()? Does it do anything?
4. Line 12: Range vector - get CPU seconds of when the CPU was idle over the given rate period
5. Line 11: Take a rate to transform that into the rate of change of CPU seconds (and return an instant vector)
6. Line 10: Sum up all rates, broken down by CPU modes
7. Line 9: Take the single average rate across all CPU mode rates
8. Line 8: Subtract the average rate of change (Line 9) from total CPU seconds (Line 3)
9. Line 16: Multiply by 100 to convert minutes to seconds
10. Lines 18-20: Divide Line 19 by the count of the count of all CPU seconds across all CPUs

My questions are as follows:

I would have thought that CPU usage would simply be (all non-idle CPU usage) / (total CPU usage). I therefore don't understand why rate is taken into account at all (#6 and #8).
The numerator here seems to be trying to get all non-idle usage and does so by getting the full sum and subtracting the idle time. But why does one use count and the other sum?
If we grab cpu seconds by filtering by mode=idle, then does adding the by (mode) add anything? There is only one mode anyways? My understanding of by (something) is more relevant when there are multiple values and we group the values by that category (as we do by cpu in this query)
Lastly, as asked above, what is with the double count() in the numerator and denominator?

Solution

Both of these count functions return the number of CPU cores. If you take them out of this long query and run them on their own, it'll immediately make sense:

count by (cpu) (node_cpu_seconds_total{instance="foo:9100"})

# result:
{cpu="0"} 8
{cpu="1"} 8

By wrapping the above in another count() function, you will get a value of 2, because the inner result contains just 2 series (one per CPU core). At this point, we can simplify the original query to this:

(
  (
    NUM_CPU
    -
    avg(
      sum by(mode) (
        rate(node_cpu_seconds_total{mode="idle",instance="foo:9100"}[1m])
      )
    )
  )
  * 100
)
/ NUM_CPU
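To make the double count() concrete, here is a minimal Python sketch that mimics the two steps on a made-up set of label sets. The data (2 cores, 8 modes each) and the names `series`, `per_cpu`, and `num_cpu` are assumptions for illustration only; nothing here queries Prometheus.

```python
from collections import Counter

# Hypothetical label sets: 2 cores x 8 modes, mirroring the result shown above.
MODES = ["idle", "iowait", "irq", "nice", "softirq", "steal", "system", "user"]
series = [{"cpu": str(cpu), "mode": mode} for cpu in range(2) for mode in MODES]

# Inner step, like `count by (cpu) (...)`: number of series per cpu label.
per_cpu = Counter(s["cpu"] for s in series)

# Outer step, like `count(...)`: number of elements the inner step returned.
num_cpu = len(per_cpu)

print(dict(per_cpu))  # {'0': 8, '1': 8}
print(num_cpu)        # 2
```

The inner count says how many modes each core reports; only the outer count, which counts the groups themselves, gives the number of cores.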

The rest, however, is somewhat complicated. This:

    sum by(mode) (
      rate(node_cpu_seconds_total{mode="idle",instance="foo:9100"}[1m])
    )

... is essentially the sum of the idle-time rates of all CPU cores (I'm intentionally skipping the context of time to make it simpler). It's not clear why there is a by (mode), since the selector inside rate() already filters on mode="idle", so only the idle mode can appear. With or without by (mode) it returns just one value:

# with by (mode)
{mode="idle"} 0.99

# without
{} 0.99
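As for why rate() is involved at all: node_cpu_seconds_total is a cumulative counter, so its raw value only ever grows and says little by itself. The per-second increase of the idle counter is the fraction of each second a core spent idle. The sketch below illustrates that with invented samples; `simple_rate` is a hypothetical helper, and it is deliberately cruder than the real rate(), which also handles counter resets and extrapolates to the window boundaries.

```python
# Hypothetical samples of node_cpu_seconds_total{mode="idle"} taken 60 s apart,
# stored as (timestamp_seconds, counter_value) pairs per core.
samples = {
    "0": [(0.0, 1000.0), (60.0, 1030.0)],  # idle for 30 of the 60 seconds
    "1": [(0.0, 2000.0), (60.0, 2045.0)],  # idle for 45 of the 60 seconds
}


def simple_rate(points):
    """Per-second increase between the first and last sample (no reset handling)."""
    (t0, v0), (t1, v1) = points[0], points[-1]
    return (v1 - v0) / (t1 - t0)


# One idle rate per core, each between 0 (never idle) and 1 (always idle).
idle_rate = {cpu: simple_rate(pts) for cpu, pts in samples.items()}

# Summing over cores, as sum by (mode) does when only mode="idle" is selected.
idle_sum = sum(idle_rate.values())

print(idle_rate)  # {'0': 0.5, '1': 0.75}
print(idle_sum)   # 1.25
```

With these numbers the summed idle rate is 1.25 out of a possible 2.0, which is why the query compares it against the number of cores.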

Applying avg() on top of that single value makes no sense at all, since the average of one value is the value itself. I assume that the intention was to get the idle time per CPU (by (cpu), that is). In that case it starts to make sense, although it is still unnecessarily complex. Thus, at this point we can simplify the query to this:

(NUM_CPU - IDLE_TIME_TOTAL) * 100 / NUM_CPU

I don't know why it is so complicated; you can get the same result with a simple query like this:

100 * (1 - avg(rate(node_cpu_seconds_total{mode="idle", instance="foo:9100"}[1m])))
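
A quick arithmetic check of that equivalence, as a sketch with invented per-core idle rates. It assumes the multiplication by 100 in the long form applies to the whole difference (core count minus total idle rate), and the variable names are hypothetical.

```python
# Hypothetical per-core idle rates (fraction of each second spent idle).
idle_rate = {"0": 0.5, "1": 0.75}

num_cpu = len(idle_rate)              # what count(count by (cpu) (...)) yields
idle_total = sum(idle_rate.values())  # summed idle rate over all cores

# Long form: subtract the total idle rate from the core count,
# scale to percent, then divide by the core count.
busy_long = (num_cpu - idle_total) * 100 / num_cpu

# Short form: 100 * (1 - average idle rate per core).
busy_short = 100 * (1 - idle_total / num_cpu)

print(busy_long, busy_short)  # 37.5 37.5
```

Both forms reduce to 100 * (1 - idle_total / num_cpu), so the short query drops the explicit core count without changing the result.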