I'm trying to understand the context-switching rate on my system (running on AWS EC2), and where the switches are coming from. Just getting the number is already confusing, as two tools that I know can output such a metric give me different results. Here's the output from vmstat:
$ vmstat -w 2
procs -------------------memory------------------ ---swap-- -----io---- --system-- -----cpu-------
r b swpd free buff cache si so bi bo in cs us sy id wa st
8 0 0 443888 492304 8632452 0 0 0 1 0 0 14 2 84 0 0
37 0 0 444820 492304 8632456 0 0 0 20 131602 155911 43 5 52 0 0
8 0 0 445040 492304 8632460 0 0 0 42 131117 147812 46 4 50 0 0
13 0 0 446572 492304 8632464 0 0 0 34 129154 142260 49 4 46 0 0
The number is ~140k-160k/sec.
But perf tells something else:
$ sudo perf stat -a
Performance counter stats for 'system wide':
2980794.013800 cpu-clock (msec) # 35.997 CPUs utilized
12,335,935 context-switches # 0.004 M/sec
2,086,162 cpu-migrations # 0.700 K/sec
11,617 page-faults # 0.004 K/sec
...
0.004 M/sec is apparently 4k/sec.
Why is there a disparity between the two tools? Am I misinterpreting something in either of them, or are their CS metrics somehow different?
FWIW, I've tried doing the same on a machine running a different workload, and the difference there is even twice larger.
Environment:
Some recent versions of perf
have a unit-scaling bug in the printing code. Manually do 12.3M / wall-time and see if that's sane. (spoiler alert: it is according to OP's comment.)
https://lore.kernel.org/patchwork/patch/1025968/
Commit 0aa802a79469 ("perf stat: Get rid of extra clock display function") introduced the bug in mainline Linux 4.19-rc1 or so.
Thus,
perf_stat__update_shadow_stats()
now saves scaled values of clock events in msecs, instead of original nsecs. But while calculating values of shadow stats we still consider clock event values in nsecs. This results in a wrong shadow stat values.
Commit 57ddf09173c1 on Mon, 17 Dec 2018 fixed it in 5.0-rc1, eventually being released with perf
upstream version 5.0.
Vendor kernel trees that cherry-pick commits for their stable kernels might have the bug or have fixed the bug earlier.