Tags: x86, intel, perf, amd-processor, energy

How to access RAPL via perf on Rocket Lake?


I have a Rocket Lake CPU (11900K), but perf does not yet support the power events on it. How can I access them?

The perf events list:

pastebin.com + tcsSdxUx

My OS: Ubuntu 20.10 Kernel 5.12-RC6 perf version: 5.12-RC6

I can read the RAPL values with rapl-read.c (link: http://web.eece.maine.edu/~vweaver/projects/rapl/).

But rapl-read.c cannot be used to profile a running program. I want to profile a running program using not only the power events but also cycles, branches, etc. SoCwatch from Intel cannot do all of that.

Is there any way to add Rocket Lake power event support to perf? I don't know the raw power event counters.

update #1:

the uname -a output:

Linux u128 5.12.0-051200rc6-generic #202104042231 SMP Sun Apr 4 22:33:57 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

update #2: rapl-read -m output

RAPL read -- use -s for sysfs, -p for perf_event, -m for msr

Found RocketLake Processor type 0 (0), 1 (0), 2 (0), 3 (0), 4 (0), 5 (0), 6 (0), 7 (0)

    Detected 8 cores in 1 packages

Trying /dev/msr interface to gather results

    Listing paramaters for package #0
            Power units = 0.125W
            CPU Energy units = 0.00006104J
            DRAM Energy units = 0.00006104J
            Time units = 0.00097656s

            Package thermal spec: 125.000W
            Package minimum power: 0.000W
            Package maximum power: 0.000W
            Package maximum time window: 0.000000s
            Package power limits are unlocked
            Package power limit #1: 4095.875W for 0.108398s (enabled, not_clamped)
            Package power limit #2: 4095.875W for 0.032227s (disabled, not_clamped)
    PowerPlane1 (on-core GPU if avail) 0 policy: 16


    Sleeping 1 second

    Package 0:
            Package energy: 62.846985J
            PowerPlane0 (cores): 45.371277J
            PowerPlane1 (on-core GPU if avail): 0.000000 J
            DRAM: 0.000000J
            PSYS: -0.000000J

Note: the energy measurements can overflow in 60s or so so try to sample the counters more often than that.

Update #3: I found that it is hard to get the total energy consumption simply by reading the RAPL MSR:

ujtoj=1000000;
bgn_energy=$(rdmsr -d 0x611);
time sh doit.sh;
end_energy=$(rdmsr -d 0x611);
printf '%.3f\n' "$(((end_energy - bgn_energy)/ujtoj))e-3"

Output:

real    2m58.411s
user    2m58.068s
sys     0m0.168s
0.197

doit.sh is a shell script that runs the SPEC CPU2017 500.perlbench test. On Zen 3, the energy consumption (power/energy_pkg) reported by perf stat is about 7486.61 J, much higher than the "0.197" obtained by simply using rdmsr.

Update #4: I have now found another way to solve my problem. It is easy to add RKL support by adding a few "#define" and "case:" lines.

--------------------------------------------------------------------------------
CPU name:   11th Gen Intel(R) Core(TM) i9-11900K @ 3.50GHz
CPU type:   Intel Rocketlake processor
CPU clock:  3.50 GHz
--------------------------------------------------------------------------------
Group 1: ENERGY
+-----------------------+---------+--------------+
|         Event         | Counter |  HWThread 6  |
+-----------------------+---------+--------------+
|   INSTR_RETIRED_ANY   |  FIXC0  | 996795747147 |
| CPU_CLK_UNHALTED_CORE |  FIXC1  | 321084408076 |
|  CPU_CLK_UNHALTED_REF |  FIXC2  | 216809163858 |
|       TEMP_CORE       |   TMP0  |           65 |
|     PWR_PKG_ENERGY    |   PWR0  |    4050.2952 |
|     PWR_PP0_ENERGY    |   PWR1  |    2982.4675 |
|    PWR_DRAM_ENERGY    |   PWR3  |            0 |
+-----------------------+---------+--------------+
+----------------------+------------+
|        Metric        | HWThread 6 |
+----------------------+------------+
|  Runtime (RDTSC) [s] |    62.0025 |
| Runtime unhalted [s] |    91.6329 |
|      Clock [MHz]     |  5189.3093 |
|          CPI         |     0.3221 |
|    Temperature [C]   |         65 |
|      Energy [J]      |  4050.2952 |
|       Power [W]      |    65.3247 |
|    Energy PP0 [J]    |  2982.4675 |
|     Power PP0 [W]    |    48.1024 |
|    Energy DRAM [J]   |          0 |
|    Power DRAM [W]    |          0 |
+----------------------+------------+

Solution

  • While the RKL core and uncore perf events were added in v5.11-rc1 and the RAPL-based powercap support was added in v5.9-rc5, the RAPL perf PMU is not yet supported, not even in the latest 5.15.1 kernel version. The RAPL PMU is also not yet supported for TGL, TNT, and LKF. This looks strange to me because it seems that the ADL and SPR RAPL PMUs are already supported. Did they just forget to add support for these other processors? Anyway, you have to use other tools for now.

    Note that for the core PMU events, the perf_event subsystem lets you only use the architectural events if it's running on an unsupported processor model. But you can still use the raw event encoding as documented in the perf man pages. This approach is only reliable for events without constraints because perf_event isn't aware of any constraints that may exist on an unsupported model. Most events don't have constraints, so this isn't a major problem.
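
    As an illustration of the raw encoding, the sketch below builds the rNNN string that perf accepts, where NNN is the hexadecimal value (umask << 8) | event. The helper name raw_event is hypothetical. The two codes in the usage note are the architectural UnHalted Core Cycles (event 0x3C, umask 0x00) and LLC Misses (event 0x2E, umask 0x41) events, chosen only because their encodings are the same on every Intel model; codes for model-specific events have to come from Intel's event documentation for the processor.

```shell
#!/bin/sh
# Sketch: build a perf raw event string from an event select code and a unit mask.
# raw_event is an illustrative helper name, not part of perf.

# Arguments: $1 = event select code, $2 = unit mask (hex with 0x prefix, or decimal).
# Prints "r" followed by the hex value of (umask << 8) | event.
raw_event() (
    event=$(( $1 ))
    umask=$(( $2 ))
    printf 'r%x\n' $(( (umask << 8) | event ))
)
```

    For example, perf stat -e "$(raw_event 0x3c 0x00)" -e "$(raw_event 0x2e 0x41)" sh doit.sh would count both events for the benchmark script from the question.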

    I don't know why you think that rapl-read can't be used to profile a program. There are no program-specific or core-specific RAPL domains. You can run rapl-read with the -m option to directly access MSRs to take energy readings, then your program, then run rapl-read again. The difference between the two readings gives you the energy consumption for each of the supported domains. Note that you have to modify the rapl_msr() function so that it invokes your program between the readings instead of just doing sleep(1). Otherwise, it'll just report the energy consumed in about a second, with hardly any correlation with the energy consumption of your program.

    rapl-read doesn't currently support RKL (or any of the very recent Intel processors). But you can easily add RAPL support by first determining the CPU model from cat /proc/cpuinfo and then adding a macro definition like #define CPU_ROCKETLAKE model, similar to the currently supported models. I see only two switch statements on the CPU model, one in detect_cpu(void) and one in rapl_msr(int core, int cpu_model). Just add a case for CPU_ROCKETLAKE. RKL has the same RAPL domains as SKL, so place it together with CPU_SKYLAKE in both functions. That should do it. Or you can avoid rapl-read altogether and just use wrmsr and rdmsr in a shell script that takes readings, runs the program, and then takes readings again.
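
    The shell-script alternative could look roughly like the sketch below. It assumes that msr-tools is installed, the msr kernel module is loaded, and the script runs as root. The helper names wrap_delta and measure_raw_delta are hypothetical. The result is in raw counter units and still has to be scaled by the energy unit reported in MSR_RAPL_POWER_UNIT.

```shell
#!/bin/sh
# Sketch: raw MSR_PKG_ENERGY_STATUS (0x611) delta around a command.
# Helper names are illustrative only.

# Difference between two readings of a 32-bit counter, allowing for at most
# one wraparound between them.
wrap_delta() (
    before=$1
    after=$2
    if [ "$after" -ge "$before" ]; then
        echo $(( after - before ))
    else
        echo $(( after + 4294967296 - before ))
    fi
)

# Run the given command and print the raw energy-counter delta on stdout.
# The command's own stdout is sent to stderr so that only the delta is captured.
measure_raw_delta() (
    before=$(rdmsr -d 0x611) || exit 1
    "$@" >&2
    after=$(rdmsr -d 0x611) || exit 1
    wrap_delta "$before" "$after"
)
```

    With these definitions, measure_raw_delta sh doit.sh prints the number of raw counts consumed while the benchmark ran. A single before/after pair can only detect one wraparound, so this is suitable only for runs short enough that the counter wraps at most once.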

    MSR 0x611 is MSR_PKG_ENERGY_STATUS, which reports a 32-bit unsigned value. The unit of this value is given by the Energy Status Units field of MSR_RAPL_POWER_UNIT, and the default is 15.26 µJ. You seem to think it's in microjoules. Are you sure that this is what MSR_RAPL_POWER_UNIT says? Even then, the result of the expression $(((end_energy - bgn_energy)/ujtoj))e-3 is in kilojoules, so how are you comparing it with power/energy_pkg on Zen 3, which is clearly in joules?
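
    To make the unit question concrete, here is a minimal sketch of the conversion. The energy unit is 1/2^ESU joules, where ESU is bits 12:8 of MSR_RAPL_POWER_UNIT (0x606). The helper names are hypothetical. The register value 0xA0E03 in the usage note was not read from your machine; it is a value chosen to match the units that rapl-read printed in update #2 (0.125 W, 0.00006104 J, 0.00097656 s). The count 197000000 is the approximate raw difference implied by the "0.197" output.

```shell
#!/bin/sh
# Sketch: decode the RAPL energy unit and scale a raw energy count.
# Helper names are illustrative only.

# Energy Status Units exponent: bits 12:8 of the MSR_RAPL_POWER_UNIT value
# passed as $1. One raw count equals 1/2^ESU joules.
energy_unit_exponent() (
    echo $(( ($1 >> 8) & 0x1f ))
)

# Convert a raw count ($1) to millijoules for a given ESU exponent ($2).
# Integer arithmetic only; the result is rounded down.
raw_to_millijoules() (
    count=$1
    esu=$2
    echo $(( (count * 1000) >> esu ))
)
```

    On real hardware the exponent would come from esu=$(energy_unit_exponent "$(rdmsr -d 0x606)"). With the assumed register value, energy_unit_exponent 0xA0E03 gives 14, and raw_to_millijoules 197000000 14 gives 12023925, that is, about 12,024 J.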

    The "0.197" output means that the raw difference was about 197,000,000 counts. If the unit were 15.26 µJ, that would correspond to 15.26 µJ × 197,000,000 ≈ 3,006 J, well below the point at which the 32-bit counter wraps, which is 15.26 µJ × (2^32 - 1) ≈ 65,541 J. However, the rapl-read output in your update #2 reports an energy unit of 0.00006104 J (61.04 µJ), so the unit on this processor is not 15.26 µJ. With 61.04 µJ per count, the same difference is about 12,000 J.

    It seems that the 500.perlbench benchmark with the test input took about 3 minutes to complete. It's hard to know whether MSR_PKG_ENERGY_STATUS has wrapped around or not because the reported number is not negative.

    I think it's better to run 500.perlbench on one core and then run a script on another core that reads MSR_PKG_ENERGY_STATUS every few seconds. For example, you can put rdmsr -d 0x611 in a loop and sleep for some number of seconds in each iteration. Since 500.perlbench takes a relatively long time to complete, you don't have to start both programs at precisely the same time. In this way, you'd mimic the way perf stat -a -I 1000 -e power/energy-pkg/ works had the event power/energy-pkg/ been supported on your kernel on the Intel platform.
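
    A polling loop along these lines could be sketched as follows. It assumes msr-tools, a loaded msr module, and root privileges. The helper names accumulate and poll_pkg_energy are hypothetical. The total is in raw counter units and is correct only if the counter wraps at most once between two consecutive readings.

```shell
#!/bin/sh
# Sketch: accumulate package energy across periodic MSR readings.
# Helper names are illustrative only.

# Add the delta between two consecutive 32-bit readings to a running total.
# Arguments: $1 = total so far, $2 = previous reading, $3 = current reading.
# A decrease is treated as a single counter wraparound.
accumulate() (
    total=$1
    prev=$2
    cur=$3
    if [ "$cur" -lt "$prev" ]; then
        cur=$(( cur + 4294967296 ))
    fi
    echo $(( total + cur - prev ))
)

# Read MSR_PKG_ENERGY_STATUS every $1 seconds, $2 times, and print the
# running raw total after each reading.
poll_pkg_energy() (
    interval=$1
    iterations=$2
    total=0
    prev=$(rdmsr -d 0x611) || exit 1
    i=0
    while [ "$i" -lt "$iterations" ]; do
        sleep "$interval"
        cur=$(rdmsr -d 0x611) || exit 1
        total=$(accumulate "$total" "$prev" "$cur")
        prev=$cur
        i=$(( i + 1 ))
        echo "$total"
    done
)
```

    For example, poll_pkg_energy 5 40 takes a reading every 5 seconds for roughly 200 seconds, which would cover the 3-minute benchmark run. Pinning the loop and the benchmark to different cores, for instance with taskset, keeps the polling from competing with the benchmark for the same core.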

    I've discussed the reliability of Intel's RAPL-based energy measurements at: perf power consumption measure: How does it work?. However, I don't know if anyone has validated the accuracy of AMD's RAPL. It's unclear to me to what extent a comparison between Intel's MSR_PKG_ENERGY_STATUS and AMD's Core::X86::Msr::PKG_ENERGY_STAT is meaningful.