Search code examples
performancetimex86instructions

Instruction to get the current time on x86


Is there an x86 instruction to get the current time?

Basically... something like a replacement for clock_get_time ... something with the minimum overhead... where I don't really care about getting the time in any specific format... as long as it's a format I can use.

Basically I'm doing some work to "Detect how much PHYSICAL REAL LIFE TIME" has gone by... and I want to be able to measure time as frequently as possible!

I guess you can imagine i'm doing something like a profiling app... :)

I really need aggressively efficient access to the hardware time. So ideally... some ASM to get the time... store it somewhere... then massage it later into some format that I can actually process.

I'm not interested in _rdtsc as that measures the number of cycles gone by. I need to know how much physical time has executed... not cycles which can vary due to thermal fluctations or so..


Solution

  • For profiling, often it's most useful to profile in terms of CPU clock cycles, rather than wall-clock time. CPU dynamic clocking (turbo and power saving) makes it annoying to get the CPU ramped up to full speed before the start of a measurement period.

    If you still need wall-clock time after that:

    Recent x86 CPUs have a TSC that runs at a fixed rate, regardless of CPU frequency adjustment for power-saving. Also, the TSC doesn't stop when the CPU is halted. (i.e. no work to do, so it ran the HLT instruction to wait for an interrupt in low-power mode.)

    It turned out that efficient access to a useful time-source was more useful to have in hardware than an actual clock cycle counter, so that's what RDTSC morphed into, a few CPU generations after its introduction. Now we're back to using hardware performance counters for measuring clock cycles.

    In Linux, look for constant_tsc and nonstop_tsc in the CPU features flags in /proc/cpuinfo. IDK if there are CPUID bits for those. If no, use Linux's code for it (if you can use GPLed code).

    On a CPU with those two key features, Linux uses the TSC as its clocksource, IIRC.

    The lowest overhead way to get the current time in user-space will be to work out the conversion between RDTSC ticks and real time. While profiling, you might just store 64bit TSC snapshots, and convert to real-time later. (So you can handle TSC wraparound then). RDTSC only takes about 24 cycles (Agner Fog's instruction table, Intel Haswell). I think the overhead of a system call will be an order of magnitude higher than that. (The kernel will have to do a RDTSC in there somewhere anyway).

    Agner Fog has documented his profiling / timing methods, and has some example code. I haven't looked recently, but it might have useful stuff for this application.