Search code examples
cassemblymulticorecpu-registersmicroprocessors

CPU TSC fetch operation especially in multicore-multi-processor environment


In Linux world, to get nano seconds precision timer/clockticks one can use :

#include <sys/time.h>

int foo()
{
   timespec ts;

   clock_gettime(CLOCK_REALTIME, &ts); 
   //--snip--      
}

This answer suggests an asm approach to directly query for the cpu clock with the RDTSC instruction.

In a multi-core, multi-processor architecture, how is this clock ticks/timer value synchronized across multiple cores/processors? My understanding is that there in inherent fencing being done. Is this understanding correct?

Can you suggest some documentation that would explain this in detail? I am interested in Intel Nehalem and Sandy Bridge microarchitectures.

EDIT

Limiting the process to a single core or cpu is not an option as the process is really huge(in terms of resources consumed) and would like to optimally utilize all the resources in the machine that includes all the cores and processors.

Edit

Thanks for the confirmation that the TSC is synced across cores and processors. But my original question is how is this synchronization done ? is it with some kind of fencing ? do you know of any public documentation ?

Conclusion

Thanks for all the inputs: Here's the conclusion for this discussion: The TSCs are synchronized at the initialization using a RESET that happens across the cores and processors in a multi processor/multi core system. And after that every Core is on their own. The TSCs are kept invariant with a Phase Locked Loop that would normalize the frequency variations and thus the clock variations within a given Core and that is how the TSC remain in sync across cores and processors.


Solution

  • On newer CPUs (i7 Nehalem+ IIRC) the TSC is synchronzied across all cores and runs a constant rate. So for a single processor, or more than one processor on a single package or mainboard(!) you can rely on a synchronzied TSC.

    From the Intel System Manual 16.12.1

    The time stamp counter in newer processors may support an enhancement, referred to as invariant TSC. Processors support for invariant TSC is indicated by CPUID.80000007H:EDX[8]. The invariant TSC will run at a constant rate in all ACPI P-, C-. and T-states. This is the architectural behavior moving forward.

    On older processors you can not rely on either constant rate or synchronziation.

    Edit: At least on multiple processors in a single package or mainboard the invariant TSC is synchronized. The TSC is reset to zero at a /RESET and then ticks onward at a constant rate on each processor, without drift. The /RESET signal is guaranteed to arrive at each processor at the same time.