c++ winapi setthreadaffinitymask performancecounter

What good are thread affinity mask changes for the current thread?


I'm writing a game engine and I need a way to get a precise and accurate "deltatime" value from which to derive the current FPS for debug and also to limit the framerate (this is important for our project).

Doing a bit of research, I found out that one of the best ways to do this is to use WinAPI's QueryPerformanceCounter function. GetTickCount has to be used to prevent forward counter leaps, but it is not very accurate on its own.

Now, the problem with QueryPerformanceCounter is that it may apparently return values that make time appear to run backwards (i.e. a call may return a value earlier than one returned by a previous call). This happens only when a value obtained on one processor core is compared against a value obtained on another, which leads me to the questions that motivated this post:

  1. May the OS "reallocate" a thread to another core while the thread is already running, or is a thread allocated to a given core and that's that until the thread dies?
  2. If a thread can't be reallocated (and that makes a lot of sense to me, at least), then why is it possible for me to do something like SetThreadAffinityMask(GetCurrentThread(),mask)? Ogre3D does that in its Ogre::Timer class (Windows implementation), and I'm assuming that's to avoid time going backwards. But for that to be true, I would have to consider the possibility of threads being moved from one core to another arbitrarily by the OS, which seems rather odd to me (not sure why).

I think that was all I wanted to know for now. Thanks.


Solution

  • Unless a thread's affinity mask restricts it to a single processor, the scheduler is free to move it from processor to processor in order to give it execution time. Since moving a thread between processors costs performance, the scheduler tries not to move it, but giving it a processor to execute on takes priority over keeping it in place. So, in practice, threads usually do move.

    As for timer APIs: timeGetTime is designed for multimedia timing, so it's a bit more accurate than GetTickCount.

    QueryPerformanceCounter() is still your most precise measurement, though. Microsoft has this to say about it:

    On a multiprocessor computer, it should not matter which processor is called. However, you can get different results on different processors due to bugs in the basic input/output system (BIOS) or the hardware abstraction layer (HAL). To specify processor affinity for a thread, use the SetThreadAffinityMask function.

    So if you are doing the timing on a specific computer, you may not have to worry about QPC going backwards. You should do some testing and see whether it matters on your machine.
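
To make the advice above concrete, here is a minimal sketch of a delta-time helper built on QueryPerformanceCounter. It is not Ogre's implementation. The names `FrameTimer`, `tick` and `elapsedSeconds` are hypothetical, and the sketch makes two assumptions of its own: it pins the calling thread to one core for the rest of its lifetime (rather than only around each counter read), and it clamps any backwards reading to a zero delta. The non-Windows branch is only there so the sketch compiles elsewhere and uses std::chrono::steady_clock instead.

```cpp
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#else
#include <chrono>
#endif

// Hypothetical helper: turns two raw counter readings into elapsed seconds.
// Returns 0 if the counter did not advance (or appeared to go backwards),
// so callers never see a negative delta.
double elapsedSeconds(std::int64_t previous, std::int64_t current,
                      std::int64_t frequency)
{
    if (frequency <= 0 || current <= previous)
        return 0.0;
    return static_cast<double>(current - previous) /
           static_cast<double>(frequency);
}

// Hypothetical frame timer. Construct it on the thread that will call tick().
class FrameTimer
{
public:
    FrameTimer()
    {
#ifdef _WIN32
        // Assumption: pin this thread to the lowest-numbered core the process
        // is allowed to use, so every counter read comes from the same core.
        DWORD_PTR processMask = 0, systemMask = 0;
        if (GetProcessAffinityMask(GetCurrentProcess(), &processMask, &systemMask)
            && processMask != 0)
        {
            DWORD_PTR oneCore = processMask & (~processMask + 1); // lowest set bit
            SetThreadAffinityMask(GetCurrentThread(), oneCore);
        }

        LARGE_INTEGER freq;
        QueryPerformanceFrequency(&freq);
        frequency_ = freq.QuadPart; // counts per second
#else
        frequency_ = 1000000000;    // fallback counter is in nanoseconds
#endif
        last_ = readCounter();
    }

    // Returns the seconds elapsed since the previous tick() (or construction).
    double tick()
    {
        const std::int64_t now = readCounter();
        const double dt = elapsedSeconds(last_, now, frequency_);
        if (now > last_)
            last_ = now; // never move the reference point backwards
        return dt;
    }

private:
    static std::int64_t readCounter()
    {
#ifdef _WIN32
        LARGE_INTEGER counter;
        QueryPerformanceCounter(&counter);
        return counter.QuadPart;
#else
        using namespace std::chrono;
        return duration_cast<nanoseconds>(
                   steady_clock::now().time_since_epoch()).count();
#endif
    }

    std::int64_t last_ = 0;
    std::int64_t frequency_ = 1;
};
```

In a game loop you would call `tick()` once per frame and, when the result is greater than zero, take its reciprocal as the current FPS. Pinning the render thread permanently has a scheduling cost of its own, so if testing shows that QPC behaves on your target machines, the affinity calls can probably be dropped and the clamp kept as a safety net.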