c++ windows multithreading winapi threadpool

SetThreadpoolTimerEx occasionally does not invoke callback


I am experiencing a rare issue where SetThreadpoolTimerEx does not invoke the callback function as expected.

Environment:

Operating System: Windows 10 22H2 (Build 19045.5131)

Visual Studio: 2022 (Version 17.12.0)

SDK and Toolset: Windows SDK 10.0.19041.0, Platform Toolset v143

Issue:

According to the documentation for SetThreadpoolTimerEx:

If the timer's previous state was "set", and the function returns FALSE, then a callback is in progress or about to commence.

However, in my case, the function returns FALSE, but the callback is not being called. This causes my program to hang while waiting for the callback to signal completion.
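
For reference, the documented meaning of the return value can be written down as a small decision function. This is only a sketch: CancelOutcome and interpretSetResult are hypothetical names, not Windows APIs, and the function takes plain bools so that it does not depend on Windows.h.

```cpp
// Hypothetical helper, not part of the Windows API. It maps what the caller
// knows (was the timer set before the call?) and the BOOL returned by
// SetThreadpoolTimerEx to the meaning the documentation assigns to it.
enum class CancelOutcome {
    Canceled,          // returned TRUE: the timer was set and has been canceled
    CallbackInFlight,  // was set, returned FALSE: callback in progress or about to commence
    NothingToCancel    // was not set, returned FALSE
};

constexpr CancelOutcome interpretSetResult(bool wasPreviouslySet, bool returnedTrue) {
    return returnedTrue     ? CancelOutcome::Canceled
         : wasPreviouslySet ? CancelOutcome::CallbackInFlight
                            : CancelOutcome::NothingToCancel;
}
```

The case this question is about is interpretSetResult(true, false): the documentation promises a callback there, and that callback is the one that sometimes never arrives.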

C++ Code to Reproduce:

#include <cstdio>
#include <atomic>
#include <chrono>
#include <Windows.h>

std::atomic<bool> callbackDone = false;

VOID NTAPI callback(
    _Inout_     PTP_CALLBACK_INSTANCE Instance,
    _Inout_opt_ PVOID                 Context,
    _Inout_     PTP_TIMER             Timer
    )
{
    printf("callback...\n");
    callbackDone = true;
}

int main() {
    bool canFail = false;

    while (true) {
        callbackDone = false;
        auto timer = CreateThreadpoolTimer(callback, nullptr, nullptr);

        LARGE_INTEGER relativeTime = {};

        relativeTime.QuadPart = -10000000LL; // relative due time (negative): 1 second in 100-ns units

        FILETIME relativeTimeFt = {};

        relativeTimeFt.dwLowDateTime = relativeTime.LowPart;
        relativeTimeFt.dwHighDateTime = static_cast<DWORD>(relativeTime.HighPart);

        auto setRes1 = SetThreadpoolTimerEx(timer, &relativeTimeFt, 0, 0);

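        Sleep(999); // wake up just before the 1-second timer is due

        // A null due time stops the timer from queuing new callbacks.
        auto setRes2 = SetThreadpoolTimerEx(timer, nullptr, 0, 0);

        if (setRes2 == TRUE) {
            // The timer was canceled before it fired; re-arm it with a zero due time so it expires immediately.
            FILETIME now = {};
            SetThreadpoolTimerEx(timer, &now, 0, 0);
        }
        else {
            // FALSE: per the documentation, a callback is in progress or about to commence.
            if (!callbackDone) {
                WaitForThreadpoolTimerCallbacks(timer, FALSE);
                canFail = true;
            }
        }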

        if (canFail) {
            printf("waiting for possibly failed timer\n");
        }

        auto startWaitTime = std::chrono::high_resolution_clock::now();
        auto waitMsgStep = std::chrono::seconds(10);
        auto nextMsgTime = waitMsgStep;
        while (!callbackDone) {
            SleepEx(10, TRUE);

            auto waitDuration = std::chrono::duration_cast<std::chrono::seconds>(
                std::chrono::high_resolution_clock::now() - startWaitTime
            );

            if (waitDuration > nextMsgTime) {
                printf("waiting for callback to finish for %lld seconds...\n", static_cast<long long>(nextMsgTime.count()));
                nextMsgTime += waitMsgStep;
            }
        }

        if (canFail) {
            canFail = false;
            printf("timer is ok\n");
        }

        CloseThreadpoolTimer(timer);
    }
}
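
The due-time arithmetic in the code above can be checked in isolation. The sketch below is portable C++ and deliberately avoids Windows.h: DueTimeParts is a hypothetical stand-in for the two 32-bit halves of a FILETIME (dwLowDateTime and dwHighDateTime), and relativeTicksFromMs and splitTicks are illustrative names, not Windows APIs.

```cpp
#include <cstdint>

// Hypothetical stand-in for the low/high 32-bit halves of a FILETIME.
struct DueTimeParts {
    std::uint32_t low;
    std::uint32_t high;
};

// A relative due time is a negative count of 100-nanosecond intervals.
constexpr std::int64_t relativeTicksFromMs(std::int64_t milliseconds) {
    return -(milliseconds * 10000);  // 1 ms = 10,000 intervals of 100 ns
}

// Split the two's-complement bit pattern of the tick count into two 32-bit halves.
constexpr DueTimeParts splitTicks(std::int64_t ticks) {
    return { static_cast<std::uint32_t>(static_cast<std::uint64_t>(ticks) & 0xFFFFFFFFu),
             static_cast<std::uint32_t>(static_cast<std::uint64_t>(ticks) >> 32) };
}
```

With these helpers, relativeTicksFromMs(1000) is -10000000, the value assigned to QuadPart above, and splitTicks(-1) yields 0xFFFFFFFF for both halves, which matches the { (ULONG)-1, -1 } initializer used in the answer's second test.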

Problem:

Despite SetThreadpoolTimerEx returning FALSE (which, according to the documentation, indicates that the callback is in progress or about to commence), the callback is occasionally never called. The program then hangs indefinitely in the loop that waits for callbackDone to become true.

Questions:

  1. Is there a known issue with SetThreadpoolTimerEx where the callback might not be invoked even if the function returns FALSE?

  2. Am I misusing the thread pool timer APIs in a way that could cause this behavior?

  3. How can I modify my code to ensure that the callback is always invoked or to correctly handle the case when it isn't?

Additionally, from Raymond Chen's blog:

The last row is the interesting one: When you cancel the timer or wait object, the thread pool tries to recall any pending callbacks, but sometimes a callback has already gone too far and could not be recalled. For example, the callback could be already in progress. In that case, the Set...Ex function returns FALSE to tell you that you're not finished yet. You have to wait for the callback to complete before everything is finally done.

In my case, SetThreadpoolTimerEx returns FALSE, and I wait for 10 seconds or more, but the callback is never invoked.


Solution

  • Yes, this is a Windows bug. It also exists in at least 23H2 (build 22631.2861), where I can reproduce it.

    A minimal reproduction:

    VOID NTAPI OnTimer(
                       _Inout_     PTP_CALLBACK_INSTANCE /*Instance*/,
                       _Inout_opt_ PVOID                 Context,
                       _Inout_     PTP_TIMER             /*Timer*/
                       )
    {
        *(bool*)Context = true;
    }
    
    void Test01(LONG dwMilliseconds = 1)
    {
        bool callbackDone;
    
        if (PTP_TIMER Timer = CreateThreadpoolTimer(OnTimer, &callbackDone, 0))
        {
            LARGE_INTEGER DueTime;
            DueTime.QuadPart = -(static_cast<LONGLONG>(dwMilliseconds) * 10000); // relative due time in 100-ns units
    
            ULONG n = 0x100;
    
            do 
            {
                callbackDone = false;
                SetThreadpoolTimerEx(Timer, (PFILETIME)&DueTime, 0, 0);
    
                Sleep(dwMilliseconds);
    
                if (!SetThreadpoolTimerEx(Timer, 0, 0, 0))
                {
                    if (!callbackDone)
                    {
                        WaitForThreadpoolTimerCallbacks(Timer, FALSE);
    
                        if (!callbackDone)
                        {
                            __debugbreak();
                            break;
                        }
                    }
                }
    
            } while (--n);
    
            CloseThreadpoolTimer(Timer);
        }
    }
    

    SetThreadpoolTimerEx returns the result of TppCancelTimer, that is, whether the timer was canceled. If the timer was previously set but could not be canceled, its callback must run, and once WaitForThreadpoolTimerCallbacks returns, callbackDone must already be set.

    However, the implementation has a race condition. If SetThreadpoolTimerEx(Timer, 0, 0, 0) is called after the timer has fired but before the user callback has been invoked, for example while the thread pool is at TppSingleTimerExpiration (an internal function that is not exported), then SetThreadpoolTimerEx (TppCancelTimer) returns FALSE, but the user callback is never called. To reproduce the race reliably, we can hook TppSingleTimerExpiration and delay its execution:

    void (NTAPI *TppSingleTimerExpiration)(PTP_TIMER Timer, PSRWLOCK SRWLock, BOOLEAN b);
    
    HANDLE _G_hEvent = nullptr; // created in Test02; creating it here as well would leak a handle
    
    void Hook_TppSingleTimerExpiration(PTP_TIMER Timer, PSRWLOCK SRWLock, BOOLEAN b)
    {
        DbgPrint("Hook_TppSingleTimerExpiration(%p) [0]\n", Timer);
    
        SetEvent(_G_hEvent);
        Sleep(2000);
    
        DbgPrint("Hook_TppSingleTimerExpiration(%p) [1]\n", Timer);
    
        DetourTransactionBegin();
        DetourDetach((void**)&TppSingleTimerExpiration, Hook_TppSingleTimerExpiration);
        DetourTransactionCommit();
    
        TppSingleTimerExpiration(Timer, SRWLock, b);
    
        DbgPrint("Hook_TppSingleTimerExpiration(%p) [2]\n", Timer);
    }
    
    VOID NTAPI OnTimer(
                       _Inout_     PTP_CALLBACK_INSTANCE /*Instance*/,
                       _Inout_opt_ PVOID                 Context,
                       _Inout_     PTP_TIMER             /*Timer*/
                       )
    {
        *(bool*)Context = true;
    }
    
    void Test02()
    {
        if (_G_hEvent = CreateEvent(0, 0, 0, 0))
        {
            // NOTE: this address is specific to one system; look up the real address on the target system
            (ULONG_PTR&)TppSingleTimerExpiration = 0x00007FFCB0FA081C;
            DetourTransactionBegin();
            DetourAttach((void**)&TppSingleTimerExpiration, Hook_TppSingleTimerExpiration);
            DetourTransactionCommit();
    
            bool callbackDone = false;
            if (PTP_TIMER Timer = CreateThreadpoolTimer(OnTimer, &callbackDone, 0))
            {
                LARGE_INTEGER DueTime = { (ULONG)-1, -1 };
                SetThreadpoolTimerEx(Timer, (PFILETIME)&DueTime, 0, 0);
    
                WaitForSingleObject(_G_hEvent, INFINITE);
    
                if (!SetThreadpoolTimerEx(Timer, 0, 0, 0))
                {
                    if (!callbackDone)
                    {
                        WaitForThreadpoolTimerCallbacks(Timer, FALSE);
    
                        if (!callbackDone)
                        {
                            __debugbreak();
                        }
    
                        Sleep(2000);
    
                        if (!callbackDone)
                        {
                            __debugbreak();
                        }
                    }
                }
    
                CloseThreadpoolTimer(Timer);
            }
    
            CloseHandle(_G_hEvent);
        }
    }
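
The answer above does not propose a workaround. One defensive option for the third question, offered only as a sketch under assumptions, is to bound the wait for the callback instead of looping forever. The code below is portable C++ with no Windows calls. CompletionFlag is a hypothetical helper: the timer callback would call signal(), and the waiting code would call waitFor() after SetThreadpoolTimerEx has returned FALSE and WaitForThreadpoolTimerCallbacks has returned.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper: a one-shot completion flag with a bounded wait.
class CompletionFlag {
public:
    // Called from the timer callback to mark completion.
    void signal() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            done_ = true;
        }
        cv_.notify_all();
    }

    // Returns true if signal() was observed before the timeout elapsed,
    // false if the timeout expired first (the "lost callback" case).
    bool waitFor(std::chrono::milliseconds timeout) {
        assert(timeout.count() >= 0);
        std::unique_lock<std::mutex> lock(mutex_);
        return cv_.wait_for(lock, timeout, [this] { return done_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool done_ = false;
};
```

If waitFor() returns false in that situation, the caller can treat it as the race described above rather than hang. Any fallback that then runs the completion work itself would also need to guard against a callback that arrives late, so that the work does not run twice.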