Why does my C# code stall when calling back into C++ COM until Task.Wait/Thread.Join?

I have a native C++ application calling into a C# module, which is supposed to run its own program loop and pass messages back to C++ through a supplied callback object, using COM. I have an existing application to work from, but mine has a weird bug.

Skip to the very end for the weird behaviour and question

These C# methods are called from C++ via COM:

[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
[Guid("...")]
public interface IInterface
{
    void Start(ICallback callback);
    void Stop();
}

[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
[Guid("...")]
public interface ICallback
{
    void Message(string message);
}

[Guid("...")]
public class MyInterface : IInterface
{
    private Task task;
    private CancellationTokenSource cancellation;
    ICallback callback;
    public void Start(ICallback callback)
    {
        Console.WriteLine("STARTING");
        this.callback = callback;
        this.cancellation = new CancellationTokenSource();
        this.task = Task.Run(() => DoWork(), cancellation.Token);
        Console.WriteLine("Service STARTED");
    }

    private void DoWork()
    {
        int i = 0;
        while (!cancellation.IsCancellationRequested)
        {
            Task.Delay(1000, cancellation.Token).Wait();
            Console.WriteLine("Starting iteration... {0}", i);
            //callback.Message($"Message {i} reported");
            Console.WriteLine("...Ending iteration {0}", i++);
        }
        Console.WriteLine("Service CANCELLED");
        cancellation.Token.ThrowIfCancellationRequested();
    }

    public void Stop()
    {
        //cancellation.Cancel(); -- commented deliberately for testing
        task.Wait();
    }
}

In C++ I provide an implementation of ICallback, CCallback:

#import "Interfaces.tlb" named_guids

class CCallback : public ICallback
{
public:
    //! \brief Constructor
    CCallback()
        : m_nRef(0)     {       }

    virtual ULONG STDMETHODCALLTYPE AddRef(void);
    virtual ULONG STDMETHODCALLTYPE Release(void);
    virtual HRESULT STDMETHODCALLTYPE QueryInterface(REFIID riid, void **ppvObject);

    virtual HRESULT __stdcall raw_Message(BSTR message)
    {
        std::wstringstream ss;
        ss << "Received: " << message;
        wcout << ss.str() << endl;
        return S_OK;
    }

private:
    long m_nRef;
};

My C++ calling code is basically:

    CCallback callback;
    IInterface *pInterface = GetInterface();
    cout << "Hit Enter to start" << endl;
    getch();
    hr = pInterface->Start(&callback);
    cout << "Hit Enter to stop" << endl;
    getch();
    pInterface->Stop();
    cout << "Hit Enter to exit" << endl;
    getch();
    return;

This is a contrived example to avoid posting huge lumps of code, but you can see the idea: the C# code is supposed to loop once a second, calling a C++ method which prints the message.

If I leave the callback.Message line in DoWork commented out, it works exactly as one would expect. If I uncomment it, this is what happens (console output is shown after the C++ statements that precede it):

    CCallback callback;
    IInterface *pInterface = GetInterface();
    cout << "Hit Enter to start" << endl;
    getch();
    hr = pInterface->Start(&callback);

STARTING

Starting iteration... 0

    cout << "Hit Enter to stop" << endl;
    getch();
    pInterface->Stop();

Received: Message 0 reported

...Ending iteration 0

Starting iteration... 1

Received: Message 1 reported

...Ending iteration 1

(... and so on.)

    cout << "Hit Enter to exit" << endl;
    getch();
    return;

Conclusion

So for some reason the call to callback.Message stalls my Task until Task.Wait is called. Why on earth would this be? How does it get stuck, and how does waiting on the task release it? My assumption is that the COM threading model means I have some sort of deadlock, but can anyone be more specific?

I personally think running this all in a dedicated Thread is better but it's how an existing application works so I'm just really curious what is happening.

UPDATE

So I tested new Thread(DoWork).Start() vs. Task.Run(() => DoWork()) and I get exactly the same behaviour: it now stalls until Stop calls Thread.Join.

So I'm thinking COM for some reason is suspending the entire CLR or something along those lines.

Solution

It sounds like:

1. Your callback implementation object is instantiated on an STA thread (the main thread).
2. The task is running on a separate thread that is either STA or MTA.
3. The interface call from the background thread is being marshaled back to the main thread.
4. There is no message pump in your main thread.
5. When task.Wait is called, it runs a loop that allows incoming COM calls to be processed by the main thread.
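
To make the sequence above concrete, here is a minimal, portable sketch of the mechanism. It is not COM: `Apartment`, `post`, `pump`, `is_ready` and `call_stalls_until_pumped` are hypothetical names invented for illustration. The assumption it models is that a call marshaled into an STA is queued for the owning thread and runs only when that thread services its queue.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical stand-in for a single-threaded apartment (not a COM API):
// calls from other threads are queued and run only when the owner pumps.
class Apartment {
public:
    // Any thread: queue a call and get a future the caller can block on,
    // much as a marshaled COM call blocks its caller until it completes.
    std::future<void> post(std::function<void()> call) {
        auto task = std::make_shared<std::packaged_task<void()>>(std::move(call));
        std::future<void> done = task->get_future();
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push([task] { (*task)(); });
        return done;
    }

    // Owning thread: run every call queued so far; returns how many ran.
    std::size_t pump() {
        std::queue<std::function<void()>> batch;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            std::swap(batch, pending_);
        }
        const std::size_t count = batch.size();
        while (!batch.empty()) {
            batch.front()();
            batch.pop();
        }
        return count;
    }

private:
    std::mutex mutex_;
    std::queue<std::function<void()>> pending_;
};

// True once the queued call has actually been executed.
bool is_ready(std::future<void>& f) {
    return f.wait_for(std::chrono::seconds(0)) == std::future_status::ready;
}

// Returns true if a posted call stays pending until the owner pumps.
bool call_stalls_until_pumped() {
    Apartment main_sta;
    int received = 0;
    std::future<void> call = main_sta.post([&received] { ++received; });
    const bool stalled = !is_ready(call) && received == 0;  // nobody pumped yet
    const std::size_t ran = main_sta.pump();  // the role task.Wait plays above
    assert(ran == 1);
    (void)ran;  // silences the unused-variable warning when NDEBUG is defined
    return stalled && is_ready(call) && received == 1;
}
```

In this model a main thread sitting in getch() never calls `pump()`, so the background thread's call stays pending, which matches the stall described in the question.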

You can verify this by checking the following:

1. You should be calling CoInitializeEx explicitly in your C++ client application's main thread. Check the threading model you're using there. If you're not calling it, add it. Does adding it fix your problem? I would expect not, but if it does, that means there's some interaction between COM and .NET that is probably by design but is tripping you up.
2. Add logs or set up a debugger so that you can see which threads are executing which pieces of code. Your code should be running on only two threads: your main thread and one background thread. When you reproduce the problem condition, I believe you will see that the Message() method implementation is invoked on the main thread.
3. Replace your console application with a Windows application, or just run a message pump in your console application. I believe you will see that the hang does not occur.
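
For the last check, a message pump in a console application might look roughly like the sketch below. It is Windows-only and untested against your project. `pump_pending_messages` and `wait_for_key_pumping` are hypothetical helper names. The `#ifdef _WIN32` guard only exists so the file still compiles elsewhere as a no-op, and the `#pragma comment` is an MSVC-specific linker hint. It also assumes the main thread has already joined an STA, e.g. via CoInitializeEx(nullptr, COINIT_APARTMENTTHREADED).

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#include <conio.h>
#pragma comment(lib, "user32.lib")  // MSVC-only hint; other toolchains must link user32 themselves

// Dispatch every window message currently queued for this thread.
// Calls marshaled into an STA arrive as messages, so this lets them run.
int pump_pending_messages() {
    int dispatched = 0;
    MSG msg;
    while (PeekMessage(&msg, nullptr, 0, 0, PM_REMOVE)) {
        TranslateMessage(&msg);
        DispatchMessage(&msg);
        ++dispatched;
    }
    return dispatched;
}

// Possible replacement for a bare getch(): wait for a key press, but keep
// servicing the message queue while waiting.
void wait_for_key_pumping() {
    while (!_kbhit()) {
        pump_pending_messages();
        // Sleep until a message arrives or 50 ms pass, then re-check the keyboard.
        MsgWaitForMultipleObjects(0, nullptr, FALSE, 50, QS_ALLINPUT);
    }
    _getch();  // consume the key that ended the wait
}
#else
// Non-Windows builds have no message queue; these stubs only keep the file compiling.
int pump_pending_messages() { return 0; }
void wait_for_key_pumping() {}
#endif
```

In the calling code from the question, each getch() between Start and Stop would become wait_for_key_pumping(). If the diagnosis above is right, the "Received: ..." lines should then appear once a second without waiting for Stop.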

I also have a guess at why Task.Wait and Thread.Join seem to unblock the calls, and also why you may be seeing this issue on a trimmed-down use case when you don't see it in the larger application.

Waiting in Windows is a funny beast. Instinctively, we imagine that Task.Wait or Thread.Join will block the thread entirely until the wait condition is satisfied. There are Win32 functions (e.g. WaitForSingleObject) that do exactly that, and simple I/O operations like getch do as well. But there are also Win32 functions that allow other operations to run while waiting (e.g. WaitForSingleObjectEx with bAlertable set to TRUE, which lets queued APCs run). In an STA, COM and .NET use the most elaborate wait function, CoWaitForMultipleHandles, which runs a COM modal loop that processes incoming messages for the calling STA. When you call this function, or any function that uses it, any number of incoming COM calls and/or APC callbacks can execute before the wait condition is met or the function returns. (Aside: the same is true when you make a COM call from one thread to a COM object in a different apartment. Callers from other apartments can invoke into the calling thread before the calling thread's function call returns.)
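
The difference between a plain blocking wait and a wait that keeps servicing incoming calls can be sketched in portable C++. This is only loosely analogous to what CoWaitForMultipleHandles does in an STA. `call_on_main`, `pumping_wait`, `run_with_pumping_wait` and the global queue are hypothetical names for illustration, not part of any real API.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <utility>

std::mutex g_mutex;
std::queue<std::packaged_task<void()>> g_incoming;  // calls waiting for the main thread

// Worker side: hand a call to the main thread and block until it has run,
// like a cross-apartment call blocking its caller.
void call_on_main(std::function<void()> fn) {
    std::packaged_task<void()> task(std::move(fn));
    std::future<void> done = task.get_future();
    {
        std::lock_guard<std::mutex> lock(g_mutex);
        g_incoming.push(std::move(task));
    }
    done.wait();
}

// Main side: wait for `target`, but keep running incoming calls meanwhile.
// A plain target.wait() here would deadlock: the worker would wait for the
// main thread to run its call while the main thread waits for the worker.
void pumping_wait(std::future<void>& target) {
    while (target.wait_for(std::chrono::milliseconds(1)) != std::future_status::ready) {
        for (;;) {
            std::packaged_task<void()> task;
            {
                std::lock_guard<std::mutex> lock(g_mutex);
                if (g_incoming.empty()) break;
                task = std::move(g_incoming.front());
                g_incoming.pop();
            }
            task();  // the incoming call executes on the waiting thread
        }
    }
}

// The worker makes n calls into the main thread; returns how many arrived.
int run_with_pumping_wait(int n) {
    int received = 0;  // only modified on the main thread, inside pumping_wait
    std::future<void> worker = std::async(std::launch::async, [n, &received] {
        for (int i = 0; i < n; ++i) {
            call_on_main([&received] { ++received; });
        }
    });
    pumping_wait(worker);
    worker.get();
    assert(received == n);
    return received;
}
```

Here `pumping_wait` plays the part that Task.Wait or Thread.Join appears to play in Stop: the thread is nominally waiting, yet every callback from the worker runs on it before the wait returns.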

As for why you're not seeing it in the full app, I would guess that the simplicity of your reduced use case is actually causing you more pain. The full app may have waits, message pumps, cross-thread calls or something else that ends up letting the COM calls through often enough. If the full app is .NET, you should know that interoperation with COM is pretty fundamental to .NET, so whatever lets the COM calls through is not necessarily something you're doing directly.