Race-free way to asynchronously start AND cancel I/O on another thread in Windows


Background: generally, if we want to force an operation to happen asynchronously (to avoid blocking the main thread), using FILE_FLAG_OVERLAPPED is insufficient, because the operation can still complete synchronously.

So let's say that, to avoid this, we defer the operation to a worker thread dedicated to I/O. This avoids blocking the main thread.

Now the main thread can use CancelIoEx(HANDLE, LPOVERLAPPED) to cancel the I/O operation initiated (say, via ReadFile) by the worker.

However, for CancelIoEx to succeed, the main thread needs a way to guarantee that the operation has in fact started, otherwise there is nothing to cancel.

The obvious solution here is to make the worker thread set an event after its call to e.g. ReadFile returns, but that now brings us back to the original problem: since ReadFile can block, we'll have defeated the entire purpose of having a worker thread in the first place, which was to ensure that the main thread isn't blocked on the I/O.

What's the "right" way to solve this? Is there a good way to actually force an I/O operation to happen asynchronously while still being able to request its cancellation later in a race-free manner when the I/O hasn't yet finished?

The only thing I can think of is to set a timer to periodically call CancelIoEx while the I/O hasn't completed, but that seems incredibly ugly. Is there a better/more robust solution?


Solution

  • In general, you need to do the following:

    • Encapsulate every file handle that you use for asynchronous I/O in a C/C++ object (call it IO_OBJECT).

    • This object needs a reference count.

    • Before starting an asynchronous I/O operation, allocate another object that encapsulates the OVERLAPPED or IO_STATUS_BLOCK (call it IO_IRP). Inside IO_IRP, store a referenced pointer to IO_OBJECT and the operation-specific information: the I/O code (read, write, etc.), buffer pointers, and so on.

    • Check the return code of the I/O operation to determine whether there will be an I/O callback (a packet queued to the IOCP, or an APC). If the operation failed (so there will be no callback), call the callback yourself with the error code.

    • The I/O manager saves the pointer that you pass to the I/O call in the IRP structure (UserApcContext) and passes it back to you when the I/O finishes. With the Win32 API this pointer is the pointer to OVERLAPPED; with the native API you control this pointer directly.

    • When the I/O finishes (unless it failed synchronously at the start), the callback is called with the final I/O status.

    • There you get back the pointer to IO_IRP (OVERLAPPED): call the IO_OBJECT method, release its reference, and delete the IO_IRP.

    • If at some point you may close the object's handle early (not in the destructor), implement some run-down protection so that the handle is not accessed after it is closed.

    • Run-down protection is very similar to a weak reference. Unfortunately there is no user-mode API for it, but it is not hard to implement yourself.

    From any thread where you hold a (referenced, of course) pointer to your object, you can call CancelIoEx or close the object's handle. If the file has an IOCP, all I/O operations are canceled when the last handle to the file is closed. To close, however, do not call CloseHandle directly: begin the run-down and call CloseHandle when the run-down has completed, inside some ReleaseRundownProtection call (this is a demo name; no such API exists).

    A minimal typical implementation:

    class __declspec(novtable) IO_OBJECT 
    {
        friend class IO_IRP;
    
        virtual void IOCompletionRoutine(
            ULONG IoCode, 
            ULONG dwErrorCode, 
            ULONG dwNumberOfBytesTransfered, 
            PVOID Pointer) = 0;
        
        void AddRef();
        void Release();
    
        HANDLE _hFile = 0;
        LONG _nRef = 1;
        //...
    };
    
    
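    class IO_IRP : public OVERLAPPED 
    {
        IO_OBJECT* _pObj;
        PVOID Pointer;
        ULONG _IoCode;
        
    public: // IO_OBJECT creates IO_IRP instances, so the interface must be accessible
        IO_IRP(IO_OBJECT* pObj, ULONG IoCode, PVOID Pointer) : 
            OVERLAPPED{}, // zero Offset/OffsetHigh/hEvent before passing to the I/O call
            _pObj(pObj), Pointer(Pointer), _IoCode(IoCode)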
        {
            pObj->AddRef();
        }
        
        ~IO_IRP()
        {
            _pObj->Release();
        }
        
        VOID CALLBACK IOCompletionRoutine(
            ULONG dwErrorCode,
            ULONG dwNumberOfBytesTransfered
            )
        {
            _pObj->IOCompletionRoutine(_IoCode, 
                dwErrorCode, dwNumberOfBytesTransfered, Pointer);
    
            delete this;
        }
    
        static VOID CALLBACK FileIOCompletionRoutine(
            ULONG status,
            ULONG dwNumberOfBytesTransfered,
            LPOVERLAPPED lpOverlapped
            )
        {
            static_cast<IO_IRP*>(lpOverlapped)->IOCompletionRoutine(
                RtlNtStatusToDosError(status), dwNumberOfBytesTransfered);
        }
    
        static BOOL BindIoCompletion(HANDLE hObject)
        {
            return BindIoCompletionCallback(hObject, FileIOCompletionRoutine, 0);
        }
        
        void CheckErrorCode(ULONG dwErrorCode)
        {
            switch (dwErrorCode)
            {
            case NOERROR:
            case ERROR_IO_PENDING:
                return ;
            }
            IOCompletionRoutine(dwErrorCode, 0);
        }
        
        void CheckError(BOOL fOk)
        {
            return CheckErrorCode(fOk ? NOERROR : GetLastError());
        }
    };
    
    
    ///// start some I/O // no run-down protection on file
    
    if (IO_IRP* irp = new IO_IRP(this, 'some', 0))
    {
        // ReadFile returns BOOL, so translate it to an error code via CheckError
        irp->CheckError(ReadFile(_hFile, buf, cb, 0, irp));
    }
    
    ///// start some I/O // with run-down protection on file
    
    if (IO_IRP* irp = new IO_IRP(this, 'some', 0))
    {
        ULONG dwError = ERROR_INVALID_HANDLE;
        
        if (AcquireRundownProtection())
        {
            dwError = ReadFile(_hFile, buf, cb, 0, irp) ? NOERROR : GetLastError();
            ReleaseRundownProtection();
        }
        
        irp->CheckErrorCode(dwError);
    }
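
    The lifetime rules above can also be checked without any Windows API. The following is only a portable sketch under stated assumptions: IoObject and IoRequest are hypothetical stand-ins for IO_OBJECT and IO_IRP, the constants copy the numeric values of the Win32 error codes, and the "system" delivering a completion is simulated by calling Complete directly. It shows that each request holds one reference on its object and that a synchronous failure reports through the same callback path as a real completion.

```cpp
#include <atomic>
#include <cassert>

// Numeric values of the Win32 error codes used in the text.
constexpr unsigned long kNoError = 0;        // NOERROR
constexpr unsigned long kInvalidHandle = 6;  // ERROR_INVALID_HANDLE
constexpr unsigned long kIoPending = 997;    // ERROR_IO_PENDING

// Hypothetical stand-in for IO_OBJECT: reference-counted owner of the handle.
class IoObject {
    std::atomic<long> refs_{1};
public:
    int completions = 0;          // how many times the callback ran
    unsigned long lastError = 0;  // error code seen by the last callback

    void AddRef() { refs_.fetch_add(1, std::memory_order_relaxed); }
    // Returns the remaining count; real code would delete the object at zero.
    long Release() { return refs_.fetch_sub(1, std::memory_order_acq_rel) - 1; }
    long RefCount() const { return refs_.load(); }

    void OnIoComplete(unsigned long error) { ++completions; lastError = error; }
};

// Hypothetical stand-in for IO_IRP: one heap-allocated object per request.
class IoRequest {
    IoObject* obj_;
public:
    explicit IoRequest(IoObject* obj) : obj_(obj) { obj_->AddRef(); }
    ~IoRequest() { obj_->Release(); }

    // Completion path: notify the object, then destroy the request.
    void Complete(unsigned long error) {
        obj_->OnIoComplete(error);
        delete this;
    }

    // Returns true if a completion is still expected from the system.
    // On a synchronous failure no completion will arrive, so report it here.
    bool CheckErrorCode(unsigned long error) {
        if (error == kNoError || error == kIoPending) return true;
        Complete(error);
        return false;
    }
};
```

    A request that returns "pending" keeps its reference until Complete runs; a request that fails at the start releases it immediately, so the caller never has to clean up by hand in either case.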
    

    some more full implementation


    "However, for CancelIoEx to succeed, the main thread needs a way to guarantee that the operation has in fact started, otherwise there is nothing to cancel."

    Yes. Although you can safely call CancelIoEx at any time, even when there is no active I/O on the file, another thread can start a new I/O operation right after your CancelIoEx call. With this call you can cancel only specific operations that are known to have started. For instance, you begin a connect with ConnectEx and update the UI (enable the Cancel button). When ConnectEx finishes, you post a message to the UI (disable the Cancel button). If the user presses Cancel while the I/O (ConnectEx) is active, you call CancelIoEx; as a result the connect is either canceled or finishes normally a bit earlier. For periodic operations (for instance ReadFile in a loop), CancelIoEx is usually not the right way to stop the loop. Instead, call CloseHandle from the controlling thread, which effectively cancels all current I/O on the file.


    About how ReadFile and any other asynchronous I/O API works, and whether we can force the API call to return faster:

    1. The I/O manager checks the input parameters, converts handles to pointers (file handle to FILE_OBJECT), checks permissions, and so on. If an error occurs at this stage, it is returned to the caller and the I/O is finished.
    2. The I/O manager calls the driver. The driver (or several drivers, since the top driver can pass the request on to another) handles the I/O request (IRP) and finally returns to the I/O manager. It can either return STATUS_PENDING, which means the I/O is not yet complete, or complete the I/O (call IofCompleteRequest) and return another status. Any status other than STATUS_PENDING means that the I/O has completed (successfully, with an error, or canceled, but completed).
    3. The I/O manager checks for STATUS_PENDING. If the file was opened for synchronous I/O (flag FO_SYNCHRONOUS_IO), it waits in place until the I/O completes. If the file was opened for asynchronous I/O, the I/O manager itself never waits and returns the status to the caller, including STATUS_PENDING.

    We can break the wait in stage 3 by calling CancelSynchronousIo. But if the wait is inside the driver at stage 2, it is impossible to break it in any way; no Cancel*Io* call or CloseHandle helps here. With an asynchronous file handle, the I/O manager never waits in stage 3, so if the API call waits, it waits in stage 2 (the driver's handler), where we cannot break the wait.

    As a result, we cannot force an I/O call on an asynchronous file to return faster if the driver waits under some condition.

    Furthermore, why can we not break a driver's wait but can stop the I/O manager's wait? Because it is unknown how, on which object (or just a Sleep), and for which condition the driver waits, or what would happen if we broke the thread's wait before those conditions are met. So if the driver waits, it waits. The I/O manager, on the other hand, waits for the IRP to complete, and to break this wait the IRP must be completed. An API exists for this: it marks the IRP as canceled and calls the driver's cancel callback (the driver must set this callback if it returns before completing the request). In this callback the driver completes the IRP, which wakes the I/O manager from its wait (again, it waits only on synchronous files) and lets it return to the caller.

    It is also very important not to confuse the end of the I/O with the end of the API call. For a synchronous file they are the same: the API returns only after the I/O has completed. For asynchronous I/O they are different things: the I/O can still be active after the API call has returned (if it returned STATUS_PENDING, or ERROR_IO_PENDING at the Win32 layer).

    We can ask for the I/O to complete early by canceling it, and usually (if the driver is well designed) this works. But we cannot ask the API call to return early on an asynchronous file; we cannot control when or how fast the I/O call (ReadFile in this case) returns. We can, however, cancel the I/O request early once the I/O call (ReadFile) has returned. More exactly, once the driver has returned from stage 2: because the I/O manager never waits in stage 3, the I/O call returns as soon as the driver returns control.
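
    The difference between "the call returned" and "the I/O completed" can be illustrated with a toy model. This is only a portable sketch and not Windows code: ModelRequest is a hypothetical class, Start stands in for an asynchronous ReadFile that returns ERROR_IO_PENDING, and Cancel completes the request directly, whereas in the real system the driver's cancel routine does that. The constants copy the numeric values of the Win32 error codes.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <utility>

constexpr unsigned long kNoError = 0;             // NOERROR
constexpr unsigned long kOperationAborted = 995;  // ERROR_OPERATION_ABORTED
constexpr unsigned long kIoPending = 997;         // ERROR_IO_PENDING

// Hypothetical model of a single asynchronous request.
class ModelRequest {
    std::atomic<bool> done_{false};
    std::function<void(unsigned long)> callback_;
public:
    explicit ModelRequest(std::function<void(unsigned long)> cb)
        : callback_(std::move(cb)) {}

    // Models the start call on an asynchronous handle: it returns as soon
    // as the "driver" returns, while the request itself is still active.
    unsigned long Start() { return kIoPending; }

    // Completes the request exactly once; later attempts are ignored.
    // Returns true if this call was the one that completed it.
    bool Complete(unsigned long status) {
        if (done_.exchange(true)) return false;
        callback_(status);
        return true;
    }

    // Asking for early completion: the callback sees an "aborted" status.
    bool Cancel() { return Complete(kOperationAborted); }
};
```

    In this model a cancel request only makes sense after Start has returned, and the final status is whichever of "canceled" or "finished normally" happens first, which matches the ConnectEx example above.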


    If one thread uses a file handle while another can close it without any synchronization, this of course leads to races and errors. In the best case, an API call returns ERROR_INVALID_HANDLE after another thread has closed the handle. In the worst case, the handle value is reused after the close and we start using the wrong handle, with undefined results. To protect against this, use the handle only inside run-down protection (similar to converting a weak reference to a strong one). Demo implementation:

    class IoObject
    {
        HANDLE _hFile = INVALID_HANDLE_VALUE;
        LONG _lock = 0x80000000;
    
    public:
        HANDLE LockHandle() 
        {
            LONG Value, PreviousValue;
    
            if (0 > (Value = _lock))
            {
                do 
                {
                    PreviousValue = InterlockedCompareExchangeNoFence(&_lock, Value + 1, Value);
    
                    if (PreviousValue == Value) return _hFile;
    
                } while (0 > (Value = PreviousValue));
            }
        
            return 0;
        }
    
        void UnlockHandle()
        {
            if (InterlockedDecrement(&_lock) == 0)
            {
                _hFile = 0; // CloseHandle(_hFile)
            }
        }
    
        void Close()
        {
            if (LockHandle())
            {
                _interlockedbittestandreset(&_lock, 31);
                UnlockHandle();
            }
        }
    
        void WrongClose()
        {
            _hFile = 0; // CloseHandle(_hFile)
        }
    
        BOOL IsHandleClosed()
        {
            return _hFile == 0;
        }
    };
    
    ULONG WINAPI WorkThread(IoObject* pObj)
    {
        ULONG t = GetTickCount();
        int i = 0x1000000;
        do 
        {
            if (HANDLE hFile = pObj->LockHandle())
            {
                SwitchToThread(); // simulate delay
    
                if (pObj->IsHandleClosed())
                {
                    __debugbreak();
                }
    
                pObj->UnlockHandle();
            }
            else
            {
                DbgPrint("[%x]: handle closed ! (%u ms)\n", GetCurrentThreadId(), GetTickCount() - t);
                break;
            }
        } while (--i);
    
        return 0;
    }
    
    ULONG WINAPI WorkThreadWrong(IoObject* pObj)
    {
        ULONG t = GetTickCount();
        int i = 0x10000000;
        do 
        {
            if (pObj->IsHandleClosed())
            {
                DbgPrint("[%x]: handle closed ! (%u ms)\n", GetCurrentThreadId(), GetTickCount() - t);
                break;
            }
            
            SwitchToThread(); // simulate delay
    
            if (pObj->IsHandleClosed())
            {
                __debugbreak();
            }
    
        } while (--i);
    
        return 0;
    }
    
    void CloseTest()
    {
        IoObject obj;
    
        ULONG n = 8;
        do 
        {
            if (HANDLE hThread = CreateThread(0, 0x1000, (PTHREAD_START_ROUTINE)WorkThread, &obj, 0, 0))
            {
                CloseHandle(hThread);
            }
        } while (--n);
    
        Sleep(50);
    //#define _WRONG_
    #ifdef _WRONG_
        obj.WrongClose();
    #else
        obj.Close();
    #endif
        MessageBoxW(0,0,0,0);
    }
    

    With the WrongClose() call we will consistently hit __debugbreak() (use after close) in WorkThread[Wrong]. But with obj.Close() and WorkThread we must never hit the breakpoint. Also note that Close() is lock-free, and its caller never waits or hangs even if an API call inside the run-down protection waits.
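
    For readers without the Interlocked* intrinsics at hand, the same counting scheme can be sketched with std::atomic. This is a minimal portable sketch, not the code above: RundownHandle is a hypothetical name, an int stands in for the HANDLE, and the place where real code would call CloseHandle is only marked with a comment. As in the demo, the top bit of the counter means "still open" and the low bits count the current users.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

class RundownHandle {
    static constexpr std::uint32_t kOpen = 0x80000000u;  // top bit: not yet closed
    std::atomic<std::uint32_t> lock_{kOpen};
    std::atomic<int> handle_;  // hypothetical stand-in for HANDLE

public:
    explicit RundownHandle(int h) : handle_(h) {}

    // Try to take a usage reference; fails once Close() has started.
    bool Lock(int& out) {
        std::uint32_t v = lock_.load(std::memory_order_relaxed);
        while (v & kOpen) {
            // On failure compare_exchange_weak reloads v, so the loop re-checks the bit.
            if (lock_.compare_exchange_weak(v, v + 1,
                                            std::memory_order_acquire,
                                            std::memory_order_relaxed)) {
                out = handle_.load();
                return true;
            }
        }
        return false;
    }

    // Drop a usage reference; the last user after Close() releases the handle.
    void Unlock() {
        if (lock_.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            handle_.store(0);  // real code would call CloseHandle here
        }
    }

    // Begin run-down: no new Lock() succeeds; never blocks on current users.
    void Close() {
        int unused = 0;
        if (Lock(unused)) {
            lock_.fetch_and(~kOpen, std::memory_order_acq_rel);
            Unlock();
        }
    }

    bool IsClosed() const { return handle_.load() == 0; }
};
```

    Typical use is to call Lock before every API call on the handle and Unlock right after it returns. Close can then be called from any thread: it returns immediately, and the handle is actually released by whichever thread drops the last reference.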