c++, multithreading, visual-c++, wait, condition-variable

How is CONDITION_VARIABLE implemented?


A longer version of the title question would be:

On my machine, sizeof(std::condition_variable) is 72 bytes. What are these 72 bytes used for?

Note: The size of std::condition_variable depends on the implementation. Some example sizes are given in Appendix A.
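To check what your own toolchain reports, a minimal sketch like the following can help. The struct SyncSizes and the function sync_sizes are hypothetical names introduced only for this illustration; the sizes themselves are implementation-defined, so no particular value is assumed.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical helper type: groups the sizes we want to compare.
struct SyncSizes {
    std::size_t condition_variable;
    std::size_t mutex;
    std::size_t pointer;
};

// Collect the sizes reported by the current toolchain.
// Nothing here assumes a specific value (72, 16, 8, ...).
constexpr SyncSizes sync_sizes() {
    return {sizeof(std::condition_variable), sizeof(std::mutex), sizeof(void*)};
}
```

Printing the three fields from main (for example with std::printf and the %zu format) shows how far the library type is from a single pointer on a given platform.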

To understand how std::condition_variable works, it is enough for me to understand wait, notify_one, and the member objects. I will start with wait. The overload of wait that takes a predicate is given below.

    template <class _Predicate>
    void wait(unique_lock<mutex>& _Lck, _Predicate _Pred) { // wait for signal and test predicate
        while (!_Pred()) {
            wait(_Lck);
        }
    }

The above wait calls the no-predicate wait.

    void wait(unique_lock<mutex>& _Lck) { // wait for signal
        // Nothing to do to comply with LWG-2135 because std::mutex lock/unlock are nothrow
        _Cnd_wait(_Mycnd(), _Lck.mutex()->_Mymtx());
    }

This wait calls _Cnd_wait on _Mycnd(). _Cnd_wait is found here.

int _Cnd_wait(const _Cnd_t cond, const _Mtx_t mtx) { // wait until signaled
    const auto cs = static_cast<Concurrency::details::stl_critical_section_interface*>(_Mtx_getconcrtcs(mtx));
    _Mtx_clear_owner(mtx);
    cond->_get_cv()->wait(cs);
    _Mtx_reset_owner(mtx);
    return _Thrd_success; // TRANSITION, ABI: Always returns _Thrd_success
}
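For reference, the predicate overload shown earlier is just a loop around the plain wait, so a caller can write that loop by hand. The sketch below uses only standard library calls; the function name wait_for_value and the variables ready and value are hypothetical and exist only for this illustration.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical helper: returns the value published by a producer thread.
int wait_for_value() {
    std::mutex m;
    std::condition_variable cv;
    bool ready = false;
    int value = 0;

    std::thread producer([&] {
        {
            std::lock_guard<std::mutex> guard(m);
            value = 42;
            ready = true;
        }
        cv.notify_one(); // wake the waiter after releasing the lock
    });

    {
        std::unique_lock<std::mutex> lock(m);
        // Hand-written equivalent of cv.wait(lock, [&] { return ready; });
        // the loop also protects against spurious wakeups.
        while (!ready) {
            cv.wait(lock);
        }
    }

    producer.join();
    return value;
}
```

Each pass through cv.wait(lock) is the call that, in the MSVC implementation quoted above, ends up in _Cnd_wait.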

_Cnd_t is a pointer to a _Cnd_internal_imp_t.

using _Cnd_t = struct _Cnd_internal_imp_t*;

The struct _Cnd_internal_imp_t is defined here.

struct _Cnd_internal_imp_t { // condition variable implementation for ConcRT
    std::aligned_storage_t<Concurrency::details::stl_condition_variable_max_size,
        Concurrency::details::stl_condition_variable_max_alignment>
        cv;

    [[nodiscard]] Concurrency::details::stl_condition_variable_interface* _get_cv() noexcept {
        // get pointer to implementation
        return reinterpret_cast<Concurrency::details::stl_condition_variable_interface*>(&cv);
    }
};

I am now looking at the line cond->_get_cv()->wait(cs);. To understand this line, I need to see Concurrency::details::stl_condition_variable_interface's member wait function. This is a virtual function.

        class __declspec(novtable) stl_condition_variable_interface {
        public:
            virtual void wait(stl_critical_section_interface*)                   = 0;
            virtual bool wait_for(stl_critical_section_interface*, unsigned int) = 0;
            virtual void notify_one()                                            = 0;
            virtual void notify_all()                                            = 0;
            virtual void destroy()                                               = 0;
        };

Edit 2

cond->_get_cv() is a pointer to the abstract class stl_condition_variable_interface. At some point during construction, create_stl_condition_variable is called, and it constructs the concrete object, which sets the virtual pointer. The virtual pointer for this object will point to the vtable for either stl_condition_variable_vista given here or stl_condition_variable_win7 given here. The top answer to this Stack Overflow question explains some of the details.

In my case, the virtual pointer points to the table for stl_condition_variable_win7.

        class stl_condition_variable_win7 final : public stl_condition_variable_interface {
        public:
            stl_condition_variable_win7() {
                InitializeConditionVariable(&m_condition_variable);
            }

            ~stl_condition_variable_win7()                                  = delete;
            stl_condition_variable_win7(const stl_condition_variable_win7&) = delete;
            stl_condition_variable_win7& operator=(const stl_condition_variable_win7&) = delete;

            void destroy() override {}

            void wait(stl_critical_section_interface* lock) override {
                if (!stl_condition_variable_win7::wait_for(lock, INFINITE)) {
                    std::terminate();
                }
            }

            bool wait_for(stl_critical_section_interface* lock, unsigned int timeout) override {
                return SleepConditionVariableSRW(&m_condition_variable,
                           static_cast<stl_critical_section_win7*>(lock)->native_handle(), timeout, 0)
                    != 0;
            }

            void notify_one() override {
                WakeConditionVariable(&m_condition_variable);
            }

            void notify_all() override {
                WakeAllConditionVariable(&m_condition_variable);
            }

        private:
            CONDITION_VARIABLE m_condition_variable;
        };

So my 72 (or 8) bytes are reserved to store a CONDITION_VARIABLE, and the essence of wait is to call SleepConditionVariableSRW. This function is described here.
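The pattern in this edit (a fixed buffer, an object constructed into it, and calls routed through an abstract interface) can be sketched portably. This is not the MSVC code: cv_interface, small_backend, cv_storage, create_backend, demo_dispatch, and the int return value of notify_one are all hypothetical names and choices made for illustration, and 72 is simply the size from the question.

```cpp
#include <cstddef>
#include <new>

constexpr std::size_t k_storage_size = 72; // assumption: mirrors the 72 bytes in the question

// Hypothetical interface, standing in for stl_condition_variable_interface.
class cv_interface {
public:
    virtual int notify_one() = 0; // returns a counter only so the sketch is observable
    virtual void destroy() = 0;

protected:
    ~cv_interface() = default; // never deleted through the interface
};

// Hypothetical small backend: a vtable pointer plus one int of state.
class small_backend final : public cv_interface {
public:
    int notify_one() override { return ++m_wakeups; }
    void destroy() override {}

private:
    int m_wakeups = 0;
};

// Reserved buffer, sized for the largest backend that might ever be used.
struct cv_storage {
    alignas(8) unsigned char bytes[k_storage_size];
};

// Construct the backend inside the buffer with placement new.
inline cv_interface* create_backend(cv_storage& storage) {
    static_assert(sizeof(small_backend) <= k_storage_size, "buffer too small");
    static_assert(alignof(small_backend) <= 8, "buffer under-aligned");
    return ::new (static_cast<void*>(storage.bytes)) small_backend();
}

// Calls go through the vtable that the constructor installed.
inline int demo_dispatch() {
    cv_storage storage;
    cv_interface* cv = create_backend(storage);
    cv->notify_one();
    const int wakeups = cv->notify_one();
    cv->destroy();
    return wakeups;
}
```

The sketch keeps the pointer returned by placement new instead of casting the buffer address, which avoids relying on reinterpret_cast semantics; the buffer size stays fixed no matter how small the chosen backend is.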

END EDIT 2

Appendix A

The only member object of std::condition_variable is

aligned_storage_t<_Cnd_internal_imp_size, _Cnd_internal_imp_alignment> _Cnd_storage;

std::condition_variable contains the below member function which allows _Cnd_storage to be interpreted as a _Cnd_t.

    _Cnd_t _Mycnd() noexcept { // get pointer to _Cnd_internal_imp_t inside _Cnd_storage
        return reinterpret_cast<_Cnd_t>(&_Cnd_storage);
    }

sizeof(std::condition_variable) is given by sizeof(_Cnd_storage), which is determined by the constants defined in xthreads.h.

// Size and alignment for _Mtx_internal_imp_t and _Cnd_internal_imp_t
#ifdef _CRT_WINDOWS
#ifdef _WIN64
_INLINE_VAR constexpr size_t _Mtx_internal_imp_size      = 32;
_INLINE_VAR constexpr size_t _Mtx_internal_imp_alignment = 8;
_INLINE_VAR constexpr size_t _Cnd_internal_imp_size      = 16;
_INLINE_VAR constexpr size_t _Cnd_internal_imp_alignment = 8;
#else // _WIN64
_INLINE_VAR constexpr size_t _Mtx_internal_imp_size      = 20;
_INLINE_VAR constexpr size_t _Mtx_internal_imp_alignment = 4;
_INLINE_VAR constexpr size_t _Cnd_internal_imp_size      = 8;
_INLINE_VAR constexpr size_t _Cnd_internal_imp_alignment = 4;
#endif // _WIN64
#else // _CRT_WINDOWS
#ifdef _WIN64
_INLINE_VAR constexpr size_t _Mtx_internal_imp_size      = 80;
_INLINE_VAR constexpr size_t _Mtx_internal_imp_alignment = 8;
_INLINE_VAR constexpr size_t _Cnd_internal_imp_size      = 72;
_INLINE_VAR constexpr size_t _Cnd_internal_imp_alignment = 8;
#else // _WIN64
_INLINE_VAR constexpr size_t _Mtx_internal_imp_size      = 48;
_INLINE_VAR constexpr size_t _Mtx_internal_imp_alignment = 4;
_INLINE_VAR constexpr size_t _Cnd_internal_imp_size      = 40;
_INLINE_VAR constexpr size_t _Cnd_internal_imp_alignment = 4;
#endif // _WIN64
#endif // _CRT_WINDOWS

Edit 1/Appendix B

I thought about this after posting the question, and I am not sure how to make it flow with the rest. std::condition_variable's only member is

aligned_storage_t<_Cnd_internal_imp_size, _Cnd_internal_imp_alignment> _Cnd_storage;

which is interpreted as _Cnd_internal_imp_t. _Cnd_internal_imp_t's only member is

std::aligned_storage_t<Concurrency::details::stl_condition_variable_max_size, Concurrency::details::stl_condition_variable_max_alignment> cv;

It is possible that stl_condition_variable_max_size != _Cnd_internal_imp_size. In fact, this is implied by this line

static_assert(sizeof(_Cnd_internal_imp_t) <= _Cnd_internal_imp_size, "incorrect _Cnd_internal_imp_size");

This would mean that it is possible that some of the 72 bytes are "unused."
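The "unused bytes" observation can be made concrete with a small portable sketch. The names reserved_size, backend_base, handle_backend, and unused_bytes are hypothetical; the layout (a vtable pointer plus one pointer-sized handle) is an assumption chosen to resemble the CONDITION_VARIABLE-backed class, not a copy of it.

```cpp
#include <cstddef>

constexpr std::size_t reserved_size = 72; // assumption: mirrors _Cnd_internal_imp_size on 64-bit

// Hypothetical abstract base, so the backend carries a vtable pointer.
struct backend_base {
    virtual void wait() = 0;

protected:
    ~backend_base() = default;
};

// Hypothetical backend: vtable pointer plus one pointer-sized handle.
struct handle_backend final : backend_base {
    void wait() override {}
    void* handle = nullptr;
};

// Same kind of check as the static_assert quoted above: the reserved
// size only has to be large enough, not exactly equal.
static_assert(sizeof(handle_backend) <= reserved_size, "incorrect reserved_size");

// Bytes of the reservation that this backend never touches.
constexpr std::size_t unused_bytes() {
    return reserved_size - sizeof(handle_backend);
}
```

On an ABI where such a backend occupies two pointers, most of the reservation would be left over; the exact figure depends on the platform, so the sketch computes it rather than hard-coding it.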

END EDIT 1

Questions:

  1. std::condition_variable reserves 72 bytes for a CONDITION_VARIABLE (see Edit 2). What are these 72 bytes used for?
  2. How could a std::condition_variable get away with fewer bytes? It appears as though on some machines std::condition_variables are only 8 bytes big. See:
    _INLINE_VAR constexpr size_t _Cnd_internal_imp_size      = 8;
    

Solution

  • std::condition_variable reserves 72 bytes for a CONDITION_VARIABLE (see Edit 2). What are these 72 bytes used for?

    There was another implementation of condition variable that was backed by Concurrency Runtime (ConcRT). In Visual Studio 2012 it was the only implementation, but it turned out to be not very good.

    Starting from VS 2015, there is a better implementation backed by the actual CONDITION_VARIABLE. Polymorphism is used to select among implementations for different Windows versions, since CONDITION_VARIABLE is available starting with Windows Vista and a complete SRWLOCK is available starting with Windows 7. The polymorphism uses placement new rather than unions to hide the implementation details and to make the implementation conformant by making it a standard-layout class.

    So the storage has to have room for any of several implementations, of which the ConcRT one is the largest.

    By contrast, sizeof(CONDITION_VARIABLE) == sizeof(void*), as well as sizeof(SRWLOCK) == sizeof(void*), though they aren't pointers internally. The rest of the size is wasted when the CONDITION_VARIABLE / SRWLOCK implementation is used.

    Starting from Visual Studio 2019, Windows XP is no longer supported by the VS toolset (VS 2019 can still target it by installing the VS 2017 toolset). So the ConcRT dependency and the ability to create a pre-Vista condition_variable were removed by my PR. A follow-up PR removed the ConcRT structure wrappers.

    Starting from Visual Studio 2022, Windows Vista is no longer supported by the VS toolset either; my other PR to remove the SRWLOCK polymorphism is in flight.

    Still, due to the ABI compatibility between VS 2015, VS 2017, VS 2019, and VS 2022, it is not possible to reduce the size of condition_variable.

    Getting rid of placement new in the mutex constructor and fixing the conformance issue of the mutex constructor being non-constexpr is also hard (my attempt failed).

    So, VS 2019 and VS 2022 still have to reserve space for the ConcRT implementation, which is no longer used.

    With the next ABI breaking release of Visual Studio it is highly likely that the implementation of condition_variable will change.


  • How could a std::condition_variable get away with fewer bytes?

    The _CRT_WINDOWS implementation never needed to support Windows XP, so it does not have a ConcRT fallback. Still, it shares the implementation with the usual configuration, apparently for maintenance reasons.