Why doesn't std::lock_guard/std::unique_lock use type erasure?

Why do std::lock_guard and std::unique_lock require the mutex type to be specified as a template parameter?

Consider the following alternative. First, in a detail namespace, there are type erasure classes (a non-template abstract base class, and a template derived class):

#include <cstddef>      // std::size_t
#include <new>          // placement new
#include <type_traits>
#include <mutex>
#include <chrono>
#include <iostream>

namespace detail {

    // Non-template interface: the guard only ever talks to this type.
    struct locker_unlocker_base {
        virtual void lock() = 0;
        virtual void unlock() = 0;
    };

    // Template implementation: remembers the concrete mutex type.
    template<class Mutex>
    struct locker_unlocker : public locker_unlocker_base {
        explicit locker_unlocker(Mutex &m) : m_m{&m} {}
        void lock() override { m_m->lock(); }
        void unlock() override { m_m->unlock(); }
        Mutex *m_m;
    };
}

Now te_lock_guard, the type erasure lock guard, simply placement-news an object of the correct type when constructed (without dynamic memory allocation):

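class te_lock_guard {
public:
    template<class Mutex>
    te_lock_guard(Mutex &m) {
        // The buffer is sized for locker_unlocker<std::mutex>; check that the
        // wrapper for this Mutex fits instead of silently assuming it does.
        static_assert(sizeof(detail::locker_unlocker<Mutex>) <= sizeof(m_buf), "buffer too small");
        static_assert(alignof(detail::locker_unlocker<Mutex>) <= alignof(decltype(m_buf)), "buffer under-aligned");
        // Keep the pointer returned by placement new rather than casting the buffer.
        m_impl = new (&m_buf) detail::locker_unlocker<Mutex>(m);
        m_impl->lock();
    }
    ~te_lock_guard() {
        m_impl->unlock();
    }

    // Copying a guard would unlock the same mutex twice.
    te_lock_guard(const te_lock_guard &) = delete;
    te_lock_guard &operator=(const te_lock_guard &) = delete;

private:
    std::aligned_storage<sizeof(detail::locker_unlocker<std::mutex>), alignof(detail::locker_unlocker<std::mutex>)>::type m_buf;
    detail::locker_unlocker_base *m_impl;
};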

I've checked the performance vs. the standard library's classes:

int main() {
    constexpr std::size_t num{999999};
    {
        std::chrono::steady_clock::time_point begin = std::chrono::steady_clock::now();
        for(std::size_t i = 0; i < num; ++i) {
            std::mutex m;
            te_lock_guard l(m);
        }
        std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
        std::cout << std::chrono::duration_cast<std::chrono::microseconds>(end - begin).count() << std::endl;
    }
    {
        std::chrono::steady_clock::time_point begin = std::chrono::steady_clock::now();
        for(std::size_t i = 0; i < num; ++i) {
            std::mutex m;
            std::unique_lock<std::mutex> l(m);
        }
        std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
        std::cout << std::chrono::duration_cast<std::chrono::microseconds>(end - begin).count() << std::endl;
    }
}

Using g++ with -O3, there is no statistically significant performance loss in this single-threaded, uncontended test.

Solution

Because it complicates the implementation for no significant benefit, and it hides the fact that std::lock_guard and std::unique_lock know the type of the mutex they are guarding at compile time.
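To make the second point concrete, here is a minimal sketch of what the template parameter exposes. The function name typed_lock_demo is made up for illustration; everything else is standard library. A std::unique_lock publishes its mutex type as mutex_type, hands back a typed pointer from mutex(), and forwards timed operations such as try_lock_for to the mutex. A type-erased guard like te_lock_guard has none of this available in its type.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <type_traits>

// Hypothetical demo function; the name is illustrative only.
bool typed_lock_demo() {
    std::timed_mutex tm;

    // The lock's type records which mutex type it guards.
    using lock_t = std::unique_lock<std::timed_mutex>;
    static_assert(std::is_same<lock_t::mutex_type, std::timed_mutex>::value,
                  "mutex_type is part of the lock's public interface");

    // Timed locking is forwarded to the mutex's own try_lock_for.
    // try_lock_for may fail spuriously, so fall back to a blocking lock.
    lock_t l(tm, std::defer_lock);
    if (!l.try_lock_for(std::chrono::milliseconds(10))) {
        l.lock();
    }

    // mutex() returns a typed pointer, not an erased handle.
    std::timed_mutex *p = l.mutex();
    return l.owns_lock() && p == &tm;
}
```

Other parts of the library rely on this typing too: std::condition_variable::wait, for example, takes a std::unique_lock<std::mutex>& specifically.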

Your solution is a workaround for the fact that class template arguments are not deduced from constructor arguments. This is addressed in the upcoming standard.

Having to spell out the mutex type is annoying boilerplate that will go away in C++17 (not only for lock guards) thanks to the "Template parameter deduction for constructors" proposal (P0091R3).

The proposal, which was accepted, allows class template arguments to be deduced from constructor arguments. This removes the need for make_xxx(...) helper functions and for explicitly specifying types that the compiler should be able to deduce:

// Valid C++17
for(size_t i = 0; i < num; ++i) {
    std::mutex m;
    std::unique_lock l(m);
}
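For comparison, here is a rough sketch of the make_xxx-style helper that this deduction makes unnecessary. Both make_unique_lock and deduction_demo are hypothetical names, not part of the standard library. The helper works before C++17 because std::unique_lock is movable; the C++17 branch is guarded so that the snippet still compiles with an older -std setting.

```cpp
#include <mutex>
#include <type_traits>

// Hypothetical pre-C++17 helper (not a standard library function):
// a function template can deduce Mutex where a class template could not.
template <class Mutex>
std::unique_lock<Mutex> make_unique_lock(Mutex &m) {
    return std::unique_lock<Mutex>(m);
}

// Hypothetical demo function; the name is illustrative only.
bool deduction_demo() {
    // A recursive mutex, so the same thread may hold two locks at once.
    std::recursive_mutex m;

    auto a = make_unique_lock(m);
    static_assert(std::is_same<decltype(a),
                               std::unique_lock<std::recursive_mutex>>::value,
                  "the helper deduces the mutex type");
    bool ok = a.owns_lock();

#if __cplusplus >= 201703L
    // C++17: the class template argument is deduced from the constructor.
    std::unique_lock b(m);
    static_assert(std::is_same<decltype(b), decltype(a)>::value,
                  "deduction yields the same type as the helper");
    ok = ok && b.owns_lock();
#endif
    return ok;
}
```

Either way the lock object still has the full type std::unique_lock<std::recursive_mutex>; only the spelling at the call site changes.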