Why can mutex be used in different threads?

Writing to the same variable from multiple threads simultaneously causes undefined behavior and crashes. Why does using a mutex not cause undefined behavior, even though mutexes are variables too?
If a mutex can somehow be used simultaneously, why not make all variables usable simultaneously without locking?

My only research so far was pressing Show definition on mutex::lock in Visual Studio, which eventually leads to the _Mtx_lock function, shown without an implementation. I then found its implementation (Windows), although it also calls some functions whose implementations are not shown:

 int _Mtx_lock(_Mtx_t mtx)
 {    /* lock mutex */
 return (mtx_do_lock(mtx, 0));
 }

 static int mtx_do_lock(_Mtx_t mtx, const xtime *target)
 {    /* lock mutex */
 if ((mtx->type & ~_Mtx_recursive) == _Mtx_plain)
     {    /* set the lock */
     if (mtx->thread_id != static_cast<long>(GetCurrentThreadId()))
         {    /* not current thread, do lock */
         mtx->_get_cs()->lock();
         mtx->thread_id = static_cast<long>(GetCurrentThreadId());
         }
     ++mtx->count;

     return (_Thrd_success);
     }
 else
     {    /* handle timed or recursive mutex */
     int res = WAIT_TIMEOUT;
     if (target == 0)
         {    /* no target --> plain wait (i.e. infinite timeout) */
         if (mtx->thread_id != static_cast<long>(GetCurrentThreadId()))
             mtx->_get_cs()->lock();
         res = WAIT_OBJECT_0;

         }
     else if (target->sec < 0 || target->sec == 0 && target->nsec <= 0)
         {    /* target time <= 0 --> plain trylock or timed wait for */
             /* time that has passed; try to lock with 0 timeout */
             if (mtx->thread_id != static_cast<long>(GetCurrentThreadId()))
                 {    /* not this thread, lock it */
                 if (mtx->_get_cs()->try_lock())
                     res = WAIT_OBJECT_0;
                 else
                     res = WAIT_TIMEOUT;
                 }
             else
                 res = WAIT_OBJECT_0;

         }
     else
         {    /* check timeout */
         xtime now;
         xtime_get(&now, TIME_UTC);
         while (now.sec < target->sec
             || now.sec == target->sec && now.nsec < target->nsec)
             {    /* time has not expired */
             if (mtx->thread_id == static_cast<long>(GetCurrentThreadId())
                 || mtx->_get_cs()->try_lock_for(
                     _Xtime_diff_to_millis2(target, &now)))
                 {    /* stop waiting */
                 res = WAIT_OBJECT_0;
                 break;
                 }
             else
                 res = WAIT_TIMEOUT;

             xtime_get(&now, TIME_UTC);
             }
         }
     if (res != WAIT_OBJECT_0 && res != WAIT_ABANDONED)
         ;

     else if (1 < ++mtx->count)
         {    /* check count */
         if ((mtx->type & _Mtx_recursive) != _Mtx_recursive)
             {    /* not recursive, fixup count */
             --mtx->count;
             res = WAIT_TIMEOUT;
             }
         }
     else
         mtx->thread_id = static_cast<long>(GetCurrentThreadId());

     switch (res)
         {
     case WAIT_OBJECT_0:
     case WAIT_ABANDONED:
         return (_Thrd_success);

     case WAIT_TIMEOUT:
         if (target == 0 || (target->sec == 0 && target->nsec == 0))
             return (_Thrd_busy);
         else
             return (_Thrd_timedout);

     default:
         return (_Thrd_error);
         }
     }
 }

So, based on this code and the atomic_ types, I think a mutex could be written the following way:

atomic_bool state = false;

void lock()
{
if(!state)
    state = true;
else
    while(state){}
}

void unlock()
{
state = false;
}

bool try_lock()
{
if(!state)
   state = true;
else
   return false;

return true;
}

Solution

As you have found, std::mutex is thread-safe because it is built on atomic operations. A simple version of it can be reproduced with std::atomic_bool. Accessing an atomic variable from multiple threads is not undefined behavior, because that is exactly what those types are for.

From the C++ standard (emphasis mine):

The execution of a program contains a data race if it contains two conflicting actions in different threads, at least one of which is not atomic, and neither happens before the other. Any such data race results in undefined behavior.

Atomic variables are implemented using the CPU's atomic operations. Ordinary variables do not get this treatment because those operations take longer to execute and would be wasted on variables that are only ever used from one thread.
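
To illustrate the difference, here is a minimal sketch of several threads incrementing one shared std::atomic<int>. The function name count_with_atomic and its parameters are made up for this example. Because every increment is an atomic read-modify-write, no updates are lost; with a plain int in its place, the same loop would be a data race.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: increments one shared atomic counter from several
// threads and returns the final value.
int count_with_atomic(int num_threads, int increments_per_thread)
{
    std::atomic<int> counter{0};
    std::vector<std::thread> workers;

    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i)
                counter.fetch_add(1);  // atomic read-modify-write, no data race
        });
    }

    for (auto& w : workers)
        w.join();  // wait for all threads before reading the result

    return counter.load();
}
```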

Your example is not thread-safe:

void lock()
{
if(!state)
    state = true;
else
    while(state){}
}

If two threads check if(!state) simultaneously, both may enter the if branch, and both then believe they own the lock:

Thread 1        Thread 2
if (!state)     
                if (!state)
                state=true;
state=true;

You must use an atomic exchange (or compare-and-exchange) function to ensure that another thread cannot come in between checking the value and changing it.

void lock()
{
    bool expected;
    do {
        expected = false;
    } while (!state.compare_exchange_weak(expected, true));
}
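
For completeness, the same idea can be extended to the try_lock and unlock functions from the question. The sketch below is only an illustration, not how std::mutex is actually implemented: the class name SpinLock and the helper count_with_spinlock are hypothetical, and it uses exchange rather than compare_exchange_weak, which works equally well for a boolean flag.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical minimal spinlock, for illustration only.
class SpinLock {
public:
    void lock()
    {
        // exchange() stores true and returns the previous value in one atomic
        // step, so only the thread that saw false has acquired the lock.
        while (state_.exchange(true, std::memory_order_acquire)) {
            // busy-wait until the owner calls unlock()
        }
    }

    bool try_lock()
    {
        // Succeeds only if the flag was false before this call.
        return !state_.exchange(true, std::memory_order_acquire);
    }

    void unlock()
    {
        state_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> state_{false};
};

// Hypothetical helper: protects a plain (non-atomic) int with the spinlock.
int count_with_spinlock(int num_threads, int increments_per_thread)
{
    SpinLock lock;
    int counter = 0;
    std::vector<std::thread> workers;

    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&lock, &counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                std::lock_guard<SpinLock> guard(lock);  // lock() / unlock()
                ++counter;
            }
        });
    }

    for (auto& w : workers)
        w.join();

    return counter;
}
```

Since SpinLock provides lock, try_lock and unlock, it can be used with std::lock_guard just like std::mutex.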

You can also add a counter and give other threads time to execute if the wait takes a long time (Sleep here is the Windows API function, which takes a duration in milliseconds):

void lock()
{
    bool expected;
    size_t counter = 0;
    do {
        expected = false;
        if (counter > 100) {
            Sleep(10);
        }
        else if (counter > 20) {
            Sleep(5);
        }
        else if (counter > 3) {
            Sleep(1);
        }
        counter++;
    } while (!state.compare_exchange_weak(expected, true));
}
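
If you prefer not to depend on the Windows Sleep function, roughly the same backoff can be written with the standard library. This is a sketch under that assumption: the function names lock_with_backoff and unlock_flag are hypothetical, and the thresholds simply mirror the ones above.

```cpp
#include <atomic>
#include <chrono>
#include <cstddef>
#include <thread>

// Hypothetical portable variant of the backoff loop above.
void lock_with_backoff(std::atomic<bool>& state)
{
    using std::chrono::milliseconds;
    std::size_t attempts = 0;

    // exchange() returns the previous value; false means we took the lock.
    while (state.exchange(true, std::memory_order_acquire)) {
        if (attempts > 100)
            std::this_thread::sleep_for(milliseconds(10));
        else if (attempts > 20)
            std::this_thread::sleep_for(milliseconds(5));
        else if (attempts > 3)
            std::this_thread::sleep_for(milliseconds(1));
        else
            std::this_thread::yield();  // first few retries: just yield
        ++attempts;
    }
}

// Hypothetical counterpart that releases the flag.
void unlock_flag(std::atomic<bool>& state)
{
    state.store(false, std::memory_order_release);
}
```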