Understanding the acquire-release example from the book C++ Concurrency in Action

Could you please explain why the assert can fire? I cannot understand the explanation below. Both if(y.load... and if(x.load... use std::memory_order_acquire. Isn't this enough? The book says that we need seq_cst for the assert not to fire. How does seq_cst help here?

#include <atomic>
#include <thread>
#include <assert.h>

std::atomic<bool> x,y;
std::atomic<int> z;

void write_x()
{
 x.store(true,std::memory_order_release);
}

void write_y()
{
 y.store(true,std::memory_order_release);
}

void read_x_then_y()
{
 while(!x.load(std::memory_order_acquire));
 if(y.load(std::memory_order_acquire)) 
  ++z;
}

void read_y_then_x()
{
 while(!y.load(std::memory_order_acquire));
 if(x.load(std::memory_order_acquire)) 
  ++z;
}
int main()
{
 x=false;
 y=false;
 z=0;

 std::thread a(write_x);
 std::thread b(write_y);
 std::thread c(read_x_then_y);
 std::thread d(read_y_then_x);

 a.join();
 b.join();
 c.join();
 d.join();

 assert(z.load()!=0); 
}

In this case the assert can fire (like in the relaxed-ordering case), because it’s possible for both the load of x and the load of y to read false. x and y are written by different threads, so the ordering from the release to the acquire in each case has no effect on the operations in the other threads.

Solution

See https://en.cppreference.com/w/cpp/atomic/memory_order

memory_order_seq_cst: A load operation with this memory order performs an acquire operation, a store performs a release operation, and read-modify-write performs both an acquire operation and a release operation, plus a single total order exists in which all threads observe all modifications in the same order (see Sequentially-consistent ordering below).

Without memory_order_seq_cst, different threads can see atomic writes to different variables happen in different orders. It's possible for read_x_then_y to see the writes in this order:

x = true;
y = true;

Whilst read_y_then_x could see:

y = true;
x = true;

In this case, if each thread tests its second variable between the two writes as that thread sees them, both of those loads return false, neither thread increments z, and the assert fires.

The atomic writes are likely happening on different CPU cores, which then have to communicate with each other about the writes. Without memory_order_seq_cst, the cores won't necessarily go to the extra effort of making sure that they all have the same view of the order in which the writes occurred.
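For comparison, here is a minimal sketch of the same experiment with every operation using memory_order_seq_cst (which is also the default when no order is given). The helper name run_seq_cst_once is hypothetical and not from the book; it only wraps the four threads in a function so the experiment can be repeated. Because all seq_cst operations fall into one total order that every thread agrees on, whichever of x and y is stored second in that order must be seen as true by the thread that waited on it, so at least one thread increments z.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper (not from the book): runs the four-thread experiment
// once with memory_order_seq_cst on every operation and returns the final z.
int run_seq_cst_once()
{
    std::atomic<bool> x{false}, y{false};
    std::atomic<int> z{0};

    // Writers: one store each, both part of the single total order.
    std::thread a([&] { x.store(true, std::memory_order_seq_cst); });
    std::thread b([&] { y.store(true, std::memory_order_seq_cst); });

    // Readers: spin on one flag, then test the other.
    std::thread c([&] {
        while (!x.load(std::memory_order_seq_cst)) {}
        if (y.load(std::memory_order_seq_cst))
            ++z; // operator++ on std::atomic is seq_cst by default
    });
    std::thread d([&] {
        while (!y.load(std::memory_order_seq_cst)) {}
        if (x.load(std::memory_order_seq_cst))
            ++z;
    });

    a.join();
    b.join();
    c.join();
    d.join();

    int result = z.load();
    // With seq_cst, the readers cannot disagree on the order of the two
    // stores, so at least one of them must have incremented z.
    assert(result != 0);
    return result;
}
```

Calling run_seq_cst_once() from main returns 1 or 2 depending on how the threads interleave. Replacing the orders with release and acquire, as in the question, makes a result of 0 permitted by the standard, although on strongly ordered hardware such as x86 you may never actually observe it.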