c++c++11 thread-safety atomic compare-and-swap

C++11 Lock-free sequence number generator safe?

The goal is to implement a sequence number generator in modern C++. The context is in a concurrent environment.

Requirement #1 The class must be singleton (common for all threads)

Requirement #2 The type used for the numbers is 64-bit integer.

Requirement #3 The caller can request more than one numbers

Requirement #4 This class will cache a sequence of numbers before being able serve the calls. Because it caches a sequence, it must also store the upper bound -> the maximum number to be able to return.

Requirement #5 Last but not least, at startup (constructor) and when there are no available numbers to give ( n_requested > n_avalaible ), the singleton class must query the database to get a new sequence. This load from DB, updates both seq_n_ and max_seq_n_.

A brief draft for its interface is the following:

class singleton_sequence_manager {

public:

    static singleton_sequence_manager& instance() {
        static singleton_sequence_manager s;
        return s;
    }

    std::vector<int64_t> get_sequence(int64_t n_requested);

private:
    singleton_sequence_manager(); //Constructor
    void get_new_db_sequence(); //Gets a new sequence from DB

    int64_t seq_n_;
    int64_t max_seq_n_;
}

Example just to clarify the use case. Suppose that at startup, DB sets seq_n_ to 1000 and max_seq_n_ to 1050:

get_sequence.(20); //Gets [1000, 1019]
get_sequence.(20); //Gets [1020, 1039]
get_sequence.(5); //Gets [1040, 1044]
get_sequence.(10); //In order to serve this call, a new sequence must be load from DB

Obviously, an implementation using locks and std::mutex is quite simple.

What I am interested into is implementing a lock-free version using std::atomic and atomic operations.

My first attempt is the following one:

int64_t seq_n_;
int64_t max_seq_n_;

were changed to:

std::atomic<int64_t> seq_n_;
std::atomic<int64_t> max_seq_n_;

Get a new sequence from DB just sets the new values in the atomic variables:

void singleton_sequence_manager::get_new_db_sequence() {
    //Sync call is made to DB
    //Let's just ignore unhappy paths for simplicity
    seq_n_.store( start_of_seq_got_from_db );
    max_seq_n_.store( end_of_seq_got_from_db );
    //At this point, the class can start returning numbers in [seq_n_ : max_seq_n_]
}

And now the get_sequence function using atomic compare and swap technique:

std::vector<int64_t> singleton_sequence_manager::get_sequence(int64_t n_requested) {

    bool succeeded{false};
    int64_t current_seq{};
    int64_t next_seq{};

    do {

        current_seq = seq_n_.load();
        do {
            next_seq = current_seq + n_requested + 1;
        }
        while( !seq_n_.compare_exchange_weak( current_seq, next_seq ) );
        //After the CAS, the caller gets the sequence [current_seq:next_seq-1]

        //Check if sequence is in the cached bound.
        if( max_seq_n_.load() > next_seq - 1 )
            succeeded = true;
        else //Needs to load new sequence from DB, and re-calculate again
            get_new_db_sequence();

    }        
    while( !succeeded );

    //Building the response        
    std::vector<int64_t> res{};
    res.resize(n_requested);
    for(int64_t n = current_seq ; n < next_seq ; n++)
        res.push_back(n);

    return res;
}

Thoughts:

I am really concerned for the lock-free version. Is the implementation safe ? If we ignore the DB load part, obviously yes. The problem arises (in my head at least) when the class has to load a new sequence from the DB. Is the update from DB safe ? Two atomic stores ?
My second attempt was to combine both seq_n_ and max_seq_n_ into a struct called sequence and use a single atomic variable std::atomic but the compiler failed. Because the size of the struct sequence is greater than 64-bit.
Is it possible to somehow protect the DB part by using an atomic flag for marking if the sequence is ready yet: flag set to false while waiting the db load to finish and to update both atomic variables. Therefore, get_sequence must be updated in order to wait for flag to bet set to true. (Use of spin lock ?)

Solution

Your lock-free version has a fundamental flaw, because it treats two independent atomic variables as one entity. Since writes to seq_n_ and max_seq_n_ are separate statements, they can be separated during execution resulting in the use of one of them with a value that is incorrect when paired with the other.

For example, one thread can get past the CAS inner while loop (with an n_requested that is too large for the current cached sequence), then be suspended before checking if it is cached. A second thread can come thru and update the max_seq_n value to a larger value. The first thread then resumes, and passes the max_seq_n check because the value was updated by the second thread. It is now using an invalid sequence.

A similar thing can happen in get_new_db_sequence between the two store calls.

Since you're writing to two distinct locations (even if adjacent in memory), and they cannot be updated atomically (due to the combined size of 128 bits not being a supported atomic size with your compiler), the writes must be protected by a mutex.

A spin lock should only be used for very short waits, since it does consume CPU cycles. A typical usage would be to use a short spin lock, and if the resource is still unavailable use something more expensive (like a mutex) to wait using CPU time.