I'm trying to implement a protected variable that does not use locks in C++11. I have read a little about optimistic concurrency, but I can't understand how can it be implemented neither in C++ nor in any language.
The way I'm trying to implement the optimistic concurrency is by using a 'last modification id'. The process I'm doing is:
The problem I see is that, after comparing the 'last modification ids' (local copy and current one) and before commiting the changes, there is no way to assure that no other threads have modified the value of the protected variable.
Below there is a example of code. Lets suppose that are many threads executing that code and sharing the variable var
.
/**
* This struct is pretended to implement a protected variable,
* but using optimistic concurrency instead of locks.
*/
struct ProtectedVariable final {
ProtectedVariable() : var(0), lastModificationId(0){ }
int getValue() const {
return var.load();
}
void setValue(int val) {
// This method is not atomic, other thread could change the value
// of val before being able to increment the 'last modification id'.
var.store(val);
lastModificationId.store(lastModificationId.load() + 1);
}
size_t getLastModificationId() const {
return lastModificationId.load();
}
private:
std::atomic<int> var;
std::atomic<size_t> lastModificationId;
};
ProtectedVariable var;
/**
* Suppose this method writes a value in some sort of database.
*/
int commitChanges(int val){
// Now, if nobody has changed the value of 'var', commit its value,
// retry the transaction otherwise.
if(var.getLastModificationId() == currModifId) {
// Here is one of the problems. After comparing the value of both Ids, other
// thread could modify the value of 'var', hence I would be
// performing the commit with a corrupted value.
var.setValue(val);
// Again, the same problem as above.
writeToDatabase(val);
// Return 'ok' in case of everything has gone ok.
return 0;
} else {
// If someone has changed the value of var while trying to
// calculating and commiting it, return error;
return -1;
}
}
/**
* This method is pretended to be atomic, but without using locks.
*/
void modifyVar(){
// Get the modification id for checking whether or not some
// thread has modified the value of 'var' after commiting it.
size_t currModifId = lastModificationId.load();
// Get a local copy of 'var'.
int currVal = var.getValue();
// Perform some operations basing on the current value of
// 'var'.
int newVal = currVal + 1 * 2 / 3;
if(commitChanges(newVal) != 0){
// If someone has changed the value of var while trying to
// calculating and commiting it, retry the transaction.
modifyVar();
}
}
I know that the above code is buggy, but I don't understand how to implement something like the above in a correct way, without bugs.
Optimistic concurrency doesn't mean that you don't use the locks, it merely means that you don't keep the locks during most of the operation.
The idea is that you split your modification into three parts:
The main assumption here is that computation part is much more expensive that the commit part. If your modification is trivial and the computation cheap, then you can just use a lock, which is much simpler.
Some example code structured into these 3 parts could look like this:
struct Data {
...
}
...
std::mutex lock;
volatile const Data* value; // The protected data
volatile int current_value_version = 0;
...
bool modifyProtectedValue() {
// Initialize.
int version_on_entry = current_value_version;
// Compute the new value, using the current value.
// We don't have any lock here, so it's fine to make heavy
// computations or block on I/O.
Data* new_value = new Data;
compute_new_value(value, new_value);
// Commit or fail.
bool success;
lock.lock();
if (current_value_version == version_on_entry) {
value = new_value;
current_value_version++;
success = true;
} else {
success = false;
}
lock.unlock();
// Roll back in case of failure.
if (!success) {
delete new_value;
}
// Inform caller about success or failure.
return success;
}
// It's cleaner to keep retry logic separately.
bool retryModification(int retries = 5) {
for (int i = 0; i < retries; ++i) {
if (modifyProtectedValue()) {
return true;
}
}
return false;
}
This is a very basic approach, and especially the rollback is trivial. In real world example re-creating the whole Data object (or it's counterpart) would be likely infeasible, so the versioning would have to be done somewhere inside, and the rollback could be much more complex. But I hope it shows the general idea.