Lately, i've been reading about Database sharding and database concurrency control as well as ACID properties and i've been thinking about some scenarios that are a little bit tricky for me.
Suppose we want to have a transaction for money transfer from an Account to another. Suppose we have Customers (Accounts) sharded by Country, like US Customers on a specific server separated from European Customers (for the sake of scalability)
A transaction for such system should be something like:
BEGIN TRANSACTION
UPDATE FROM Account SET balance = balance - 100.0 WHERE id = 1;
UPDATE FROM Account SET balance = balance + 100.0 WHERE id = 2;
COMMIT;
1- Suppose that Account #1 is from Europe and Account #2 is from US. How ACID properties are going to be kept in this situation ? because from an application we would have a different session with each shard (separated database servers) which means different transactions !
2- This might also be a problem in detecting deadlocks, how could we detect for example deadlocks in a concurrent application if the above transaction would get executed by 2 different thread in a different order !
I know that this can be easily done if we had only 1 database holding all the records as it have total control over the data, but in distributed databases, i believe we maybe do need some communication with different databases or maybe a central agent to handle such cases !
What you need is consensus algorithm, so that account 1 and account 2 both can agree that $100 transaction happened between them.
This is pretty vast topic with people having PHDs only studying consensus algorithm. List of famous consensus algorithms are:
I will start with 2 phase commit (simplest of all 3 above) if bar for extreme performance can be lowered a bit.