Tags: database · go · concurrency · key-value-store

How are keys locked in KV Stores?


I am building a distributed KV store to learn a bit more about distributed systems and concurrency. The store is fully transactional, with an in-memory transaction log; storage is also entirely in-memory for simplicity. The API exposes GET, INSERT, UPDATE, and REMOVE. Note that all endpoints operate on a single key, not a range of keys.

I am managing concurrency via locks, but with a single global lock over the entire datastore. This sounds terribly inefficient: if I want to read the value for K1 while K2 is being updated, the read must wait for K2's update to finish even though the two keys are unrelated.

I know that some databases use more granular locking; for example, MySQL's InnoDB engine uses row-level locks. How can key-level locks be implemented?

I have

type Storage struct {
  store map[string]int32
}

Should I add something like this?

type Storage struct {
  store map[string]int32
  locks map[string]*sync.Mutex
}

The issue with this is that locks has to be kept in sync with store. Another option would be to combine the two maps into one, but even then I run into the problem of deleting a map entry while its lock is still held, for example if a REMOVE request arrives before a GET has finished.


Solution

  • Conceptual part

    Transactions

    First, transaction logs are not needed for strong consistency; they are useful for upholding ACID properties, in particular durability (a log can be replayed to recover state after a crash).

    Transactions themselves are also not strictly required for strong consistency, but they are a useful tool for ensuring consistency in many situations.

    Strong consistency refers to the property that ensures that all reads of a database will return the most recent write, regardless of where the read operation is performed. In other words, strong consistency guarantees that all clients will see the same data, and that the data will be up-to-date and consistent across the entire system.

    You can use a consensus algorithm, such as Paxos or Raft, to ensure strong consistency. When storing data, you can attach a version to each value and use that version as the proposal ID in Paxos.
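    As a hedged sketch of the versioning idea (the versioned and store names here are illustrative, and this is not a consensus implementation): each accepted write bumps a per-key version, and a write is applied only if the writer saw the latest version, mimicking the conditional check a replica would perform after consensus.

```go
package main

import "fmt"

// versioned is a hypothetical record: each accepted write bumps the
// version, and the (key, version) pair could serve as the proposal ID
// handed to a consensus layer such as Paxos.
type versioned struct {
	version uint64
	value   int32
}

type store struct {
	data map[string]versioned
}

// put succeeds only if the caller saw the latest version of the key,
// mimicking the conditional-write check performed after consensus.
func (s *store) put(key string, expected uint64, value int32) bool {
	cur := s.data[key] // absent key => zero value with version 0
	if cur.version != expected {
		return false // stale: someone else wrote in the meantime
	}
	s.data[key] = versioned{version: expected + 1, value: value}
	return true
}

func main() {
	s := &store{data: make(map[string]versioned)}
	fmt.Println(s.put("k1", 0, 7)) // true: first write against version 0
	fmt.Println(s.put("k1", 0, 9)) // false: version is now 1, write is stale
}
```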

    Locking in KV Stores

    In a key-value (KV) store, keys are typically locked using some kind of locking mechanism, such as a mutex or a reader-writer lock (as suggested by @paulsm4). This allows multiple threads or processes to access and modify the data in the KV store concurrently, while still ensuring that the data remains consistent and correct.

    For example, when a thread or process wants to read or modify a particular key in the KV store, it can acquire a lock for that key. This prevents other threads or processes from concurrently modifying the same key, which can lead to race conditions and other problems. Once the thread or process has finished reading or modifying the key, it can release the lock, allowing other threads or processes to access the key.

    The specific details of how keys are locked in a KV store vary by implementation. Some KV stores use a global lock that locks the entire data store (as you are doing now, which can be inefficient), while others use more granular mechanisms, such as row-level or key-level locks, to allow more concurrent access to the data.

    So tl;dr: conceptually, you were right. The devil is in the details of the locking implementation.

    Coding

    To strictly answer the question about locking, consider a readers-writer lock, as suggested by @paulsm4. In Go, the corresponding primitive is sync.RWMutex. Alternatively, the standard library's sync.Map provides a concurrent map whose synchronization is handled internally.

    Here is a short example:

    import (
      "fmt"
      "sync"
    )
    
    type Storage struct {
      store sync.Map // a concurrent map, safe for use by multiple goroutines
    }
    
    // GET retrieves the value for the given key.
    func (s *Storage) GET(key string) (int32, error) {
      // sync.Map synchronizes the read internally; no explicit lock is needed.
      v, ok := s.store.Load(key)
      if !ok {
        return 0, fmt.Errorf("key not found: %s", key)
      }
      return v.(int32), nil
    }
    
    // INSERT inserts the given key-value pair into the data store.
    func (s *Storage) INSERT(key string, value int32) error {
      // Store synchronizes the write internally.
      s.store.Store(key, value)
      return nil
    }
    
    // UPDATE updates the value for the given key.
    func (s *Storage) UPDATE(key string, value int32) error {
      s.store.Store(key, value)
      return nil
    }
    
    // REMOVE removes the key-value pair for the given key from the data store.
    func (s *Storage) REMOVE(key string) error {
      s.store.Delete(key)
      return nil
    }
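    To see the sync.Map-backed Storage in use (this snippet repeats the type so it compiles on its own), writers touching different keys proceed without blocking one another:

```go
package main

import (
	"fmt"
	"sync"
)

type Storage struct {
	store sync.Map // concurrent map, safe for use by multiple goroutines
}

func (s *Storage) INSERT(key string, value int32) error {
	s.store.Store(key, value)
	return nil
}

func (s *Storage) GET(key string) (int32, error) {
	v, ok := s.store.Load(key)
	if !ok {
		return 0, fmt.Errorf("key not found: %s", key)
	}
	return v.(int32), nil
}

func (s *Storage) REMOVE(key string) error {
	s.store.Delete(key)
	return nil
}

func main() {
	s := &Storage{}
	var wg sync.WaitGroup
	// Concurrent writers on distinct keys; no global lock involved.
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(n int32) {
			defer wg.Done()
			s.INSERT(fmt.Sprintf("k%d", n), n)
		}(int32(i))
	}
	wg.Wait()
	v, err := s.GET("k2")
	fmt.Println(v, err) // 2 <nil>
}
```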
    

    You would need Paxos on top of this to ensure consistency across replicas.