Tags: architecture, scalability, system-design

System design: Strategies for dealing with heavy writes to a DB


From a systems design/scalability perspective, what are some industry-standard strategies for dealing with a system that requires heavy writes to a particular table in a DB?

For simplicity's sake, let's say the table is an inventory table for products with a 'Product Name' column and a 'Count' column, and the count simply increments by +1 each time a product is bought. There are millions of users buying different products every second, and we have to keep track of the latest count of each product, but it does not have to be strictly real-time; a 5-minute lag is acceptable.

My options are:

  1. Primary-replica replication, where the primary DB handles all writes and replicas handle reads. But this doesn't address the write-heavy problem.

  2. Sharding the DB based on product name ranges or hashed values. But what if a specific product (e.g. Apple) receives a large number of updates in a short time? It'll still hit the same shard.

  3. Batched updates? Use some kind of caching layer and write to the table every X seconds with the cumulative counts of whatever we've received in those X seconds? Is that a valid option, and what caching mechanism should I use? And what if there's a crash between the last read and the next write? How do I recover the lost count?

  4. Any other obvious choices I forgot about?

Any insight is appreciated!


Solution

  • I’d say the solution depends heavily on what exactly you need to do. A solution for writing thousands of distinct records per second can be very different from incrementing a counter as in the example you provided; there might even be no tables at all to handle such load. Consistency/availability requirements are also missing from your question, and depending on them the entire architecture may look very different.

    Anyway, back to your specific, simplified case and your options:

    Option 1 (Primary-replica replication)

    The problem you’ll face here is database locking: every increment requires a record lock to avoid race conditions, so the processes writing to your DB quickly end up waiting in a queue and the system goes down, even under moderate load.
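
    For illustration, here is a minimal sketch of that naive per-purchase write, assuming a psycopg2-style DB-API connection and the question's inventory table with 'product_name' and 'count' columns (the exact names are placeholders):

    # Naive per-purchase update: every call takes a row-level lock on that
    # product's row until the transaction commits, so concurrent buyers of
    # the same product serialize behind each other.
    def record_purchase(conn, product_name):
        with conn.cursor() as cur:
            cur.execute(
                "UPDATE inventory SET count = count + 1 WHERE product_name = %s",
                (product_name,),
            )
        conn.commit()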

    Option 2 (Sharding the DB)

    Your assumption is correct; a hot product makes this not much different from option 1.

    Option 3 (Batched updates)

    Very close: a caching layer provided by a lightweight store that offers concurrent atomic increments/decrements, with persistence so you don't lose your data. We’ve used Redis for a similar purpose, although any other key-value database would do as well; there are literally dozens of such databases around.

    A key-value database, or key-value store, is a data storage paradigm designed for storing, retrieving, and managing associative arrays, a data structure more commonly known today as a dictionary or hash table.

    The solution would look as follows:

    incoming requests → your backend server → kv_storage (atomic increment(product_id))
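
    A minimal sketch of that hot path with the redis-py client (the "inventory:" key prefix is just an assumption for illustration):

    import redis

    r = redis.Redis(host="localhost", port=6379)

    def on_purchase(product_id):
        # INCRBY is atomic on the Redis server, so no application-side
        # locking is needed even with many concurrent buyers of one product.
        r.incrby(f"inventory:{product_id}", 1)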
    

    And you'll have a "flushing" script running on a schedule, e.g. every 5 minutes via cron (*/5 * * * *), that does the following (simplified; a sketch in code follows the list):

    1. for every product_id in kv_storage read its current value
    2. update your db counter (+= value)
    3. decrement the value in kv_storage
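
    A simplified version of that flushing script, again with redis-py and a DB-API connection (key prefix and table/column names are the same assumptions as above). Decrementing by the amount that was read, rather than deleting the key, means increments that arrive while the DB update is in flight survive until the next run:

    import redis

    r = redis.Redis(host="localhost", port=6379)

    def flush_counters(conn):
        for key in r.scan_iter(match="inventory:*"):
            value = int(r.get(key) or 0)
            if value == 0:
                continue
            product_id = key.decode().split(":", 1)[1]
            with conn.cursor() as cur:
                cur.execute(
                    "UPDATE inventory SET count = count + %s WHERE product_name = %s",
                    (value, product_id),
                )
            conn.commit()
            # Subtract only what was just persisted; increments that raced in
            # after the GET above are kept for the next run.
            r.decrby(key, value)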

    Further scaling

    • if the script fails, nothing bad happens: the pending updates simply arrive on the next run
    • if your backend boxes can't handle the load, you can easily add more boxes
    • if a single key-value DB can't handle the load, most of them support scaling over multiple boxes, or a simple sharding strategy in your backend scripts would work fine (see the sketch after this list)
    • if a single "flushing" script doesn't keep up with increments - you can scale them to multiple boxes and decide what key ranges are handled by each one