
Drawbacks of using redis cache for primary store and then syncing to a DB later?


I'm writing an application and experimenting with serverless architecture. One of my thoughts is that the functions should execute reasonably quickly, which also has the benefit of saving money. From this came an idea about how to persist data in order to keep function execution times short.

The function itself is simple: it takes 'cards' with text on them and persists them. In this application, these cards will likely be created frequently, or edited very frequently and in large quantities.

Given this assumption, I figured that rather than the function writing directly to a DB, it would make more sense to have the function write to a cache. Then, later on, some kind of sync application could take the data in the cache and persist it to the database. The synchronisation could run on a few dedicated machines that spin up to perform it at a fixed rate, rather than using potentially costly functions.
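To make the idea concrete, here is a minimal sketch of what the function body might look like. The names (`save_card`) and the in-memory `cache` dict and `pending` deque are assumptions standing in for Redis structures (in production these would be something like `HSET card:<id> ...` and `RPUSH pending <id>` via a Redis client):

```python
import time
from collections import deque

# In-memory stand-ins for Redis structures: a hash per card, plus a
# queue of ids that still need to be persisted to the database later.
cache = {}           # card_id -> card data
pending = deque()    # card ids awaiting the next sync

def save_card(card_id, text):
    """Serverless handler body: write the card to the cache and mark it
    for a later sync, instead of writing to the database directly."""
    cache[card_id] = {"text": text, "updated_at": time.time()}
    pending.append(card_id)
```

The handler returns as soon as the cache write completes, which is the property that keeps function execution times (and billed duration) short.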

This seems like a better option, but I'm wondering what the drawbacks of doing this are versus using a DB. Some of my thoughts so far are:

  1. consistency will be tricky, since the sync would have to happen frequently enough that items in the cache don't expire before they can be saved, or before a user's session ends.
  2. bundling into a sync means that operations can be performed in bulk rather than as many single transactions, which sounds like a good thing
  3. it seems easier to set up caching in general by having the cache be the main driver of the data, as opposed to bolting caching onto a DB-backed design and figuring out when and what to cache, etc.
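Point 2 above (bulk persistence) could be sketched as a sync worker that drains pending ids in batches. Everything here is an assumption for illustration: `sync_batch` is a hypothetical name, and the `cache`, `pending`, and `database` dicts stand in for Redis and the real DB (where the final `update` would be a bulk write such as a DynamoDB `BatchWriteItem` or a multi-row insert):

```python
from collections import deque

# Stand-ins: cache holds the cards, pending holds ids to persist,
# database represents the durable store the worker writes to.
cache = {"c1": {"text": "a"}, "c2": {"text": "b"}}
pending = deque(["c1", "c2"])
database = {}

def sync_batch(batch_size=100):
    """Drain up to batch_size pending ids from the cache and persist
    them to the database in one bulk operation. Returns the number
    of cards persisted."""
    batch = {}
    while pending and len(batch) < batch_size:
        card_id = pending.popleft()
        if card_id in cache:
            batch[card_id] = cache[card_id]
    if batch:
        database.update(batch)  # one bulk write instead of many single transactions
    return len(batch)
```

A real worker would run this in a loop on a timer; note that if it crashes between popping an id and writing the batch, that update is lost, which is the consistency risk point 1 describes.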

I'd also love to know what I can google to learn more about these concepts. At the moment these are just ideas, and I don't know the terminology to describe them and learn more. I'm sure this problem has been tackled many times before!


Solution

  • One of the main drawbacks in your scenario would be the complexity of the system - which means higher development (including this research phase) & maintenance costs. That is especially important in the early stages of a project. Typically, you would focus on the main business functionality and leave such optimizations for later, when you see the business value in investing in such relatively small optimizations (i.e. you need a LOT of traffic to earn back the initial costs + additional maintenance costs).

    If you see the business case in looking into this further, you should start with estimating the actual benefit. So you should check out some benchmarks - e.g. how fast could you get the writes on Redis vs. DynamoDB (optimized for this use case)?
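One way to estimate the benefit is a small timing harness that measures writes per second through any write function you plug in. This is a hypothetical sketch (`benchmark_writes` is not a real library function); the in-memory dict below stands in for the real store, and in practice you would pass a closure that calls your Redis or DynamoDB client instead:

```python
import time

def benchmark_writes(write_fn, n=10_000):
    """Time n sequential writes through write_fn and return writes/sec."""
    start = time.perf_counter()
    for i in range(n):
        write_fn(f"card:{i}", {"text": f"card {i}"})
    elapsed = time.perf_counter() - start
    return n / elapsed

# Example with an in-memory dict standing in for the real store:
store = {}
rate = benchmark_writes(lambda key, value: store.__setitem__(key, value))
```

For a meaningful comparison you would run this against each candidate backend from the same region your functions run in, since network latency usually dominates the per-write cost.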

    Next, if you expect to constantly have a TON of traffic - does the math say serverless is actually better for you? Maybe it's more cost effective to just have auto-scaled instances up and running with a local in-memory cache, which gets dumped to the DB in chunks (instead of using Redis for the same idea)? Serverless is not always as cost-efficient as the nice adverts show.

    As you noted, consistency is a very important aspect to think about. Do you really need it? What happens if, e.g., two lambdas want to update the same 'card'? Ideally, you should not care which update is handled first. If they could happen in any order (without an error) - that would give you the most options for getting better performance & lower costs.
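    One common way to make updates order-independent is a last-write-wins merge keyed on a timestamp, sketched below under assumed names (`apply_update` is hypothetical; real systems also need a tie-breaker such as a writer id for equal timestamps, and clock skew between writers can still reorder updates):

    ```python
    def apply_update(current, update):
        """Last-write-wins merge: keep whichever version of the card
        carries the later timestamp, so two concurrent updates converge
        on the same result regardless of arrival order."""
        if current is None or update["ts"] >= current["ts"]:
            return update
        return current

    a = {"text": "first edit", "ts": 1}
    b = {"text": "second edit", "ts": 2}
    # Either arrival order yields the same final card (b, the later write):
    merged = apply_update(apply_update(None, a), b)
    ```

    With a merge like this, the sync worker and the lambdas never need to coordinate on ordering, which is exactly what opens up the cheaper, more parallel designs mentioned above.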