Tags: java, redis, spring-data, queue, lettuce

Redis list move (BLMOVE) recovery/rollback strategy


I have the following scenario:

  1. Redis Sentinel with one primary Redis instance and two replicas
  2. Redis connection timeout configured to 30 seconds
  3. Lettuce as the Java Redis client
  4. Spring Data on top of the Lettuce client

I have a Redis list which I am using as a queue. As expected, each element on the queue must be processed by one and only one element processor. However, in some rare corner cases, I experience this:

-> BLMOVE A B 8
     (*) I get a timeout here after 30 seconds

(that is, move an element from list A to list B, blocking for up to 8 seconds)

The problem is that the element actually moves from list A to list B, yet I still get the timeout. After this exception the element is stranded in list B and never gets processed. I understand that this might be a network issue with the Redis connection.

My question is: What's the best way to handle cases like this? Is there a way to "recover" the unprocessed element and try again?


Solution

  • There are a few ways to solve this issue:

    1. Use Redis Sentinel with automatic failover. If the primary Redis instance goes down, Sentinel promotes a replica, so the lists stay available. Because Redis replication is asynchronous, writes made just before the failover can still be lost.

    2. Use Redis Cluster. It provides high availability through replicas and automatic failover per shard. Note that BLMOVE needs both lists to live in the same hash slot.

    3. Use a Redis replication setup. This gives you a hot standby instance that can take over if the primary goes down, either automatically (with Sentinel or Cluster) or by manual promotion.

    4. Use Redis persistence (RDB snapshots or AOF). This lets the lists survive a restart of the Redis process, up to the last snapshot or the last synced AOF write.

    5. Use regular Redis backups (for example, copies of the RDB file). These let you restore the data as of the last backup after a failure.
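
None of the options above moves a stranded element back by itself. In the scenario from the question the element is not lost, it is still sitting in list B, so one possible approach is to treat B as a "processing" list: remove an element from it only after it was processed successfully (LREM), and after a timeout exception or on worker startup move whatever is left in it back to A (LMOVE). The sketch below is only an illustration under several assumptions: `ListOps`, `InMemoryListOps` and `ProcessingListRecovery` are hypothetical names, not Lettuce or Spring Data types; the in-memory class stands in for a Redis connection so the example runs on its own; the key `B:worker-1` assumes each worker owns its own processing list, so that requeueing never touches elements another worker is still handling; and the LEFT/RIGHT directions are assumed, since the question does not state them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical minimal view of the two list commands the recovery needs. */
interface ListOps {
    /** Like LMOVE source destination LEFT RIGHT; returns null when source is empty. */
    String move(String source, String destination);

    /** Like LREM key 1 value; returns the number of removed elements. */
    long remove(String key, String value);
}

/** In-memory stand-in for Redis so the sketch runs without a server. */
class InMemoryListOps implements ListOps {
    private final Map<String, Deque<String>> lists = new HashMap<>();

    private Deque<String> list(String key) {
        return lists.computeIfAbsent(key, k -> new ArrayDeque<>());
    }

    /** Like LPUSH key value. */
    void leftPush(String key, String value) {
        list(key).addFirst(value);
    }

    /** Like LRANGE key 0 -1 (head first). */
    List<String> range(String key) {
        return new ArrayList<>(list(key));
    }

    @Override
    public String move(String source, String destination) {
        String value = list(source).pollFirst();   // pop from the head of source
        if (value != null) {
            list(destination).addLast(value);      // push to the tail of destination
        }
        return value;
    }

    @Override
    public long remove(String key, String value) {
        return list(key).remove(value) ? 1 : 0;
    }
}

class ProcessingListRecovery {
    /** Acknowledge a successfully processed element by deleting it from the processing list. */
    static boolean acknowledge(ListOps ops, String processing, String element) {
        return ops.remove(processing, element) > 0;
    }

    /**
     * Move every element still in the processing list back to the queue.
     * Intended to run after a timeout exception or when the worker starts.
     */
    static int requeueOrphans(ListOps ops, String processing, String queue) {
        int moved = 0;
        while (ops.move(processing, queue) != null) {
            moved++;
        }
        return moved;
    }
}
```

Because the timeout can fire after the move already happened, an element may be handed out again after requeueing even though an earlier attempt partly ran, so the element processor should be idempotent. With a real connection, `move` and `remove` would map to the LMOVE and LREM commands of whichever client API is in use.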