Search code examples
scylla

How to trim a list/set in ScyllaDB to a specific size?


Is there a way to trim a list/set to a specific size (in terms of number of elements)?

Something similar to LTRIM command on Redis (https://redis.io/commands/ltrim).

The goal is to insert an element to a list/set but ensuring that its final size is always <= X (discarding old entries).

Example of what I would like to be able to do:


CREATE TABLE images (
    name text PRIMARY KEY,
    owner text,
    tags set<text> // A set of text values
);

-- single command
UPDATE images SET tags = ltrim(tags + { 'gray', 'cuddly' }, 10) WHERE name = 'cat.jpg';

-- two commands (Redis style)
UPDATE images SET tags = tags + { 'gray', 'cuddly' } WHERE name = 'cat.jpg';
UPDATE images SET tags = ltrim(tags, 10) WHERE name = 'cat.jpg';


Solution

  • No, there is no such operation in Scylla (or in Cassandra).

    The first reason is efficiency: As you may be aware, one reason why writes in Scylla are so efficient is that they do not do a read: Appending an element to a list just writes this single item to a sequential file (a so-called "sstable"). It does not need to read the existing list and check what elements it already has. The operation you propose would have needed to read the existing item before writing, slowing it down significantly.

    The second reason is consistency: What happens if multiple operations like you propose are done in parallel, reaching different coordinators and replicas in different order? What happens if after earlier problems, one of the replicas is missing one of the values? There is no magic way to solve these problems, and the general solution that Scylla offers for concurrent Read-Modify-Write operations is LWT (Lightweight Transacations). You can emulate your ltrim operation using LWT but it will be significantly slower than ordinary writes. You will need to read the list to the client, modify it (append, ltrim, etc.) and then write it back with an LWT (with the extra condition that it still has its old value, or using an additional "version number" column).