Search code examples
databasecassandracqlscylla

Cassandra: How to use 'ORDER BY' without PRIMARY KEY restricted by EQ or IN?


I have a table in Scylla (a Cassandra compatible database) defined as the following:

create table s.items (time timeuuid, name text, primary key (time));

I want to run a query that gets all items after a certain time, similar to the following:

select * from s.items where time>7e204790-43bf-11e9-9759-000000000004 order by time asc;

But I am told that ORDER BY is only supported when the partition key is restricted by an EQ or an IN. To get around this I can make a table and query similar to the following:

create table s.items (yes boolean, time timeuuid, name text, primary key (yes, time));

select * from s.items where yes=true and time>7e204790-43bf-11e9-9759-000000000004 order by time asc;

While this works, it doesn't seem like the best solution. As I'm fairly new to Scylla and CQL, is there a better/proper way to do this?


Solution

  • Your solution of adding that one boolean key and always setting it to yes basically creates one huge partition with all your data. This is rarely what you really want. If this one partition is your entire data, it means that even if you have a 10-node cluster with 8 CPUs on each node, just 3 CPUs out of all 80 in your cluster will be doing any work (because each partition belongs to a certain CPU, and with RF=3 there are three replicas).

    If you're wondering why your original solution didn't work and Scylla refused the "ORDER BY", well, the problem is that although Scylla can scan the entire table to look for entries after time X (you'll need to add 'ALLOW FILTERING' to the query), it has no efficient way to sort what it finds by time. Internally, the different partitions are not sorted by the partition key, but rather by a "token", a hash function of the the partition key. This hashing with its randomizing effect is important to balance the load between all CPUs on the cluster, but prevents Scylla (or Cassandra) from reading the partitions in the original key order.

    One thing you can do is to do what Alex suggested above, which is a middle-ground between your original setup and your proposed solution: Don't have one item per partition, or all the items in a single partition, but something in the middle: For example, imagine that in your workload, every day you collect 100MB of data. So you use the day number as the partition key (instead of your bool). All the data of one particular day will sit in one partition. Inside each day's partition, the different entries (rows) will be sorted by the clustering-key order, which will be time. With this setup, to retrieve all the items after some specific day, just start querying each individual day, one by one. E.g., query day 134, then day 135, they 136, then, etc... Inside each day, the results will be already sorted. So problem solved.

    This technique is a fairly well-known "time series" data modeling. Scylla (and Cassandra) even have a special compaction strategy tuned for this modeling, TWCS (time-window compaction strategy).