Search code examples
cassandrasql-order-byclustering-key

Cassandra clustering key order


So I have a table that looks along the lines of:

CREATE TABLE schema.table (
  partitionkey int,
  clusterkey text,
  value bigint,
  PRIMARY KEY ((partitionkey), clusterkey)
);

We rely on this table to sort correctly by the clusterkey for pagination.
Problem is: when returning results back from cassandra, it looks like they are sorted based on their ASCII value rather than a logical A-Z sorting. - which makes sense programatically but not logically for a person viewing it.

So a query of:
SELECT clusterkey FROM schema.table WHERE partitionkey = 1 ORDER BY clusterkey ASC;
get the results on the left, when I expect something like the one on the right

---------(current)---                ----------(wanted)---
          clusterkey                           clusterkey 
---------------------                ---------------------
       Other capital                      a should be top 
     Starts capital1                        Other capital 
     Starts capital2                      Starts capital1 
     Starts capital3                      starts capital1 
     Starts capital4                      Starts capital2 
        YYYYYYYYYYYY                      Starts capital3
        ZZZZZZZZZZZZ                      Starts capital4 
     a should be top                         YYYYYYYYYYYY 
     starts capital1                         ZZZZZZZZZZZZ 
        zzzzzzzzzzzz                         zzzzzzzzzzzz

I understand that we could change the data to be all lowercase / UPPERCASE to sort properly but this would change the look of the data. - Which is definitely not wanted.

Is there an option to change the method of the current clustering order?
- or another way to sort logically?


Solution

  • Clustering columns are sorted in the lexicographic order which means uppercase characters have precedence over lowercase.

    Case-insensitive ordering is not supported by Cassandra in current releases but there is an enhancement requested in Cassandra community Provide a locale/collation-aware text comparator so far it didn't get much attention from the developer community.