String sorting in Cassandra CQL

When querying the text primary key in Cassandra CQL, the string comparison works the opposite way of what one expect, i.e.

cqlsh:test> select * from sl;

 name                     | data
--------------------------+------
 000000020000000000000003 | null
 000000010000000000000005 | null
 000000010000000000000003 | null
 000000010000000000000002 | null
 000000010000000000000001 | null

cqlsh:test> select name from sl where token(name) < token('000000010000000000000005');
name
--------------------------
 000000020000000000000003

(1 rows)

cqlsh:test> select name from sl where token(name) > token('000000010000000000000005');
 name
--------------------------
 000000010000000000000003
 000000010000000000000002
 000000010000000000000001

(3 rows)

In constrast, this is what I get from the string comparison in Python (and I think in most other languages):

>>>'000000020000000000000003' < '000000010000000000000005'
False

If I query without the token function, I get the following error:

cqlsh:test> select name from sl where name < '000000010000000000000005';
Bad Request: Only EQ and IN relation are supported on the partition key (unless you use the token() function)

The table description is:

CREATE TABLE sl (
  name text,
  data blob,
  PRIMARY KEY (name)
) WITH
  bloom_filter_fp_chance=0.010000 AND
  caching='KEYS_ONLY' AND
  comment='' AND
  dclocal_read_repair_chance=0.000000 AND
  gc_grace_seconds=864000 AND
  index_interval=128 AND
  read_repair_chance=0.100000 AND
  replicate_on_write='true' AND
  populate_io_cache_on_flush='false' AND
  default_time_to_live=0 AND
  speculative_retry='99.0PERCENTILE' AND
  memtable_flush_period_in_ms=0 AND
  compaction={'class': 'SizeTieredCompactionStrategy'} AND
  compression={'sstable_compression': 'LZ4Compressor'};

Is there an explanation in the docs I've missed or somewhere else on why such a strange string comparison order is chosen, or does the string comparison operator do not what I expect of it (i.e. returning me some unrelated order, i.e. the order of the rows as they are written to the database). I'm using the Murmur3Partitioner partitioner in case it matters.

Solution

In Cassandra rows are ordered by the hash of their key value. With the Random and Murmur3 partitioners there is a random element to the hash value, and thus the order is A) not meaningful, and B) designed to be evenly spread across the ring.

Therefore querying for tokens less than token('000000010000000000000005') won't do a comparison based-on the string value of "000000010000000000000005". It will do a comparison on the hashed token value. By virtue of the results that you are seeing, the token value of the string "000000020000000000000003" is less than the token value of "000000010000000000000005".

For more information, check through this doc from DataStax: Paging Through Unordered Partitioner Results.

Assuming that you want to be able to query your data by the value of "name", you could build a table kind of like this:

CREATE TABLE sl (
  type text,
  name text,
  data blob,
  PRIMARY KEY (type, name)
)

I've created type as a partitioning key. I'm not sure if your data makes sense to be divided by "type" (or anything else for that matter), so it's more for the sake of example than anything else. Anyway, with name as a clustering key (determining the on-disk sort order) this query would work:

select * from sl where type='sometype' AND name < '000000010000000000000005';

Again its just an example, but I hope that helps to point you in the right direction.