cassandra, cql, cassandra-2.0, cql3

CQL query in Cassandra with composite partition key


My main problem is paginating a Cassandra result set on a table with a composite partition key. However, I am trying to narrow it down to a simple scenario. Say I have this table:

CREATE TABLE numberofrequests (
  cluster text,
  date text,
  time text,
  numberofrequests int,
  PRIMARY KEY ((cluster, date), time)
) WITH CLUSTERING ORDER BY (time ASC);

And I have data like this:


 cluster | date       | time | numberofrequests
---------+------------+------+------------------
 c2      | 01/04/2015 | t1   |                1
 c2      | d1         | t1   |                1
 c2      | 02/04/2015 | t1   |                1
 c1      | d1         | t1   |                1
 c1      | d1         | t2   |                2

Question: Is there any way I can query the data for cluster=c2? I don't care about the 'date' column; honestly, I keep it only for partitioning purposes, to avoid hot-spots. I tried the following:


 select * from numberofrequests where token(cluster,date)>=token('c2','00/00/0000');

 select * from numberofrequests where token(cluster,date)>=token('c2','1');

 select * from numberofrequests where token(cluster,date)>=token('c2','a');

 select * from numberofrequests where token(cluster,date)>=token('c2','');

My schema uses the default partitioner (Murmur3Partitioner). Is this achievable at all?


Solution

  • Cassandra needs the complete partition key to locate a row. A query based on only part of the partition key will not work, because the Murmur3 hash of a partial key does not match the hash the partitioner computed from the complete key when the row was written. The token() comparisons in your examples fail for a related reason: tokens are ordered by hash value, not by key value, so token('c2','') is not a lower bound for the c2 partitions. What you could do instead is use the ByteOrderedPartitioner, which keeps the byte order of the partition key instead of hashing it and would let you use the token() function as in your examples. In most cases that is a bad idea, though: data will not be distributed evenly across the cluster, and you will end up with the hotspots you tried to avoid in the first place.
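
    To illustrate why a partial key cannot be used, here is a minimal Python sketch. It is only a toy model: MD5 stands in for Murmur3 (Python's standard library has no Murmur3), the key serialization is simplified, and toy_token is a hypothetical helper, not a Cassandra or driver API. The point it shows is that the token is computed from all partition key components together, so the token of 'c2' alone has nothing to do with the tokens of the ('c2', date) partitions.

```python
import hashlib


def toy_token(*components: str) -> int:
    """Toy stand-in for a hashing partitioner.

    Assumptions: MD5 replaces Murmur3, and each component is serialized
    as a 2-byte length, the UTF-8 bytes, and a terminating zero byte.
    Real Cassandra tokens will differ; only the principle carries over.
    """
    buf = b""
    for component in components:
        raw = component.encode("utf-8")
        buf += len(raw).to_bytes(2, "big") + raw + b"\x00"
    digest = hashlib.md5(buf).digest()
    # Keep 8 bytes as a signed 64-bit integer, like a Murmur3 token range.
    return int.from_bytes(digest[:8], "big", signed=True)


# Partition keys taken from the sample data in the question.
partition_keys = [
    ("c2", "01/04/2015"),
    ("c2", "d1"),
    ("c2", "02/04/2015"),
    ("c1", "d1"),
]
tokens = {key: toy_token(*key) for key in partition_keys}

# Hashing only the cluster yields a token unrelated to any stored partition.
partial_token = toy_token("c2")

# Listing partitions in token order shows that hash order is not key order.
for key in sorted(partition_keys, key=lambda k: tokens[k]):
    print(f"{tokens[key]:>22}  {key}")
print(f"{partial_token:>22}  ('c2',) partial key, matches nothing")
```

    Running this prints the partitions in token order. With a hashing partitioner there is no reason for the c2 partitions to sit next to each other in that order, which is why a token() range cannot select "everything for c2".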