
Retrieve rows by date in Cassandra


My data has one timestamp column whose values follow this pattern:

YEAR-MM-DD hour:minute:second

I want to retrieve the rows where the date in this timestamp column is later than a given date, e.g. > 2015-07-01.

I don't want to compare the time, just the date. I tried:

SELECT * FROM table_name where timestamp >= '2015-10-01' limit 10;

But it does not give the desired output.


Solution

  • One problem here is that Cassandra requires an equality (EQ) condition on the partition key (IN also works, but I don't recommend it) before it allows any other type of condition (<, >, etc.). Other solutions, like using ALLOW FILTERING or an index, may "work," but they will not perform at scale, and they will not return the rows in any particular order.

    Cassandra can only enforce a sort order within a partition. This means that you can select rows by a range of date/time values in their first clustering column, as long as you also specify a single value for the partition key.

    Example:

    CREATE TABLE stackoverflow.timestamptest (
        userid text,
        activetime timestamp,
        value text,
        PRIMARY KEY (userid, activetime)
    ) WITH CLUSTERING ORDER BY (activetime ASC);
    
    aploetz@cqlsh:stackoverflow> SELECT * FROM timestamptest
                                 WHERE userid='a' AND activetime > '2015-09-25';
     userid | activetime               | value
    --------+--------------------------+---------
          a | 2015-09-25 06:33:33-0500 |  value1
          a | 2015-09-25 06:34:33-0500 |  value2
          a | 2015-10-22 09:26:00-0500 |  value3
    
    (3 rows)
    

    Basically, you need to further partition your table in a way that makes sense for your application and query requirements.

    This tends to be a confusing issue for many Cassandra developers, so I have included a link to an article that I wrote for DataStax earlier this year: We Shall Have Order!
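
    As a rough sketch of what "further partitioning" could look like when the queries are by calendar date, one option is to make the date part of the partition key. The table name activity_by_day and the day column are hypothetical and not part of the original schema, and the date type assumes Cassandra 2.2 or later. Whether one partition per user per day is the right bucket size depends on your data volume.

    ```cql
    -- Hypothetical table: one partition per user per calendar day.
    -- "day" holds only the date; "activetime" keeps the full timestamp
    -- and orders the rows inside each partition.
    CREATE TABLE stackoverflow.activity_by_day (
        userid text,
        day date,
        activetime timestamp,
        value text,
        PRIMARY KEY ((userid, day), activetime)
    ) WITH CLUSTERING ORDER BY (activetime ASC);

    -- Equality on every partition key column, so neither ALLOW FILTERING
    -- nor an index is needed. Rows come back ordered by activetime.
    SELECT * FROM stackoverflow.activity_by_day
    WHERE userid = 'a' AND day = '2015-10-01';
    ```

    With a layout like this, the application would write the day value alongside each row and, to cover a span of dates, issue one query per day in the range.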