Search code examples
cassandracassandra-2.0cql3

How to delete a record in Cassandra?


I have a table like this:

CREATE TABLE mytable (
    user_id int,
    device_id ascii,
    record_time timestamp,
    timestamp timeuuid,
    info_1 text,
    info_2 int, 
    PRIMARY KEY (user_id, device_id, record_time, timestamp)
);

When I ask Cassandra to delete a record (an entry in the columnfamily) like this:

DELETE from my_table where user_id = X and device_id = Y and record_time = Z and timestamp = XX;

it returns without an error, but when I query again the record is still there. Now if I try to delete a whole row like this:

DELETE from my_table where user_id = X

It works and removes the whole row, and querying again immediately doesn't return any more data from that row.

What I am doing wrong? How you can remove a record in Cassandra?

Thanks


Solution

  • Ok, here is my theory as to what is going on. You have to be careful with timestamps, because they will store data down to the millisecond. But, they will only display data to the second. Take this sample table for example:

    aploetz@cqlsh:stackoverflow> SELECT id, datetime  FROM data;
    
     id     | datetime
    --------+--------------------------
     B25881 | 2015-02-16 12:00:03-0600
     B26354 | 2015-02-16 12:00:03-0600
    
    (2 rows)
    

    The datetimes (of type timestamp) are equal, right? Nope:

    aploetz@cqlsh:stackoverflow> SELECT id, blobAsBigint(timestampAsBlob(datetime)),
                                      datetime FROM data;
    
     id     | blobAsBigint(timestampAsBlob(datetime)) | datetime
    --------+-----------------------------------------+--------------------------
     B25881 |                           1424109603000 | 2015-02-16 12:00:03-0600
     B26354 |                           1424109603234 | 2015-02-16 12:00:03-0600
    
    (2 rows)
    

    As you are finding out, this becomes problematic when you use timestamps as part of your PRIMARY KEY. It is possible that your timestamp is storing more precision than it is showing you. And thus, you will need to provide that hidden precision if you will be successful in deleting that single row.

    Anyway, you have a couple of options here. One, find a way to ensure that you are not entering more precision than necessary into your record_time. Or, you could define record_time as a timeuuid.

    Again, it's a theory. I could be totally wrong, but I have seen people do this a few times. Usually it happens when they insert timestamp data using dateof(now()) like this:

    INSERT INTO table (key, time, data) VALUES (1,dateof(now()),'blah blah');