Search code examples
cassandracqlcql3cqlsh

Updating Table In Cassandra


I have a following table structure in Cassandra :

CREATE TABLE ssession (
  sessionid text PRIMARY KEY,
  session_start_time timestamp,
  updated_time timestamp
);

session_start_time is time when a particular session becomes active and update_time is time till the user will be doing some activity.Here, sessionid and session_start_time will be inserted once and the updated_time keeps updating as the user is active.

I want to include only sessionid as the primary key.

Normal update statement will be :

UPDATE ssession SET session_start_time  = '2015-07-31 10:43:13+0530',
updated_time = '2015-07-31 10:43:13+0530' WHERE sessionid = '22_865624098';

Here, first time I'll insert same session_start_time and updated_time .But from next time I'll have to only update the updated_time.

And I need a single query to do so.Since, I'll be getting data continuously (using storm to process the data).

Is there any way to achieve this ?


Solution

  • When you INSERT or UPDATE data (updates and inserts are the same in cassandra) you do not need to provide all columns. If you just want to update the updated_time, your query should be:

    UPDATE ssession SET updated_time = '2015-07-31 10:43:13+0530' WHERE sessionid = '22_865624098';
    

    But it sounds like you want to make sure that session_start_time is set the first time that sessionid is created and only the first time, correct?

    What you could make use of lightweight transactions and if not exists to create the data with session_start_time. If there is already a row with that session_id, the insert will not be applied:

    INSERT INTO ssession (sessionid, session_start_time, updated_time) values ('22_865624098', '2015-07-31 10:43:13+0530', '2015-07-31 10:43:13+0530') if not exists;
    

    Cassandra returns a column [applied] in this case with a value of true or false if the insert was applied. If false is returned, you can then simply run an update query that only updates updated_time:

    UPDATE ssession set updated_time = '2015-07-31 10:43:14+0531' where sessionid = '22_865624098';
    

    Note that lightweight transactions introduce some performance cost which is detailed in the article I linked above. It uses 'SERIAL' consistency level which is a multi-phase QUORUM. This is also a 'read-then-write' pattern, which is not going to be as fast as blindly writing the data. You should test the performance of this solution and see if it is adequate for you.