I have a Cassandra table like :-
create table test(imei text,dt_time timestamp, primary key(imei, dt_time)) WITH CLUSTERING ORDER BY (dt_time DESC);
Partition Key is: imei
Clustering Key is: dt_time
Now I want to store only most recent entry in this table(on the time basis) for each partition key. Let's say if I am inserting entry in a table where there will be single entry for each imei
Now let's say for an imei 98838377272 dt_time is 2017-12-23 16.20.12 Now for same imei if dt_time comes like 2017-12-23 15.20.00 Then this entry should not be inserted in that Cassandra table.
But if time comes like 2017-12-23 17.20.00 then it should get insert and previous row should get replaced with this dt_time.
You can use TIMESTAMP clause in your insert statement to mark data as most recent:
Marks inserted data (write time) with TIMESTAMP. Enter the time since epoch (January 1, 1970) in microseconds. By default, Cassandra uses the actual time of write.
Remove dt_time
from primary key to store only one entry for a imei
and
In this case, select by imei
will return record with the most recent timestamp (from point 1).
Please note, this approach will work if your dt_time
(which will be specified as timestamp) is less than the current time. In other words, select query will return records with most recent timestamp but before the current time. If you insert data with timestamp greater then the current time you will not see this data until this timestamp comes.