Tags: select, cassandra, cql, database-partitioning, delta

Cassandra delta/relative querying


I have to orchestrate a batch job that copies the delta of a table each day. Rows in this table are only ever inserted, never updated. I use Java with JDBC, and I wonder whether there is some metadata on the table that can be queried to get all the rows added after a certain date.


Why metadata? Because my table looks like this:

CREATE TABLE aTable (
  aTable_id timeuuid,
  ...
  PRIMARY KEY ((aTable_id))
) WITH
...

I can't put the timeuuid key in the WHERE clause like this, because aTable_id is the partition key and a range comparison on it is rejected:

SELECT * FROM aTable WHERE aTable_id > minTimeuuid(?)

And the token function gives me wrong results, even though the aTable_id values themselves are time-ordered:

SELECT * FROM aTable WHERE token(aTable_id) > token(minTimeuuid(?))

In a nutshell, my question is: how do I get the aTable rows newer than a certain date?


Solution

  • I ended up with a solution that I found at a meetup introducing Cassandra 3.0.

    Remember that the schema was designed for another query, so the keys were not chosen with a delta query in mind.

    My aim was to query only the rows added since the previous batch, and here is how I did it:

    • Create an index table partitioned by date and hour (minutes, seconds and milliseconds are truncated). This table is fed by a global index on the main table.
    • In Java, query the index hour by hour (looping over a calendar) and select from the main table with an IN query.
    • Job done!
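The hour-by-hour loop described above could be sketched roughly as follows. This is only an illustration, not the code the answer actually used: the index table name `aTable_by_hour` and its column `hour_bucket` are hypothetical, the dates are made up, and the driver calls are left out so that the sketch runs without a cluster. It only computes the hour buckets to visit and builds the two query strings.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class DeltaByHour {

    // Hypothetical index table: one partition per hour, holding the ids written in that hour.
    static final String SELECT_IDS =
        "SELECT aTable_id FROM aTable_by_hour WHERE hour_bucket = ?";

    /** Hour buckets (truncated to the hour, UTC) that overlap the interval [from, to). */
    static List<Instant> hourBuckets(Instant from, Instant to) {
        List<Instant> buckets = new ArrayList<>();
        Instant hour = from.truncatedTo(ChronoUnit.HOURS);
        while (hour.isBefore(to)) {
            buckets.add(hour);
            hour = hour.plus(1, ChronoUnit.HOURS);
        }
        return buckets;
    }

    /** Builds the main-table query with n bind markers in the IN clause. */
    static String selectByIds(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("need at least one id");
        }
        return "SELECT * FROM aTable WHERE aTable_id IN ("
            + String.join(", ", Collections.nCopies(n, "?")) + ")";
    }

    public static void main(String[] args) {
        // Assumed timestamps: end of the previous batch and start of the current one.
        Instant lastRun = Instant.parse("2015-06-01T22:17:45Z");
        Instant now = Instant.parse("2015-06-02T01:05:00Z");

        // One index query per hour bucket; each would be bound and executed by the driver.
        for (Instant bucket : hourBuckets(lastRun, now)) {
            System.out.println(SELECT_IDS + "  -- bind " + bucket);
        }
        // Main-table query for, say, three ids returned by one bucket.
        System.out.println(selectByIds(3));
    }
}
```

Note that the first bucket starts at the top of the hour, so it can contain rows written before the previous batch ended; the caller would have to filter those out or tolerate re-copying them. Large IN lists spanning many partitions also put load on the coordinator, so splitting the ids of a busy hour into smaller chunks may be worth considering.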