Search code examples
cassandrahector

Query columns based on datetime in Cassandra


We are trying to create/query information from a CF based on the following structure (e.g. a datetime, datetime, integer)

e.g.
03-22-2012 10.00, 03-22-2012 10.30 100
03-22-2012 10.30, 03-22-2012 11.00 50
03-22-2012 11.00, 03-22-2012 11.30 200

How do I model the above structure in Cassandra and perform the following queries via Hector.

select * from <CF> where datetime1 > 03-22-2012 10.00 and datetime2 < 03-22-2012 10.30
select * from <CF> where datetime1 > 03-22-2012 10.00 and datetime2 < 03-22-2012 11.00
select * from <CF> where datetime = 03-22-2012 (i.e. for the entire day)

Solution

  • This is a great introduction to working with dates and times in Cassandra: Basic Time Series with Cassandra.

    In short, use timestamps (or v1 UUIDs) as your column names and set the comparator to LongType (or TimeUUIDType) in order to get chronological sorting of the columns. It's then easy to get a slice of data between two points in time.

    Your question isn't totally clear about this, but if you want to get all events that happened during a given range of time of day regardless of the date, then you will want to structure your data differently. In this case, column names may be CompositeType(LongType, AsciiType), where the first component is a normal timestamp mod 86400 (the number of seconds in a day), and the second component is the date or something else that changes over time, like a full timestamp. You would also want to break up the row in this case, perhaps dedicating a different row to each hour.