I am storing user events in Cassandra and am looking for the right keying for the table.
CREATE TABLE user_events (
user text,
timestamp timestamp,
ip text,
event text,
content text,
service text,
PRIMARY KEY (user, timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC)
AND compaction = { 'class' : 'DateTieredCompactionStrategy' };
I know there is a limit on the size of a single partition (about 2 billion cells, I think). I do not plan on deleting data as it gets older. Would I also need to key this by month or something similar? E.g.:
PRIMARY KEY ((user, month), timestamp)
Or is there a better way of storing events for time-series data?
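First, don't use DateTieredCompactionStrategy; it is deprecated, so use TimeWindowCompactionStrategy instead. Second, you should write the data the way you expect to read it: list all the SELECT queries you want to run, and then model the table after them. Either way, avoid large partitions.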
There are several ways to avoid big partitions if you want to look up user events based on time.
The second way has the advantage of segregating the data, which lets you move it, store it, or change settings as you go, instead of having to deal with one massive dataset if you need to change something later. Also, if you ever need to delete data in the future (say, for GDPR), you avoid tombstones because you can drop whole tables.
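For the month-keyed layout from the question (`(user, month)` as the partition key), the client has to compute the bucket on every write and enumerate the buckets on every range read. Here is a minimal Python sketch of that idea. The function names and the `YYYY-MM` bucket format are my own assumptions, not anything Cassandra prescribes, and it expects timezone-aware timestamps.

```python
from datetime import datetime, timezone
from typing import List


def month_bucket(ts: datetime) -> str:
    """Return the month bucket (e.g. '2024-03') for a timezone-aware timestamp.

    The 'YYYY-MM' format is an assumption for illustration; any stable
    per-month value would work as the `month` partition key column.
    """
    return ts.astimezone(timezone.utc).strftime("%Y-%m")


def buckets_between(start: datetime, end: datetime) -> List[str]:
    """List every month bucket touched by [start, end], newest first.

    Newest-first matches the table's CLUSTERING ORDER BY (timestamp DESC).
    Returns an empty list when start is in a later month than end.
    """
    start = start.astimezone(timezone.utc)
    end = end.astimezone(timezone.utc)
    year, month = end.year, end.month
    buckets = []
    while (year, month) >= (start.year, start.month):
        buckets.append(f"{year:04d}-{month:02d}")
        month -= 1
        if month == 0:  # step back across a year boundary
            year, month = year - 1, 12
    return buckets
```

A range read for one user would then issue one query per bucket, something like `SELECT * FROM user_events WHERE user = ? AND month = ? AND timestamp >= ? AND timestamp <= ?`, and concatenate the results in bucket order. Whether a month is the right bucket size depends on how many events a single user produces; a busier workload might need weekly or daily buckets.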