I'm brand new to Cassandra and I'm having trouble trying to figure out how to order my data. I'm trying to record messages from twitch.tv for each streamer. I want something like the following
CREATE TABLE IF NOT EXISTS chat_data.twitch_chat_by_broadcaster_and_timestamp (
broadcaster_id int,
timestamp int,
message text,
PRIMARY KEY (broadcaster_id, timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC);
While not an issue for a hobby project, how do you solve the problem of two messages being sent at the exact same time. I'm aware of TimeUUID
, but I want the messages sorted by the timestamp of when they were sent, and it doesn't look like there are any official ways to create a TimeUUID
other than now()
.
I envision using this database for getting all messages in a given time range for a given stream. I realize I could add another UUID
field to guarantee uniqueness, but then that would screw up my query since I have to specify an exact time.
What's the right way to order this data to allow efficient querying?
I realize I could add another UUID field to guarantee uniqueness, but then that would screw up my query since I have to specify an exact time.
Not if you added it as the last clustering column:
CREATE TABLE IF NOT EXISTS chat_data.twitch_chat_by_broadcaster_and_timestamp (
broadcaster_id int,
timestamp int,
message_id uuid,
message text,
PRIMARY KEY (broadcaster_id, timestamp, message_id)
) WITH CLUSTERING ORDER BY (timestamp DESC, message_id ASC);
This way, you could simply generate a new UUID at write-time. As long as you queried by the partition key of broadcaster_id
, the data would still be sorted by (both) timestamp
, with message_id
added to ensure uniqueness.
As far as I can see it, the query for messages by broadcaster_id
doesn't change:
> SELECT * FROM twitch_chat_by_broadcaster_and_timestamp WHERE broadcaster_id=68;
broadcaster_id | timestamp | message_id | message
----------------+---------------+--------------------------------------+---------------------------
68 | 1705328231991 | ac74c7ba-1813-4b75-b873-8817c6f89a55 | End of stream.
68 | 1705328230499 | 0e2116d5-887c-4433-b9cc-e5c13534902d | Next stream coming soon.
68 | 1705327984482 | 01c52a3a-c1c3-4db4-b8a8-6f9fe3111911 | Incoming messages!
68 | 1705327983581 | 7a3c2444-5746-4928-9e6a-0ebaac7fbec5 | Next stream starting soon