A portion of my application consists of a discussion board: there are threads, posts, and categories. Threads are grouped by category, and posts are grouped by threads. I'm having a problem coming up with a model / query that will allow the selection threads by category with a descending ordering of their last post.
Category
CREATE TABLE keyspace.categories (
id ascii PRIMARY KEY,
description text,
name text,
...
);
Thread
CREATE TABLE keyspace.threads (
id ascii PRIMARY KEY,
category_id ascii,
content text,
...
);
Post
CREATE TABLE keyspace.posts (
thread_id ascii,
created_at timestamp,
id ascii,
content text,
...
PRIMARY KEY (thread_id, created_at, id)
);
I initially thought about putting the last post's "created at" time as a clustering key on the thread table, but that's impossible as it changes with each post.
I then thought about creating an intermediate table that is written to every time a post is created. This solves the immutability issue with the first approach, but the problem is that it will contain multiple values per thread and I have not been able to figure out a partition / clustering order that would support grouping by thread and ordering by date.
For example, the following would allow me to group by thread, but not order by date:
CREATE TABLE last_post_for_category (
category_id ascii,
thread_id ascii,
created_at timestamp,
PRIMARY KEY ((category_id), thread_id, created_at)
) WITH CLUSTERING ORDER BY (thread_id DESC, created_at DESC);
SELECT thread_id FROM last_post_for_category WHERE category_id = 'category' GROUP BY thread_id, created_at;
And the following would allow me to order by date, but not group by thread:
CREATE TABLE keyspace.last_post_for_category (
category_id ascii,
thread_id ascii,
created_at timestamp,
PRIMARY KEY ((category_id), created_at, thread_id)
) WITH CLUSTERING ORDER BY (created_at DESC, thread_id DESC);
SELECT thread_id FROM last_post_for_category WHERE category_id = 'category' GROUP BY created_at, thread_id;
I'm not able to do a distinct
on (category_id, thread_id)
either as I know nothing about thread IDs at the point in which this query is executed.
Does anyone have any idea on how I can best represent this ordering?
First of all, I recommend that you use the datatype datetime
rather than timestamp
, since it will make it easy for you to modify it or set a default value. This is just a recommendation.
Suggested solution:
Add the attribute last_post
to the table threads
to save the time of last added post in each thread.
When a thread is first created, the last_post
value should equal a very old date (because no posts in that thread yet).
After tha, create a trigger so that whenever a post is inserted in posts
, the trigger updates the last_post
value of the corresponding thread. The trigger can be added like this:
CREATE TRIGGER triggerName ON posts
FOR INSERT
AS
declare @post_time datetime;
declare @thread_id int;
select @post_time=i.created_at from inserted i;
select @thread_id=i.thread_id from inserted i;
update threads set lastpost = @post_time where id=@thread_id
GO
Last step will be a direct query to select threads by category sorted by last_post
, just like this:
select * from threads where category_id = 'theCategoryYouWant' order by lastpost asc /*or desc as you like*/
Note: if you want created_at
to be updated when the post is edited, you will need to add a similar trigger to update the last_post
attribute of the corresponding thread.