Think web forum structure here. My current query gets the timestamps from the oldest and newest posts associated with each thread. What I can't seem to figure out is how to get the userid associated with those posts. I read through similar questions on here, but they had simpler structures where all the aggregated data could be grouped back to a single column. Do I need to break down the min, max, and count into 3 separate subqueries? I tried looking at adding userid to the group clause but that only duplicates the data for each user who owns a post in the threads. I'm at a loss on what direction to go from here.
https://www.db-fiddle.com/f/akG5WjvoBFPzzKXSBhF3Zz/0
Dummy tables:
CREATE TABLE post (id int NOT NULL AUTO_INCREMENT, postDate datetime NOT NULL DEFAULT CURRENT_TIMESTAMP, userid int NOT NULL, threadId int NOT NULL, PRIMARY KEY(id));
INSERT INTO post (postDate, userid, threadId) VALUES("2025-3-1 16:30:00", 1, 7);
INSERT INTO post (postDate, userid, threadId) VALUES("2025-3-1 17:57:00", 2, 7);
INSERT INTO post (postDate, userid, threadId) VALUES("2025-3-1 19:23:00", 3, 7);
CREATE TABLE users (id int NOT NULL AUTO_INCREMENT, name varchar(50), PRIMARY KEY(id));
INSERT INTO users (id, name) VALUES(1, "Bob");
INSERT INTO users (id, name) VALUES(2, "Steve");
INSERT INTO users (id, name) VALUES(3, "Mary");
CREATE TABLE thread (id int NOT NULL AUTO_INCREMENT, title varchar(100), PRIMARY KEY(id));
INSERT INTO thread (id, title) VALUES(5, "Another thread");
INSERT INTO thread (id, title) VALUES(6, "Some foxy stuff");
INSERT INTO thread (id, title) VALUES(7, "This is a test");
Query:
SELECT t.id, t.title, x.pFirst, x.pLast, x.postCount, x.userid
FROM thread t
LEFT JOIN (
SELECT threadId, MIN(postDate) pFirst, MAX(postDate) pLast, count(p.id) as postCount
FROM post p
GROUP BY threadId
) x ON x.threadId = t.id
ORDER BY pLast DESC
Current Result:
id | title | pFirst | pLast | postCount |
---|---|---|---|---|
7 | This is a test | 2025-03-01 16:30:00 | 2025-03-01 19:23:00 | 3 |
5 | Another thread | null | null | null |
6 | Some foxy stuff | null | null | null |
Result I'd like to get:
id | title | pFirst | uFirst | pLast | uLast | postCount |
---|---|---|---|---|---|---|
7 | This is a test | 2025-03-01 16:30:00 | 1 | 2025-03-01 19:23:00 | 3 | 3 |
5 | Another thread | null | null | null | ||
6 | Some foxy stuff | null | null | null |
You can detect first and last post with window functions row_number() and count().
Row row_number()=1 is first and row_number()=count()(...) is last.
Then this 2 rows for every thread pivot by conditional aggregation.
Also classic query example (see bottom of answer) works well with index.
See example
with first_last as(
select threadId,cnt
,min(case when rn=1 then userId end) uFirst
,min(case when rn=cnt then userId end) uLast
,min(case when rn=1 then postDate end) pFirst
,min(case when rn=cnt then postDate end) pLast
from(
select *
,row_number()over(partition by threadId order by postDate)rn
,count(*)over(partition by threadId)cnt
from post
)q
where rn=1 or rn=cnt
group by threadId
)
select t.*, uFirst,pFirst,uLast,pLast,cnt as postCount
from thread t
left join first_last p on p.threadid=t.Id
id | title | uFirst | pFirst | uLast | pLast | postCount |
---|---|---|---|---|---|---|
5 | Another thread | null | null | null | null | null |
6 | Some foxy stuff | null | null | null | null | null |
7 | This is a test | 1 | 2025-03-01 16:30:00 | 3 | 2025-03-01 19:23:00 | 3 |
Let's consider another example. If CTE result is materialized (this 2 row per thread output), performance will be better.
with first_last as(
select *
from(
select *
,row_number()over(partition by threadId order by postDate)rn
,count(*)over(partition by threadId)cnt
from post
)q
where rn=1 or rn=cnt
)
select t.*
,p1.userId as uFirst,p1.PostDate as pFirst
,p2.userId as uLast,p2.PostDate as pLast
,p1.cnt postCount
from thread t
left join first_last p1 on p1.threadid=t.Id and p1.rn=1
left join first_last p2 on p2.threadid=t.Id and p2.rn=p2.cnt
Update1
For mySql version 5.7 (and other versions - no specific functions)
select t.*
,p1.userId as uFirst,p1.PostDate as pFirst
,p2.userId as uLast,p2.PostDate as pLast
,fl.cnt postCount
from thread t
left join (
select threadId,min(postDate)pFirst,max(postDate)pLast,count(*) cnt
from post
group by threadId
) fl on fl.threadid=t.Id
left join post p1 on p1.threadid=t.Id and p1.postDate=fl.pFirst
left join post p2 on p2.threadid=t.Id and p2.postDate=fl.pLast
By the way, it's not bad in terms of performance.
Index ix_posts_thread_postdate on post (threadId,postDate) match to this query.
model fiddle
and for MariaDb