I have the following tables
user
id | name |
---|---|
1 | John |
2 | Jack |
3 | George |
essay
id | date_submitted | user_id | project_id |
---|---|---|---|
1 | 2020-12-10 | 1 | 1 |
2 | 2020-05-01 | 2 | 2 |
3 | 2020-04-10 | 3 | 2 |
project_phase
id | project_id | date | phase |
---|---|---|---|
1 | 1 | 2020-01-01 | early |
2 | 1 | 2020-05-31 | mid |
3 | 1 | 2020-11-30 | late |
4 | 2 | 2020-01-01 | early |
5 | 2 | 2020-03-31 | mid |
6 | 2 | 2020-04-30 | late |
1 User submits only 1 essay in one project and I want to have a combined table which shows me the user and the phase at which the essay was submitted.
I'm trying to get the record with the MAX date from project_phase that is below the date_submitted
So using the above data I want to get
user | date_submitted | phase |
---|---|---|
John | 2020-12-10 | late |
Jack | 2020-05-01 | late |
George | 2020-04-10 | mid |
so far I combined the tables with INNER JOIN but the following query returns duplicate records. I also tried ROW_NUMBER() and MAX() but I didn't structure it correctly to get the answer I'm looking for
SELECT
U.name,
E.date_submitted,
P.phase
FROM
essay E
INNER JOIN user U ON U.id = E.user_id
INNER JOIN project_phase P ON P.project_id = E.project_id and E.date <= P.date_submitted
Any help appreciated, thanks.
I would suggest a correlated subquery:
SELECT U.name, E.date_submitted,
(SELECT P.phase
FROM project_phase P
WHERE P.project_id = E.project_id AND
P.date <= E.date_submitted
ORDER BY p.date DESC
LIMIT 1
) as phase
FROM essay E JOIN
user U
ON U.id = E.user_id;