I'm trying to get the informations from two tables, one table that contains general data about an object A, and B holding the history of supplementary data for this object.
What i want to do is get all the records from A And the corresponding lastest instance of B (highest id) with a fast request, the two tables being quite big. B has a key that allows it to identify which A it is linked to since it is a one to many relation. I also want to get infos if there are some from a third table C, i can get those info from two fields that are present in B. There also can be no C data for a given B, i want to know that too..
From what i seached on the forums, lateral join seems to be the solution, but it does not exist in my version of MySQL.
I also want to be able to filter query with variable conditions in A and B column values.
What i'm trying right now (functionnal if i can wait hours) is:
SELECT a.*, b.*, c.*
FROM a
INNER JOIN (SELECT b.* FROM b ORDER BY b.id DESC) b ON b.a_id = a.id
LEFT JOIN c ON c.val1 = b.val1 AND c.val2 = b.val2
GROUP BY b.id
What your describing is something that window functions like ROW_NUMBER() help with but those too are MySQL 8+. With MySQL 5.7 things are a little trickier but there are a couple of ways to do it.
SET @rank := 0;
SET @aid := 0;
SELECT
a.*, b.*, c.*
FROM
a
INNER JOIN (
SELECT
@rank := CASE WHEN b.a_id = @aid THEN @rank + 1 ELSE 1 END AS RankedRow
,b.*
,@aid := b.a_id
FROM
b
ORDER BY
b.a_id
,b.id DESC
) AS b
ON a.id = b.a_id
AND b.RankedRow = 1
LEFT JOIN c
ON c.val1 = b.val1
AND c.val2 = b.val2
This uses a variable to hold the current row number and uses a trick to update that value as well as return it. If a_id is the same, it's incremented by 1 but when a_id changes, it is reset to 1.
Another option is to join b to itself
SELECT
a.*, b.*, c.*
FROM
a
INNER JOIN (
SELECT
b.*
FROM
b
LEFT JOIN b AS newerb
ON newerb.a_id = b.a_id
AND newerb.id > b.id
WHERE
newerb.id IS NULL
) AS b
ON a.id = b.a_id
LEFT JOIN c
ON c.val1 = b.val1
AND c.val2 = b.val2
Here newerb will be null if there is no id greater than b for the same a_id and therefore must be the latest row.
I've used both solutions in the past, I tend to prefer the latter as it's more readable but it may be slower depending on how much data you have.
I haven't time to test both examples so apologies in advance for any typos I've missed.