Hopefully this is an easy problem, and I'm just thinking about it wrong.
I have an association between id_1
and id_2
. There is a 1->n relationship between id_1
and id_2
. Namely, if no id_2
exists for id_1
it has a default value of -1. This can look like:
id_1 id_2 info
---- | ---- | ----
120 -1 'dont'
120 444 'show'
123 -1 'test'
124 -1 'hello'
125 -1 'world'
125 123 'oh wait'
126 -1 'help'
126 201 'me'
127 -1 'sql'
127 206 'hell'
Basically I want to have a SQL query that gets the highest id_2
associated with an id_1
, given an IN
statement matching criteria from id_1
. If I can't have an IN
statement that's also fine as well, I'm just trying to optimize something else.
Here is what the answer should look like given id_1
containing (123,124,125,126,127)
and id_2
containing (-1,-1,123,201,206)
:
id_1 id_2 info
---- | ---- | ----
123 -1 'test'
124 -1 'hello'
125 123 'oh wait'
126 201 'me'
127 206 'hell'
Notice how id_1
120
ends up excluded. I've tried:
SELECT DISTINCT id_1, id_2, info
FROM ids
WHERE
id_1 IN (123, 124, 125, 126, 127) AND
id_2 IN (-1, -1, 123, 201, 206)
But this gives multiple id_2
values for a single id_1
.
I've also tried a GROUP BY
on id_2
but that only gives me one id_1
that has -1.
SELECT id_1, id_2, info
FROM ids
WHERE
id_1 IN (123, 124, 125, 126, 127) AND
id_2 IN (-1, -1, 123, 201, 206)
GROUP BY id_2
results in:
id_1 id_2 info
----- | ---- | ----
123 -1 'test'
125 123 'oh wait'
126 201 'me'
127 206 'hell'
Note the missing id_1
of 124
.
So how can I write this query to get the most recent, distinct id_2
for an id_1
?
You may user ROW_NUMBER
if on MySQL 8+:
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY id_1 ORDER BY id_2 DESC) rn
FROM ids
WHERE id_1 IN (123, 124, 125, 126, 127) AND
id_2 IN (-1, -1, 123, 201, 206)
)
SELECT id_1, id_2, info
FROM cte
WHERE rn = 1;
On earlier versions of MySQL, you can identify the max record per id_1
group using exists logic:
SELECT i1.*
FROM ids i1
WHERE id_1 IN (123, 124, 125, 126, 127) AND
id_2 IN (-1, -1, 123, 201, 206) AND
NOT EXISTS (
SELECT 1
FROM ids i2
WHERE i2.id_1 = i1.id_1 AND
i2.id_2 > i1.id_2 AND
i2.id_1 IN (123, 124, 125, 126, 127) AND
i2.id_2 IN (-1, -1, 123, 201, 206)
);