I have the following query:
SELECT *
FROM customers_cards
WHERE id NOT IN ( SELECT card_id FROM customers_card_psp)
It takes about 2sec to get executed and we need this query to get called on a highly visited page. Any idea how can I make it more optimize?
Here is the result of EXPLAIN
:
Noted that card_id
is not unique (there are 5 rows inside customers_card_psp
per 1 row inside customers_cards
), but adding DISTINCT
in the behind of card_id
doesn't make anything better.
Noted that, using LEFT JOIN
has the same performance:
SELECT cc.*
FROM customers_cards cc
LEFT JOIN customers_card_psp ccp ON ccp.card_id = cc.id
WHERE ccp.id IS null
These are 3 ways to say the same thing:
SELECT *
FROM customers_cards
WHERE id NOT IN (
SELECT card_id FROM customers_card_psp )
vs
SELECT *
FROM customers_cards AS cc
WHERE NOT EXISTS (
SELECT 1 -- anything works here; "1" is the convention
FROM customers_card_psp
WHERE card_id = cc.id )
vs
SELECT cc.*
FROM customers_cards AS cc
LEFT JOIN customers_card_psp AS psp
ON psp.card_id = cc.id -- How the tables 'relate'
WHERE psp.card_id IS NULL -- Means "missing from psp"
The NOT IN ( SELECT ... )
is notoriously inefficient and should be avoided.
The EXISTS
and LEFT JOIN
are probably compiled identically, hence equally good.
The EXISTS
is called "semi-join" meaning that it only needs to find 0 or 1 row, not all 5, in order to satisfy the test.
Related: The comparisons are probably the same without the NOT
(and changing to IS NULL
). That is, do not use IN
; use either of the others.
(Different versions of MySQL/MariaDB have optimized these things differently.)