I have the following tables:
Table 1 : Contacts
Fields : id first_name
Values :
1 Reeta
2 Rohan
3 John
Table 2 : email (it contains contact_id of contacts table)
Fields : id contact_id email_address
Values :
1 1 r@gmail.com
2 2 r@gmail.com
3 3 j@gmail.com
I want to display all duplicates by email. Like this:
cont_id first_name email_address
1 Reeta r@gmail.com
2 Rohan r@gmail.com
Here is my query :
select contact_id
from contacts
where email_address IN (
SELECT S.email_address
FROM contacts R
INNER JOIN email
ON R.id = S.contact_id
Group By email_address
Having Count(S.id) > 1
);
The query takes long time to execute with large number of records. However the inner query works faster but not the outer one. Please Help.
Yes, it is the problem with MySQL query optimizer. The thing is - IN (subquery)
for MySQL is same thing as = ANY ()
subquery. And MySQL won't optimize that even if subquery returns few count of values which in normal situation will be easily compared via index (if it exists on column for which we're doing IN
).
Do not confuse that with IN (<static values list>)
- it is completely different case, when MySQL treats IN
not as specific type of subquery, but as comparison operator - and, therefore, will use index.
Therefore, one of possible solutions is - separate your subquery so it will return some list of values, then substitute that list to IN ()
, getting index scan because of treating IN
as comparison operator. However, in some cases solution may be got via replacing subquery with JOIN
. It's not possible in all cases, thus solution, described above, will fit for common non-trivial case.