Say we have a table called countries
with a primary key called countrycode
, and the following statement:
select * from countries where countrycode in ('BE', 'FR', 'BE', 'BE');
This statement will only return 2 unique countries.
Do multiple entries in the in clause have an effect on performance? How many lookups does MySQL do? (2 or 4?) Does SQL filter out duplicates automatically?
Modern database optimizers are smart enough to recognize that 'BE' appears multiple times and will treat the duplicates as a single instance.
MySQL will likely build an execution plan where it scans for 'BE' and 'FR' without being influenced by the repetition.
MySQL effectively performs 2 lookups in this case, despite the presence of 4 values.
SQL does not filter out duplicates automatically before execution, but the optimizer handles them efficiently.