Search code examples
mysqlquery-optimization

Do duplicates in the IN statement have an impact on performance?


Say we have a table called countries with a primary key called countrycode, and the following statement:

select * from countries where countrycode in ('BE', 'FR', 'BE', 'BE');

This statement will only return 2 unique countries.

Do multiple entries in the in clause have an effect on performance? How many lookups does MySQL do? (2 or 4?) Does SQL filter out duplicates automatically?


Solution

  • Modern database optimizers are smart enough to recognize that 'BE' appears multiple times and will treat the duplicates as a single instance.

    MySQL will likely build an execution plan where it scans for 'BE' and 'FR' without being influenced by the repetition.

    MySQL effectively performs 2 lookups in this case, despite the presence of 4 values.

    SQL does not filter out duplicates automatically before execution, but the optimizer handles them efficiently.