Search code examples
mysqlsqldatabaseperformancedatabase-performance

SQL Performance: comparing n values


Let's suppose we want to retrieve n elements (type string/varchar(64)) from a table.

We suppose 0 <= n < 1000.

Which of those 2 is the fastest and require the less resources ?

SELECT id, last_name 
FROM Users 
WHERE last_name="a" 
OR last_name="b" 
OR last_name="c" 
OR last_name="d"....

vs

SELECT id, first_name, last_name 
FROM mytable

Then a code treatment for filtering on the output, for example in simple Python:

result = []
for last_name in output:
    if last_name in POSSIBLE_NAMES:
        result.append(last_name)

Solution

  • The first should be the fastest. Filtering in the database is the best route to go -- I cannot think of any exceptions off-hand.

    The code should be written as:

    SELECT id, last_name
    FROM Users
    WHERE last_name IN ('a', 'b', 'c', 'd', . . . );
    

    (If you want first_name too, then put that in the SELECT.)

    Single quotes are the ANSI-standard for string constants and IN is shorter (and better performing!) than a bunch of OR expressions.

    Why is this better?

    1. Less data is returned from the database to the application. This is the number one reason.
    2. The database can use an index to find the rows, if one is available.
    3. MySQL optimizes IN with a list of constants by doing a boolean search.
    4. Many databases (but not MySQL) can do the comparisons in parallel.