Search code examples
sqlsql-serversqlperformance

SQL: like v. equals performance comparison


I have a large table (100 million rows) which is properly indexed in a traditional RDBMS system (Oracle, MySQL, Postgres, SQL Server, etc.). I would like to perform a SELECT query which can be formulated with either of the following criteria options:

One that can be represented by a single criteria:

LIKE "T40%" 

which only looks for matches at the beginning of the string field due to the wildcard

or

One that requires a list of say 200 exact criteria:

WHERE IN("T40.x21","T40.x32","T40.x43")

etc.

All other things being equal. Which should I expect to be more performant?


Solution

  • Assuming that both queries return the same set of rows (i.e. the list of items that you supply in the IN expression is exhaustive) you should expect almost identical performance, perhaps with some advantage for the LIKE query.

    • RDBMS engines have been using index searches for begins-with LIKE queries, so LIKE 'T40%' will produce records after an index search
    • Your IN query would be optimized for index search as well, perhaps giving RDBMS a tighter lower and upper bounds. However, there would be an additional filtering step to eliminate records outside your IN list, which is a waste of CPU cycles under the assumption that all rows would be returned anyway.

    In case you'd parameterize your query, the second query becomes harder to pass to an RDBMS from your host program. All other things being equal, I would use LIKE.