Search code examples
mysqlsqldatabase-performancesql-like

MySQL like performance on OR using index is better than %%?


Is better use this SQL code suppose the right index in apply on the column!!

Suppose constant is a input from a textfield!!

select ...
from .....
where lower(column) like 'Constant%' or lower(column) like '%Constant%'

Is better than?

select ...
from .....
where lower(column) like '%Constant%'

In the first code i try to match a "constant" using like but using a index trying being lucky to find a match and later i try to do a full match!!

All i want is my performance is not decreased! I mean if both queries runs in the same time or if the query can sometimes get a performance upgrade is OK with me

I use lower because we use DEFAULT CHARSET=utf8 COLLATE=utf8_bin


Solution

  • I created a little table:

    create table dotdotdot (
      col varchar(20),
      othercol int,
      key(col)
    );
    

    I did an EXPLAIN on a query similar to the one you showed:

    explain select * from dotdotdot where lower(col) = 'value'\G
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: dotdotdot
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 1
         filtered: 100.00
            Extra: Using where
    

    Notice the type: ALL which means it can't use the index on col. By using the lower() function, we spoil the ability for MySQL to use the index, and it has to resort to a table-scan, evaluating the expression for every row. As your table gets larger, this will get more and more expensive.

    And it's unnecessary anyway! String comparisons are case-insensitive in the default collations. So unless you deliberately declared your table with a case-sensitive collation or binary collation, it's just as good to skip the lower() function call, so you can use an index.

    Example:

    explain select * from dotdotdot where col = 'value'\G
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: dotdotdot
       partitions: NULL
             type: ref
    possible_keys: col
              key: col
          key_len: 23
              ref: const
             rows: 1
         filtered: 100.00
            Extra: NULL
    

    The type: ref indicates the use of a non-unique index.

    Also compare to using wildcards for pattern-matching. This also defeats the use of an index, and it has to do a table-scan.

    explain select * from dotdotdot where col like '%value%'\G
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: dotdotdot
       partitions: NULL
             type: ALL
    possible_keys: NULL
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 1
         filtered: 100.00
            Extra: Using where
    

    Using wildcards like this for pattern-matching is terribly inefficient!

    Instead, you need to use a fulltext index.

    You might like my presentation Full Text Search Throwdown and the video here: https://www.youtube.com/watch?v=-Sa7TvXnQwY


    In the other answer you ask if using OR helps. It doesn't.

    explain select * from dotdotdot where col like 'value%' or col like '%value%'\G
    *************************** 1. row ***************************
               id: 1
      select_type: SIMPLE
            table: dotdotdot
       partitions: NULL
             type: ALL
    possible_keys: col
              key: NULL
          key_len: NULL
              ref: NULL
             rows: 1
         filtered: 100.00
            Extra: Using where
    

    Notice the optimizer identifies the col index as a possible key, but then ultimately decides not to use it (key: NULL).