Search code examples
mysqlsqlquery-optimizationsql-likecalculated-columns

MySQL virtual column and wildcard


I was trying MySQL secondary indexing referring to MySQL Documentation, and weird thing happened.

  • Firstly, I created a table with small modification per the example in the document
create table jemp(
c JSON,
g VARCHAR(20) GENERATED ALWAYS AS (c->"$.name"),
INDEX i (g)
)
  • Secondly, I inserted values per the example in the document
INSERT INTO jemp (c) VALUES
('{"id": "1", "name": "Fred"}'), ('{"id": "2", "name": "Wilma"}'),
('{"id": "3", "name": "Barney"}'), ('{"id": "4", "name": "Betty"}');
  • And then, I tried to perform a fuzzy search with "like" and "wildcard". This doesn't work because index doesn't support prefix %, but it can get result.
select c->"$.name" as name from jemp where g like "%F%"
  • Here is the weird thing, I removed the prefix %, and index did work. However, I didn't get any results. Per my poor understanding of MySQL, this should work.
select c->"$.name" as name from jemp where g like "F%"

I would be so much appreciate if anyone could help me with it.


Solution

  • For your query to work, you want a generated column that extracts the name as text rather than JSON. That is, use ->> instead of ->:

    g VARCHAR(20) GENERATED ALWAYS AS (c ->> '$.name')
    

    Then: the index may help for both following conditions:

    where g like 'F%'
    where g = 'F'
    

    Whether MySQL decides to use it or not is another story; basically the databases assesses whether using the index will be faster than a full scan. If it believes that the condition will match on a large number of rows, it will probably choose to full scan.

    Note that I consistently use single quotes for string literals; although MySQL tolerates otherwise, this is what the SQL standard specifies. In some other databases, double quotes stand for identifiers (this also is compliant with the standard).