Search code examples
mysqlsqlsqlperformance

What is the performance impact on applying filters on outer query?


I have to use nested query and apply different filters and then the limit. My query :

select * from ( select * from JOINS ON A LOT OF TABLES) as t1 where t1.col1="A" and t1.col2="B" limit 100;
  1. Would this first, get all rows from inside query and next, putting limit and filters on it?

  2. Assuming I am able to put the filters and limit in the inner query would it be more optimal?

    select * from JOINS ON A LOT OF TABLES where col1="A" col2="B" limit 100
    
  3. Would having be worse than where?

    select * from JOINS ON A LOT OF TABLES having col1="A" col2="B" limit 100
    

In query analysis from Workbench the data looks same in both cases.


Solution

  • It depends on the version of MySQL you use. The developers of MySQL improve the optimizer in each release. For example, MySQL 8.0.22 has implemented "derived condition pushdown optimization" which seems like exactly what you're looking for.

    https://dev.mysql.com/doc/refman/8.0/en/derived-condition-pushdown-optimization.html says in part:

    For a query such as SELECT * FROM (SELECT i, j FROM t1) AS dt WHERE i > constant, it is possible in many cases to push the outer WHERE condition down to the derived table, in this case resulting in SELECT * FROM (SELECT i, j FROM t1 WHERE i > constant) AS dt.

    Read the full page I linked to for details. As one might expect, there are a lot of nuances to this feature. It may work in some cases but not in others.

    Older versions of MySQL don't implement this optimization, so the inner query must produce a temporary table of the full result with no filtering, then the condition is applied to the temporary table. This frequently causes performance problems, because of the overhead of inserting a large number of unfiltered rows into the temporary table.