Search code examples
mysqloptimizationviewquery-optimization

Is MySQL smart enough to optimize referential integrity in unused joins?


Here are two equivalent queries (because of the foreign key constraint + primary key):

CREATE TABLE customers ( id int PRIMARY KEY );
CREATE TABLE orders ( id int, customer_id int );
ALTER TABLE orders ADD FOREIGN KEY (customer_id) REFERENCES customers(id);

SELECT id
     , (SELECT 1 FROM customers c WHERE c.id = o.id) one
  FROM orders o;

SELECT o.id
     , 1 one
  FROM orders o
  JOIN customers c
    ON c.id = o.customer_id;

In both cases, every row of orders will have one row in the result.

That also means that if all of (one of) these queries is a subquery of some other query, and that other query does not use the one column then the joining can be skipped.

My question is: is MySQL smart enough to skip the join in both of these scenarios? And what if I switch to a NULL column and a LEFT JOIN instead?

I am now working on making some views to denormalize some tables for reporting. So I need to understand if there is some canonical formatting (bikeshedding?) I should use for things like this to ensure my huge views are efficiently and minimally processed as needed.


Solution

  • It is definitely capable of optimizing away unused columns and lookups from subqueries, CTEs, and VIEWs.

    Exactly what it does depends on the version and exact DDL, queries, and index statistics. On MariaDB ANALYZE FORMAT=JSON SELECT ... will show you the actual execution plan. On Oracle MySQL, use EXPLAIN ANALYZE SELECT ....

    The thing to keep in mind is that query plans change as tables grow (and as statistics get stale, do ANALYZE TABLE once in a while). And, real-world application query patterns are very hard to predict.

    If your app gets really big, you may want to disable the foreign key checks and enforce integrity with good code instead. Enforced FKs take cpu time and IO. So, other things being equal, don't rely on constraints like FKs. Write your code so it doesn't need them if possible. (But, of course, if your code makes more sense with FK constraints enabled, go ahead and write it that way.)

    Design your queries, CTEs, VIEWs, and all that so they're easy to read and reason about. Get your app working. Then, if you're fortunate enough to have a growing app with growing tables, revisit the bottleneck query plans once every three months or so.

    I've designed tables, indexes, and constraints, put them into production, then had to explain them to my successors. I never, even once, though, "wow, I should have made this database more complex or more clever" when I was handing off my work. A lot of time I thought "I wish my former self had designed this more simply".

    There's a slogan kicking around saying "premature optimization is the root of all evil." That's an exaggeration. But it is fair to say that premature database optimization