mysql time-complexity query-optimization complexity-theory space-complexity

How to judge the complexity of SQL queries

Any resource where it is explained how to judge the complexity of SQL queries would be much appreciated.

Solution

(By "complexity", I assume you mean "slowness"?) Some tips:

Subqueries may or may not slow down a query a lot.
GROUP BY and ORDER BY -- when both are present but different: Usually requires two sorts.
Usually only a single index is used per SELECT.
OR is almost always inefficient. Switching to UNION allows multiple indexes to be efficiently used.
UNION ALL, with a few restrictions, is more efficient than UNION DISTINCT (because of the dedupping pass)
Non-sargeable expressions cannot use an index, hence severely inefficient.
If the entire WHERE, GROUP BY and ORDER BY are handled by a single index can LIMIT be efficiently handled. (Else it must collect all the stuff, sort it, only then can it peel off a few rows.)
Entity-Attribute-Value schema is inefficient.
UUIDs and GUIDs are inefficient on very large tables.
A composite index is often better than a single-column index.
A "covering" index is somewhat better.
Sometimes, especially when a LIMIT is involved, it is better to turn the query inside-out. That is start with a subquery that finds the few ids that you need, then reaches back into the same table and into other tables to get the rest of the desired columns.
"Windowing functions" are poorly implemented in MySQL 8 and MariaDB 10.2. They are useful for "groupwise-max" and "hierarchical schemas". Until the Optimizer improves, I declare them to be "complex".
Recent versions have recognized "row constructors"; previously they were a performance hit.
Having an AUTO_INCREMENT id hurts performance in certain cases; helps in others.

EXPLAIN (or EXPLAIN FORMAT=JSON) tells you what is going on now; it fails to tell you how to rewrite the query or what better index to add.

More indexing tips: http://mysql.rjweb.org/doc.php/index_cookbook_mysql In that link, see "Handler counts" for a good way to measure complexity for specific queries. I use it for comparing query formulations, etc, even without populating a large table to get usable timings.

Give me a bunch of Queries; I'll point out the complexities, if any, in each.