I know this restriction and alternative to rewrite SQL. But I wonder the reason why? Can anyone provide a inference on this restriction?
A query like
select * from table where id in (select id from othertable)
will basically be interpreted as
select * from table where exists
(select id from othertable where table.id = othertable.id)
That is what you expect for that query. You especially expect the IN
query to use the index on othertable.id
. In the manual, it is described as
Some optimizations that MySQL itself makes are: [...]
- MySQL rewrites IN, ALL, ANY, and SOME subqueries in an attempt to take advantage of the possibility that the select-list columns in the subquery are indexed.
These are, not by pure coincidence, exactly the four operators mentioned in the error message if you add a limit
:
select * from table where id in
(select id from othertable order by id limit 10)
ERROR 1235 (42000): This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery'.
Rewriting that query in a similar way is not directly possible anymore, as it is something different than
select * from table where exists
(select id from othertable where table.id = othertable.id
order by id limit 10)
To execute the IN
with a limit, MySQL could retrieve the 10 first rows of othertable
, store that resultset as a derived subtable and check if the id is in there. And you can of course do exactly that:
select * from table where id in
(select id from
(select id from othertable order by id limit 10) subtable)
This will, in analogy to the first example, be interpreted as
select * from table where exists
(select * from
(select id from othertable order by id limit 10) subtable
where table.id = subtable.id)
So it is a mixture of how the optimizer works (it will rewrite the query), how limit
works (it stops execution after finding rows, not skipping them), what is expected (the use of indexes) and ultimately if the developers decided to allow a specific syntax or not.
You could argue that MySQL could always fall back to execute the query as a derived table if it encounters IN
with a limit
subquery - but so could you by explicitely using a derived subtable. You could also argue that you can think of ways to implement that or implement that differently - and you are right, there are. That's why there is a "yet" in the error message. So feel free to implement them or at least describe them e.g. in a feature request as thoroughly as possible and under consideration of how all the other parts of MySQL work. But make sure they are actually faster than just using a subtable.