Why Mysql do not create index in subquery?

I am running MySQL 5.7.24.
I have two tables: Table a and Table b, described below:

-------------------

CREATE TABLE a (

  id int(11)         NOT NULL AUTO_INCREMENT,
  a1 varchar(255)        NULL,
  a2 varchar(255)        NULL,
  a3 varchar(255)        NULL,
  a4 varchar(255)        NULL,
  b5 varchar(255)        NULL,

  PRIMARY KEY (id),
  INDEX a_idx_b5 (b5) USING BTREE

) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4;

-------------------

CREATE TABLE b (

  id int(11)         NOT NULL AUTO_INCREMENT,
  B1 varchar(255)        NULL,
  B2 varchar(255)        NULL,
  B3 varchar(255)        NULL,
  B4 int(255)            NULL,
  B5 varchar(255)        NULL,

  PRIMARY KEY (id),
  INDEX b_idx_b5 (B5) USING BTREE

) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8mb4;

-------------------

And I want to get data from table a and table b with the following SQL query:

select
    a.*,
    max_b4
from
    a,
    (
        select
            b5,
            max(b4) as max_b4
        from
            b
        group by
            b5
    ) c
where
    a.b5 = c.b5

When I have a lots of data (more than one million) in table a and table b it become very slow, so then I explain the SQL, and I got this result:

+----+-------------+------------+------------+-------+---------------+----------+---------+------+---------+----------+-------------+
| id | select_type | table      | partitions | type  | possible_keys | key      | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+------------+------------+-------+---------------+----------+---------+------+---------+----------+-------------+
|  1 | PRIMARY     | <derived2> | NULL       | ALL   | NULL          | NULL     | NULL    | NULL | 3209319 |   100.00 | Using where |
|  1 | PRIMARY     | a          | NULL       | ref   | a_idx_b5      | a_idx_b5 | 1023    | c.b5 |     459 |   100.00 | NULL        |
|  2 | DERIVED     | b          | NULL       | index | b_idx_b5      | b_idx_b5 | 1023    | NULL | 3209319 |   100.00 | NULL        |
+----+-------------+------------+------------+-------+---------------+----------+---------+------+---------+----------+-------------+

As you can see the <derived2> table don't have a <auto_key> which mysql will add to the subquery after 5.6! So this result why the sql become very slow.

But as the mysql doc describes derived-table-optimization:

The optimizer constructs an index over column f1 from derived_t2 if doing so would permit the use of ref access for the lowest cost execution plan.

Usually mysql will add an index in subquery, so my question is why mysql do not give an index in <derived2> table?

Solution

The index b_idx_b5 on the b table is not being used because it does not speed up the GROUP BY subquery. You should be using this index:

CREATE INDEX idx ON b (b5, b4);

This should allow MySQL to very efficiently find the max b4 values for each group of b5 records.

As a side note, I probably would have written your query using an explicit join, rather than an implicit one:

SELECT t1.*, t2.max_b4
FROM a t1
INNER JOIN
(
    SELECT b5, MAX(b4) AS max_b4
    FROM b
    GROUP BY b5
) t2
    ON t2.b5 = t1.b5;

But, both the above version and your original version would probably have the same execution plan.