I have a MySql Events table like so:
+---------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+--------------+------+-----+---------+----------------+
| EventId | int(11) | NO | PRI | NULL | auto_increment |
| ControllerId | int(11) | NO | MUL | NULL | |
| EventTypeId | int(11) | NO | MUL | NULL | |
| DateTime | datetime(3) | NO | MUL | NULL | |
| InputId | int(11) | YES | MUL | | |
...
| AdditionalInfo | text | YES | | | |
+---------------------+--------------+------+-----+---------+----------------+
It's got 200M records in it at the moment and it's been running file. To keep the speed up, I don't query it with any joins, but now I am starting to hit a few long running queries. An example of a slow running query is like so:
SELECT E.*
FROM Event E
WHERE (E.EventTypeId != 4 OR (E.EventTypeId = 4 AND E.InputId IS NOT NULL))
AND E.EventTypeId != 27 AND E.EventTypeId != 12
AND E.ControllerId in (5190, 5191, 5192, 5193)
ORDER BY E.DateTime DESC
LIMIT 0, 200
That query takes over 5min! The important (I think) part of the explain comes out like so:
"key_length": "7",
"rows_examined_per_scan": 180071,
"rows_produced_per_join": 125770,
"filtered": "0.06",
"cost_info": {
"read_cost": "284389.84",
"eval_cost": " 25154.17",
"prefix_cost": "309544.01",
"data_read_per_join": "20M"
},
Now if I remove the ORDER BY E.DateTime DESC
at the end of the query it takes about 0.1s to complete. I already have an index on DateTime.
I think I understand the concept that the server has to read over all the 180k? rows returned by the WHERE
clauses to sort them before returning it to the client, but why does it take sooo long? Is there anything I can do about it? Would a composite index help here?
It should not take 5 minutes to sort 180k rows, unless your hardware is really, really slow. For this query:
SELECT E.*
FROM Event E
WHERE (E.EventTypeId <> 4 OR (E.EventTypeId = 4 AND E.InputId IS NOT NULL)
) AND
E.EventTypeId NOT IN (12, 27) AND
E.ControllerId in (5190, 5191, 5192, 5193)
ORDER BY E.DateTime DESC
LIMIT 0, 200;
You might try an index (ControllerId, EventTypeId, InputId)
. However, I'm guessing this will not be very good.
One possibility is to use the above index and then do one controller at a time:
(SELECT E.*
FROM Event E
WHERE (E.EventTypeId <> 4 OR (E.EventTypeId = 4 AND E.InputId IS NOT NULL)
) AND
E.EventTypeId NOT IN (12, 27) AND
E.ControllerId = 5190
ORDER BY E.DateTime DESC
LIMIT 0, 200
) UNION ALL
(SELECT E.*
FROM Event E
WHERE (E.EventTypeId <> 4 OR (E.EventTypeId = 4 AND E.InputId IS NOT NULL)
) AND
E.EventTypeId NOT IN (12, 27) AND
E.ControllerId = 5191
ORDER BY E.DateTime DESC
LIMIT 0, 200
)
. . .
ORDER BY DateTime DESC
LIMIT 0, 200;
The index can be used more effectively for each of the subqueries.