I have a database table with ~140,000 rows, and right now I am doing a query that goes like
DELETE FROM database WHERE type="delete"
But it is taking a long time. I am tempted to halt the query and see if the database has actually shrunken in size, but I am not sure if it's in the process of creating a list of data to be deleted before the data actually gets removed. How does innoDB actually clear out data from the database upon receiving a DELETE query?
Here is the CREATE query (for the table images
):
CREATE TABLE IF NOT EXISTS `images` (
`imageID` int(11) NOT NULL AUTO_INCREMENT,
`runID_fk` int(11) DEFAULT NULL,
`sequenceID_fk` int(11) DEFAULT NULL,
`cameraID_fk` int(11) DEFAULT NULL,
`data` longblob,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`name` text,
`type` text,
`pcadata` longblob,
PRIMARY KEY (`imageID`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Based on your CREATE
statement, you do not have an index on the type
column. So, your problem is due to a slow WHERE
clause. Basically, your query is forcing the database server to look at each and every row to see if value in the type
column is "delete"
. That is slow, no matter how many rows you have. An index would make this much faster because the server could essentially just ask for a list of all matching rows, then only operate on those rows.
As a very broad rule of thumb, if you intend to use a column in a WHERE
clause, it should be indexed. (There are exceptions, for sure, but this is not one of them based on what you've posted.)
You can see what the database server is doing by running an EXPLAIN
query, as in
EXPLAIN DELETE FROM database WHERE type="delete"
Based on your comment, your EXPLAIN
returns something like this:
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 SIMPLE images ALL null null null null 145669 Using where
That 145,699 that you see is not (as you wrote in the comment) "the size of the rows." It's the number of rows that the server has to look at to execute your query. As you can see, it doesn't have any indexes it can use. So, instead of looking at only the rows you want to delete, it has to look at 145,669 rows.
Also, since your type
column is a TEXT
column, this will be even slower. You should seriously consider adding a TINYINT
, CHAR
(not TEXT
), or other column to store the status of the row if you intend to flag rows for deletion this way.
On a more general note, you really should reconsider some of those column types. I seriously doubt that you need 2^16 - 1
bytes for a column named name
, for example.