How can I improve DELETE FROM performance on large InnoDB tables?
This solution can provide better performance once completed, but the process may take some time to implement.
A new BIT
column can be added and defaulted to TRUE
for "active" and FALSE
for "inactive". If that's not enough states, you could use TINYINT
with 256 possible values.
Adding this new column will probably take a long time, but once it's over, your updates should be much faster as long as you do it off the PRIMARY
as you do with your deletes and don't index this new column.
The reason why InnoDB takes so long to DELETE
on such a massive table as yours is because of the cluster index. It physically orders your table based upon your PRIMARY
, first UNIQUE
it finds, or whatever it can determine as an adequate substitute if it can't find PRIMARY
or UNIQUE
, so when one row is deleted, it now reorders your entire table physically on the disk for speed and defragmentation. So it's not the DELETE
that's taking so long; it's the physical reordering after that row is removed.
When you create a fixed width column and update that instead of deleting, there's no need for physical reordering across your huge table because the space consumed by a row and table itself is constant.
During off hours, a single DELETE
can be used to remove the unnecessary rows. This operation will still be slow but collectively much faster than deleting individual rows.
I had a similar scenario with a table with 2 million rows and a delete statement, which should delete around a 100 thousand rows - it took around 10 minutes to do so.
After I checked the configuration, I found that MySQL Server was running with default innodb_buffer_pool_size
= 8 MB (!).
After restart with innodb_buffer_pool_size
= 1.5GB, the same scenario took 10 sec.
So it looks like there is a dependency if "reordering of the table" can fit in buffer_pool or not.