MySQL - fastest way to ALTER TABLE for InnoDB
One sure way to speed up an ALTER TABLE is to remove unnecessary indexes
Here are the initial steps to load a new version of the table
CREATE TABLE s_relations_new LIKE s_relations;
#
# Drop Duplicate Indexes
#
ALTER TABLE s_relations_new
DROP INDEX source_persona_index,
DROP INDEX target_persona_index,
DROP INDEX target_persona_relation_type_index
;
Please note the following:
I dropped source_persona_index because it is the first column in 4 other indexes
- unique_target_persona
- unique_target_object
- source_and_target_object_index
- source_target_persona_index
I dropped target_persona_index because it is the first column in 2 other indexes
- target_persona_relation_type_index
- target_persona_relation_type_message_id_index
I dropped target_persona_relation_type_index because the first 2 columns are also in target_persona_relation_type_message_id_index
OK That takes care of unnecessary indexes. Are there any indexes that have low cardinality? Here is the way to determine that:
Run the following queries:
SELECT COUNT(DISTINCT sent_at) FROM s_relations;
SELECT COUNT(DISTINCT message_id) FROM s_relations;
SELECT COUNT(DISTINCT target_object_id) FROM s_relations;
According to your question, there are about 80,000,000 rows. As a rule of thumb, the MySQL Query Optimizer will not use an index if the cardinality of the selected columns is greater that 5% of the table row count. In this case, that would be 4,000,000.
- If
COUNT(DISTINCT sent_at)
> 4,000,000- then
ALTER TABLE s_relations_new DROP INDEX sent_at_index;
- then
- If
COUNT(DISTINCT message_id)
> 4,000,000- then
ALTER TABLE s_relations_new DROP INDEX message_id_index;
- then
- If
COUNT(DISTINCT target_object_id)
> 4,000,000- then
ALTER TABLE s_relations_new DROP INDEX target_object_index;
- then
Once the usefulness or uselessness of those indexes have been determined, you can reload the data
#
# Change the Column Name
# Load the Table
#
ALTER TABLE s_relations_new CHANGE sent_at sent_at_new int(11) DEFAULT NULL;
INSERT INTO s_relations_new SELECT * FROM s_relations;
That's it, right? NOPE !!!
If your website has been up this whole time, there may have INSERTs running against s_relations during the loading of s_relations_new. How can you retrieve those missing rows?
Go find the maximum id in s_relations_new and append everything after that ID from s_relations. To assure that the table is frozen and used only for this update, you must have a little downtime for the sake of getting those last rows that were inserted into s_relation_new. Here is what you do:
In the OS, restart mysql so that no one else can log in but root@localhost (disables TCP/IP):
$ service mysql restart --skip-networking
Next, login to mysql and load those last rows:
mysql> SELECT MAX(id) INTO @maxidnew FROM s_relations_new;
mysql> INSERT INTO s_relations_new SELECT * FROM s_relations WHERE id > @maxidnew;
mysql> ALTER TABLE s_relations RENAME s_relations_old;
mysql> ALTER TABLE s_relations_new RENAME s_relations;
Then, restart mysql normally
$ service mysql restart
Now, if you cannot take mysql down, you will have to do a bait-and-switch on s_relations. Just login to mysql and do the following:
mysql> ALTER TABLE s_relations RENAME s_relations_old;
mysql> SELECT MAX(id) INTO @maxidnew FROM s_relations_new;
mysql> INSERT INTO s_relations_new SELECT * FROM s_relations_old WHERE id > @maxidnew;
mysql> ALTER TABLE s_relations_new RENAME s_relations;
Give it a Try !!!
CAVEAT : Once you are satisfied with this operation, you can drop the old table at your earliest convenience:
mysql> DROP TABLE s_relations_old;
The correct answer depends on the version of the MySQL engine you're using.
If using 5.6+, renames and adding/removing indices are performed online, i.e. without copying all the table's data.
Just use ALTER TABLE
as usual, it'll be mostly instant for renames and index drops, and reasonably fast for index addition (as fast as reading all the table once).
If using 5.1+, and the InnoDB plugin is enabled, adding/removing indices will be online as well. Not sure about renames.
If using older version, ALTER TABLE
is still the fastest—but will probably be horribly slow because all of your data will be re-inserted to a temporary table under the hood.
Finally, time for myth debunking. Unfortunately I don't have enough karma here to comment on answers, but I feel it's important to correct the most voted answer. This is wrong:
As a rule of thumb, the MySQL Query Optimizer will not use an index if the cardinality of the selected columns is greater that 5% of the table row count
It's actually the other way around.
Indices are useful to select few rows, so it's important they have high cardinality, which means many distinct values and statistically few rows with the same value.
I had the same problem with Maria DB 10.1.12, then after reading the documentation I found that there is an option to perform the operation "in-place" which eliminates the copy of the table. With this option the alter table is very fast. In my case it was:
alter table user add column (resettoken varchar(256),
resettoken_date date, resettoken_count int), algorithm=inplace;
this is very fast. Without the algorithm option it would never terminate.
https://mariadb.com/kb/en/mariadb/alter-table/