Prevent InnoDB auto increment ON DUPLICATE KEY

You could set the innodb_autoinc_lock_mode config option to "0" for "traditional" auto-increment lock mode, which guarantees that all INSERT statements will assign consecutive values for AUTO_INCREMENT columns.

That said, you shouldn't depend on the auto-increment IDs being consecutive in your application. Their purpose is to provide unique identifiers.


This behavior is easily seen below with the default setting for innodb_autoinc_lock_mode = 1 (“consecutive” lock mode). Please also reference the fine manual page entitled AUTO_INCREMENT Handling in InnoDB. Changing this value will lower concurrency and performance with the setting = 0 for “Tranditional” lock mode as it uses a table-level AUTO-INC lock.

That said, the below is with the default setting = 1.

I am about to show you four examples of how easy it is to create gaps.

Example 1:

create table x
(   id int auto_increment primary key,
    someOtherUniqueKey varchar(50) not null,
    touched int not null,
    unique key(someOtherUniqueKey)
);
insert x(touched,someOtherUniqueKey) values (1,'dog') on duplicate key update touched=touched+1;
insert x(touched,someOtherUniqueKey) values (1,'dog') on duplicate key update touched=touched+1;
insert x(touched,someOtherUniqueKey) values (1,'cat') on duplicate key update touched=touched+1;

select * from x; 
+----+--------------------+---------+
| id | someOtherUniqueKey | touched |
+----+--------------------+---------+
|  1 | dog                |       2 |
|  3 | cat                |       1 |
+----+--------------------+---------+

The Gap (id=2 is skipped) is due to one of a handful of operations and quirks and nervous twitches of the INNODB engine. In its default high performance mode of concurrency, it performs range gap allocations for various queries sent to it. One had better have good reasons to change this setting, because doing so impacts performance. The sorts of things later versions of MySQL delivers to you, and you turn off due to Hyper Focusing on gaps in printout sheets (and bosses that say "Why do we have gaps").

In the case of an Insert on Duplicate Key Update (IODKU), it is assuming 1 new row and allocates a slot for it. Remember, concurrency, and your peers doing the same operations, perhaps hundreds concurrently. When the IODKU turns into an Update, well, there goes the use of that abandoned and never inserted row with id=2 for your connection and anyone else.

Example 2:

The same happens during Insert ... Select From as seen in This Answer of mine. In it I purposely use MyISAM due to reporting on counts, min, max, otherwise the range gap quirk would allocate and not fill all. And the numbers would look weird as that answer dealt with actual numbers. So the older engine (MyISAM) worked fine for tight non-gaps. Note that in that answer I was trying to do something fast and safe and that table could be converted to INNODB with ALTER TABLE after the fact. Had I done that example in INNODB to begin with, there would have been plenty of gaps (in the default mode). The reason the Insert ... Select From would have creates gaps in that Answer had I used INNODB was due to the uncertainty of the count, the mechanism that the engine chooses for safe (uncertain) range allocations. The INNODB engine knows the operation naturally, knows in has to create a safe pool of AUTO_INCREMENT id's, has concurrency (other users to think about), and gaps flourish. It's a fact. Try example 2 with the INNODB engine and see what you come up with for min, max, and count. Max won't equal count.

Examples 3 and 4:

There are various situations that cause INNODB Gaps documented on the Percona website as they stumble into more and document them. For instance, it occurs during failed inserts due to Foreign Key constraints seen in this 1452 Error image. Or a Primary Key error in this 1062 Error image.

Remember that the INNODB Gaps are there as a side-effect of system performance and a safe engine. Is that something one really wants to turn-off (Performance, Higher user statisfaction, higher concurrency, lack of table locks), for the sake of tighter id ranges? Ranges that have holes on deletes anyway. I would suggest not for my implementations, and the default with Performance is just fine.