Identity column re-seed: when it is necessary?
That is, when table row is deleted, it's PK must be reused in subsequent inserts.
What universe is your lecturer from??
That is grossly inefficient. If you try to do that, you will cut your performance prospects down by a factor of 10.
If you need gapless numbers for auditing reasons, build them explicitly, not directly from database tools. And never delete rows, but flag them as "deleted". This will add to the messiness of queries, since they will have to ignore such rows.
In MySQL, InnoDB requires the existence of a unique PRIMARY KEY
for each table. But that is the extent of the requirement. The key can even be a string.
Gaps are a convenience to the users and DBAs, not an inconvenience.
I can think of one case where gapless would be convenient -- chunking into groups of 100 rows at a time. But there is a simple workaround using LIMIT 100,1
.
Gaps have zero impact on performance. That includes non-numeric indexes. And non-unique indexes. And composite indexes.
Sure, you can run out of ids. I think I have seen it happen twice in nearly 2 decades of using MySQL. I may as well worry about being struck by an asteroid. It's low on my things-that-keep-me-awake-at-night list.
Gaps occur from (at least): INSERT IGNORE
, IODKU
, REPLACE
, DELETE
, ROLLBACK
(explicit, or due to crash), Multi-master replication (including Galera and Group Replication). Do you really want to come up with workarounds for those?!
Feel free to have us sanity-check anything else that lecturer says that is suspicious.
Reusing an identity value, should in general be discouraged. Either the value is used entirely internally, in which case it’s actual value is immaterial, or it is also used externally in which case reusing the value is very likely going to lead to a misidentification.
Take the obvious case of an invoice or purchase order number, these might easily come from an identity column and be exposed externally, but you would never want to reuse them for precisely that reason. Both refer to specific transactions which you would not want to get confused.
Resolving such issues can be a big hassle when companies merge or are acquired. Creating such problems on purpose? Not wise.
The reuse of PK id values has problems and generally should be avoided.
First, the implementation of auto_increment columns doesn't provide the guarantee of being gapless. Indeed gaps will occur if you rollback an insert on an auto increment column.
Secondly the gap ID may refer to existing data that hasn't been deleted (due to missing FK constrains). If they translate to member numbers communicated outside the system then that poses potential business identity risks.
Thirdly, bigint unsigned
won't run out of IDs for a significant time even given an extremely large insert rate.
The biggest pain with gaps is coming across auditors who insist its an audit flaw. For DBAs they know gaps exist and why.