What is the best identifier for a UserID? 64-bit integers, UUID v5, or a 64-character SHA-256 UID?
Use integers. This is the simplest and most efficient way (especially for InnoDB).
That is:
- don't use hashes or UUIDs or varchars in place of int
- don't use the natural key when you have foreign keys to it
Reasons?
- varchar comparisons require collation + case tests
- index architecture in InnoDB favours integer fields (ideal clustered index is narrow, numeric, monotonically increasing)
- hashes and GUIDs and varchar cause fragmentation
- wider data and index rows when not using integers
- 64 characters require 64 to 66 bytes to store; a 64-bit int requires 8 bytes
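To put rough numbers on the size argument above, here is a minimal sketch in Python. The user number and row count are made up for illustration, and the totals count only the raw key bytes, ignoring any per-row or per-page overhead the storage engine adds.

```python
import hashlib
import struct

# Hypothetical user number, used only for illustration.
user_number = 123456789

# A SHA-256 digest rendered as hex text is 64 characters long.
sha_hex = hashlib.sha256(str(user_number).encode("ascii")).hexdigest()

# The same user number packed as an unsigned 64-bit integer is 8 bytes.
int_key = struct.pack(">Q", user_number)


def raw_key_bytes(key_size, n_rows):
    """Raw bytes spent on the key alone, ignoring engine overhead."""
    return key_size * n_rows


rows = 10_000_000  # assumed table size
hex_total = raw_key_bytes(len(sha_hex), rows)
int_total = raw_key_bytes(len(int_key), rows)
ratio = hex_total // int_total  # how many times wider the hex key is
```

Under these assumptions the hex key is 8 times wider than the integer key before any secondary index is counted.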
Other observations:
- Do you need 64 bit numbers?
UUIDs/GUIDs are desirable when you want multiple, independent, computers creating Globally Unique IDs.
But they suck as a PRIMARY KEY.
They are big -- 36 characters, worse if you default to utf8, which is overkill. The hex in them can be converted to BINARY(16). In InnoDB, the PK is appended to every secondary key, thereby multiplying the space cost of a bulky PK.
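As a rough illustration of the two points above (text form versus BINARY(16), and the PK being repeated in every secondary index), here is a small Python sketch. The name passed to `uuid5` and the helper `pk_copy_bytes` are hypothetical, and the estimate counts only the PK bytes themselves, not InnoDB's record or page overhead.

```python
import uuid

# Hypothetical name hashed into a version 5 UUID, for illustration only.
user_uuid = uuid.uuid5(uuid.NAMESPACE_DNS, "user.example.com")

as_text = str(user_uuid)      # 32 hex digits plus 4 hyphens = 36 characters
as_binary = user_uuid.bytes   # 16 raw bytes, the size of a BINARY(16) value

# The binary form loses nothing: the UUID can be rebuilt from it.
round_trip = uuid.UUID(bytes=as_binary)


def pk_copy_bytes(pk_size, n_rows, n_secondary_indexes):
    """Estimate bytes spent on PK values alone: one copy in the clustered
    index plus one copy per row in each secondary index."""
    return pk_size * n_rows * (1 + n_secondary_indexes)


rows, indexes = 1_000_000, 3  # assumed table shape
text_cost = pk_copy_bytes(len(as_text), rows, indexes)
binary_cost = pk_copy_bytes(len(as_binary), rows, indexes)
bigint_cost = pk_copy_bytes(8, rows, indexes)
```

With these assumed numbers, the text form spends 144 MB on PK copies, BINARY(16) spends 64 MB, and an 8-byte BIGINT spends 32 MB.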
They are random -- Once you have more data than you can cache, you are usually hitting the disk for every fetch. On a spinning disk, this drops your performance to roughly 100 operations/sec. Just when you need scalability, it is taken away from you!
If all writes are done to a single Master, then use AUTO_INCREMENT: MEDIUMINT UNSIGNED up to 16M, INT UNSIGNED up to 4G, or BIGINT for the insanely optimistic.
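The limits quoted above follow directly from the storage width of each type. A minimal Python sketch, assuming the unsigned variant of each type (including BIGINT, which the text above does not specify) and a hypothetical helper `smallest_type_for`:

```python
# Storage width in bytes of each MySQL integer type mentioned above.
# Treating BIGINT as UNSIGNED is an assumption made for this sketch.
UNSIGNED_WIDTHS = {
    "MEDIUMINT UNSIGNED": 3,
    "INT UNSIGNED": 4,
    "BIGINT UNSIGNED": 8,
}


def max_unsigned(n_bytes):
    """Largest value an unsigned integer of n_bytes can hold."""
    return 2 ** (8 * n_bytes) - 1


def smallest_type_for(expected_rows):
    """Pick the narrowest listed type whose range covers expected_rows."""
    # Dicts keep insertion order, so types are tried narrowest first.
    for name, n_bytes in UNSIGNED_WIDTHS.items():
        if expected_rows <= max_unsigned(n_bytes):
            return name
    raise ValueError("expected_rows exceeds the BIGINT UNSIGNED range")


mediumint_max = max_unsigned(3)   # 16,777,215 (the "16M" above)
int_max = max_unsigned(4)         # 4,294,967,295 (the "4G" above)
choice = smallest_type_for(20_000_000)
```

Here `choice` comes out as `INT UNSIGNED`, since 20 million rows no longer fit in a MEDIUMINT UNSIGNED.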
As for the CPU performance of VARCHAR, BINARY, INT, etc -- that's not the issue. The issue is the bulkiness of the key, because that determines how much of the INDEX can be cached, and once the index no longer fits in cache, I/O becomes the bottleneck. If everything can be cached, don't worry about the key type and size. (OK, the datatypes in a JOIN must be the same, or close enough. Different collation, for example, destroys the usability of an index.)