Should I convert the frontend generated UUIDv6 to binary(16) for use in SQL Server as the clustered primary key?
I would consider another option: cluster on a surrogate key (like Id int IDENTITY(1,1) NOT NULL), and make the app-generated UUID a nonclustered primary key.
This avoids the problems you called out with options 1 and 2, because you don't have to worry about the sorting / fragmentation issues in your base table, or potentially esoteric issues with clustering on binary.
It will also save you some space (int is smaller than uniqueidentifier and binary(16)), as the clustering key is included in every nonclustered index (as well as in other tables that reference this table via foreign key).
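A minimal sketch of that layout, using a hypothetical Orders table (all names are placeholders, and the UUID column is shown as uniqueidentifier, though binary(16) would fit this shape the same way):

```sql
CREATE TABLE dbo.Orders
(
    -- Narrow, ever-increasing surrogate key: clustering on it keeps inserts
    -- appending to the end of the table, avoiding fragmentation from
    -- randomly ordered keys.
    Id       int IDENTITY(1,1) NOT NULL,

    -- App-generated UUIDv6: still the primary key, but nonclustered, so its
    -- insert order doesn't affect the base table layout.
    PublicId uniqueidentifier NOT NULL,

    OrderRef nvarchar(50) NULL,  -- placeholder payload column

    CONSTRAINT PK_Orders PRIMARY KEY NONCLUSTERED (PublicId),
    CONSTRAINT UQ_Orders_Id UNIQUE CLUSTERED (Id)
);
```

With this shape, any nonclustered indexes on the table carry the 4-byte Id as their row locator rather than a 16-byte value, and other tables can reference either key via foreign keys.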
I've worked with BINARY data types in SQL Server, but can't remember if I ever created a table with a clustered index on a BINARY column. With that said, the only thing I know to watch out for is this note in the docs:
Conversions between any data type and the binary data types are not guaranteed to be the same between versions of SQL Server.
If you do go with Option 2, I recommend doing all conversions and manipulation of the BINARY value outside of SQL Server in application code.
reorder the bytes so that the most/least significant bytes are where SQL Server expects them to be
Do that.
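For background on why that reordering matters, here's a quick illustration (not specific to your schema): SQL Server compares uniqueidentifier values by byte groups, weighting the trailing group most heavily, which is not the simple left-to-right byte comparison you get with binary(16).

```sql
DECLARE @ids TABLE (Id uniqueidentifier NOT NULL);

INSERT INTO @ids (Id) VALUES
    ('00000000-0000-0000-0000-000000000001'),  -- nonzero byte in the trailing group
    ('00000000-0000-0000-0001-000000000000'),  -- nonzero byte in the fourth group
    ('01000000-0000-0000-0000-000000000000');  -- nonzero byte in the first group

-- Returns the rows in this order:
--   01000000-0000-0000-0000-000000000000
--   00000000-0000-0000-0001-000000000000
--   00000000-0000-0000-0000-000000000001
-- i.e. the trailing byte group dominates the comparison, so values that
-- ascend in their textual/byte form don't necessarily ascend as a
-- uniqueidentifier unless the significant bytes are moved to where
-- SQL Server weights them.
SELECT Id FROM @ids ORDER BY Id;
```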
Option 1 is what SQL Server does natively with NEWSEQUENTIALID(). As Wikipedia puts it:
NEWSEQUENTIALID function returns 128-bit identifiers similar to UUIDs which are committed to ascend in sequence until the next system reboot
In SQL Server a UNIQUEIDENTIFIER is just a 128-bit binary type. It's not required to conform to the structure of a UUID.
And if it's sequentially generated* you can make it the clustered index key. Having a narrower clustered index key is normally not worth the cost of an extra, unnecessary index.
*It's not a big deal if the sequential values occasionally land out of place in the sort order, or if they're generated in a few different places by a few different application servers.
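A sketch of that option, using a hypothetical Events table; NEWSEQUENTIALID() is shown as a server-side default, but an app-supplied sequential value (e.g. a byte-reordered UUIDv6) inserted explicitly into the column works the same way:

```sql
CREATE TABLE dbo.Events
(
    -- Sequential GUID as the clustered primary key. The DEFAULT covers rows
    -- where the application doesn't supply a value; app-generated sequential
    -- UUIDs can simply be inserted into this column instead.
    EventId uniqueidentifier NOT NULL
        CONSTRAINT DF_Events_EventId DEFAULT NEWSEQUENTIALID(),

    Payload nvarchar(100) NULL,  -- placeholder column

    CONSTRAINT PK_Events PRIMARY KEY CLUSTERED (EventId)
);
```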