As there is no unsigned int in SQL Server doesn't an Identity Seed of -2,147,483,648 make more sense for large tables?
There is nothing at all wrong with starting at -2,147,483,648 as far as SQL Server is concerned. Starting at 2,147,483,647 and counting backwards with IDENTITY(2147483647,-1)
is perfectly valid too.
Things that would make me be wary of doing so:
- It might confuse people who don't expect to see negative values in such positions. It is unusual enough that it could easily look like bug. If nothing else you might get sick of explaining it.
- If you ever pass the IDs to other services for any reason they may fail input validation do to that code not expecting to see negatives (an external system shouldn't care about your internal IDs like this, but there are many things that happen which shouldn't!). Furthermore, code in other layers of your application might use an unsigned int32 making the "down from int.max" option even more dangerous as you'll not hit the problem until dropping below 0.
- People sometimes use negative magic numbers to have special meaning, and this could cause collisions which lead to difficult to explain bugs. For instance shortening
WHERE (x.a<>y.a OR x.a IS NULL AND y.a IS NOT NULL OR x.a IS NOT NULL AND y.a IS NULL)
toWHERE ISNULL(x.a,-1)<>ISNULL(x.a,-2)
is something I've seen numerous times. Not likely for your PK as that'll never beNULL
but might happen in comparing FK values in other tables. I've even seen someone usesomevalue>-1
in place ofsomevalue IS NOT NULL
(apparently there is a circumstance where that is more efficient, though he never explained to my satisfaction what the circumstance might be!). There might be that and other odd shenanigans in code outside the DB too. - Most importantly: scale, particularly unexpected scale. Software and data that doesn't die early often outlives their creators' vision, and grows more over time accordingly. Unless you are very constrained by storage or RAM (perhaps in an embedded system) then plan for at least one order of magnitude higher than you expect.
If I expect hitting ~400,000,000 is ever going to be likely in the lifetime of the data, far before the difference between 2 or 4 000,000,000 is a consideration, I'm already going to go for a larger key so doubling up by using negative values isn't going to make much difference.
You don't want to be making a change to something core like a PK in four years time, but you even less want to do it in eight years time. In both cases if the design lasts that long you've long forgotten details and many other bits & bobs have started to depend on the key so the changes needing to be made grow massively, and even that single table is going to be massive work to change (unless most of the data gets deleted after a time) as it contains that many rows, then you have all those that refer to it with FKs to work on also.
No, you're not missing anything. Identities are meant to be meaningless internal-only surrogates which are for the computer to allocate and deal with, and the computer doesn't care whether you use positive or negative numbers.
However.. it's never quite that simple. There will always be a human somewhere trying to read these things, comprehend them, and reconcile them. We just find it harder to think about massive negative numbers than about massive positive ones. Don't make it harder on people trying to make your system work. Eventually these will leak - into reports, external references, screens. Try telling customer minus two billion they have to use a phone key pad to enter their customer ID to access their account!
If you think there is any chance of overflowing the identity in any likely future just use the larger datatype straight away. The considerations that drove Y2K problems are past. Disk is cheap. The extra memory used per query can be balanced against the peace of mind knowing the application is designed for a 50 year lifespan. You won't have to implement extra monitoring against the day 4 (or 8!) years from now when the last integer is allocated.
I know of one system which overflowed and the identities were decremented by 2.1Bn, effectively re-starting at int.Min. I've seen another use negative identities but fail because logging cast them to varchar(10) truncating the minus sign. I know another where the identity was defined as numeric(18,0), just to be sure. And I've seen another overflow without a plan in place, bringing the system down for some time. Because when you do reach int.max you have, by definition, four billion rows to deal with, and that doesn't make for a fun weekend.