Can I rely on reading SQL Server Identity values in order?
When inserting a row, is there a window of opportunity between the generation of a new Identity value and the locking of the corresponding row key in the clustered index, where an external observer could see a newer Identity value inserted by a concurrent transaction?
Yes.
The allocation of identity values is independent of the containing user transaction. This is one reason that identity values are consumed even if the transaction is rolled back. The increment operation itself is protected by a latch to prevent corruption, but that is the extent of the protections.
In the specific circumstances of your implementation, the identity allocation (a call to CMEDSeqGen::GenerateNewValue) is made before the user transaction for the insert is even made active (and so before any locks are taken).
By running two inserts concurrently with a debugger attached to allow me to freeze one thread just after the identity value is incremented and allocated, I was able to reproduce a scenario where:
- Session 1 acquires an identity value (3)
- Session 2 acquires an identity value (4)
- Session 2 performs its insert and commits (so row 4 is fully visible)
- Session 1 performs its insert and commits (row 3)
After step 3, a query using row_number under locking read committed returned the following:
In your implementation, this would result in Checkpoint ID 3 being skipped incorrectly.
The window of misopportunity is relatively small, but it exists. To give a more realistic scenario than having a debugger attached: An executing query thread can yield the scheduler after step 1 above. This allows a second thread to allocate an identity value, insert and commit, before the original thread resumes to perform its insert.
For clarity, there are no locks or other synchronization objects protecting the identity value after it is allocated and before it is used. For example, after step 1 above, a concurrent transaction can see the new identity value using T-SQL functions like IDENT_CURRENT before the row exists in the table (even uncommitted).
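A check along the following lines, run from another session, can make this visible. It is only a sketch: dbo.Checkpoints and its identity column ID are hypothetical names standing in for your own table.

```sql
-- Hypothetical table name; substitute your own.
-- IDENT_CURRENT reports the last identity value allocated for the table
-- by any session, whether or not a row with that value exists yet.
SELECT
    IDENT_CURRENT(N'dbo.Checkpoints') AS last_allocated,
    (SELECT MAX(ID)
     FROM dbo.Checkpoints WITH (READCOMMITTEDLOCK)) AS highest_visible;
```

If last_allocated is greater than highest_visible, a value has been handed out whose row is not there to be read: it may be in flight, rolled back, or deleted. The query cannot tell those cases apart.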
Fundamentally, there are no more guarantees around identity values than documented:
- Each new value is generated based on the current seed & increment.
- Each new value for a particular transaction is different from other concurrent transactions on the table.
That really is it.
If strict transactional FIFO processing is required, you likely have no choice but to serialize manually. If the application has less onerous requirements, you have more options. The question isn't 100% clear in that regard. Nevertheless, you may find some useful information in Remus Rusanu's article Using Tables as Queues.
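One possible way to serialize manually is an application lock taken at the start of every writer's transaction, so that identity allocation and commit happen in the same order for all sessions. The sketch below assumes a hypothetical table dbo.Checkpoints with a Payload column and a made-up lock resource name. It only works if every writer follows the same pattern.

```sql
-- Sketch: serialize writers with an application lock.
-- dbo.Checkpoints, Payload, and the resource name are hypothetical.
SET XACT_ABORT ON;
BEGIN TRANSACTION;

DECLARE @rc int;
EXEC @rc = sys.sp_getapplock
    @Resource    = N'Checkpoints_insert',
    @LockMode    = 'Exclusive',
    @LockOwner   = 'Transaction',
    @LockTimeout = 5000;  -- milliseconds

-- sp_getapplock returns 0 or 1 on success and a negative value on failure.
IF @rc < 0
BEGIN
    ROLLBACK TRANSACTION;
    THROW 50000, N'Could not acquire the insert lock.', 1;
END;

-- The identity value is allocated here, while the lock is held.
INSERT dbo.Checkpoints (Payload) VALUES (N'example');

COMMIT TRANSACTION;  -- the transaction-owned application lock is released here
```

The cost is that only one insert transaction can be in progress at a time, so this trades throughput for ordering.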
As Paul White correctly answered, identity rows can be temporarily "skipped". Here is a small piece of code to reproduce this case for yourself.
Create a database and a test table:
create database IdentityTest
go
use IdentityTest
go
create table dbo.IdentityTest (ID int identity, c1 char(10))
create clustered index CI_dbo_IdentityTest_ID on dbo.IdentityTest(ID)
Perform concurrent inserts and selects on this table in a C# console program:
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Threading;

namespace IdentityTest
{
    class Program
    {
        private const string ConnectionString =
            "Server=localhost;Database=IdentityTest;Integrated Security=SSPI;Application Name=IdentityTest";

        static void Main(string[] args)
        {
            var insertThreads = new List<Thread>();
            var selectThreads = new List<Thread>();

            // start threads for infinite inserts
            for (var i = 0; i < 100; i++)
            {
                insertThreads.Add(new Thread(InfiniteInsert));
                insertThreads[i].Start();
            }

            // start threads for infinite selects
            for (var i = 0; i < 10; i++)
            {
                selectThreads.Add(new Thread(InfiniteSelectAndCheck));
                selectThreads[i].Start();
            }
        }

        private static void InfiniteSelectAndCheck()
        {
            // infinite loop
            while (true)
            {
                // read top 2 IDs
                using (var connection = new SqlConnection(ConnectionString))
                using (var cmd = new SqlCommand("select top(2) ID from dbo.IdentityTest order by ID desc", connection))
                {
                    connection.Open();
                    using (var dr = cmd.ExecuteReader())
                    {
                        // read first row; try again if the table is still empty
                        if (!dr.Read()) continue;
                        var row1 = dr.GetInt32(0);

                        // read second row; try again if there is only one row so far
                        if (!dr.Read()) continue;
                        var row2 = dr.GetInt32(0);

                        // write a line if row1 and row2 are not consecutive
                        if (row1 - 1 != row2)
                        {
                            Console.WriteLine("row1=" + row1 + ", row2=" + row2);
                        }
                    }
                }
            }
        }

        private static void InfiniteInsert()
        {
            // infinite loop
            while (true)
            {
                using (var connection = new SqlConnection(ConnectionString))
                using (var cmd = new SqlCommand("insert into dbo.IdentityTest (c1) values('a')", connection))
                {
                    connection.Open();
                    cmd.ExecuteNonQuery();
                }
            }
        }
    }
}
The console program prints a line each time one of the reading threads "misses" an entry.
It is best to not expect the identities to be consecutive because there are many scenarios that can leave gaps. It is better to consider the identity like an abstract number and to not attach any business meaning to it.
Basically, gaps can happen if you roll back INSERT operations (or explicitly delete rows), and duplicates can occur if you set the table property IDENTITY_INSERT to ON.
Gaps can occur when:
- Records are deleted.
- An error has occurred when attempting to insert a new record (rolled back)
- An update/insert with explicit value (identity_insert option).
- Incremental value is more than 1.
- A transaction rolls back.
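The rollback case is easy to see with the dbo.IdentityTest table created earlier in this thread. A minimal sketch:

```sql
-- Uses the dbo.IdentityTest table from the repro script above.
BEGIN TRANSACTION;
INSERT dbo.IdentityTest (c1) VALUES ('a');  -- allocates an identity value
ROLLBACK TRANSACTION;                        -- the row is gone, the value is not given back

INSERT dbo.IdentityTest (c1) VALUES ('b');  -- receives the next value after the lost one

SELECT ID, c1 FROM dbo.IdentityTest ORDER BY ID;
```

The row with c1 = 'b' ends up one increment higher than it would have without the rolled-back insert, leaving a permanent gap.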
The identity property on a column has never guaranteed:
• Uniqueness
• Consecutive values within a transaction. If values must be consecutive then the transaction should use an exclusive lock on the table or use the SERIALIZABLE isolation level.
• Consecutive values after server restart.
• Reuse of values.
If you cannot use identity values because of this, create a separate table holding a current value and manage access to the table and number assignment with your application. This does have the potential of impacting performance.
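A rough sketch of such a table follows. The names dbo.NumberSource, Name, and LastValue are invented for illustration, and the allocation is assumed to run inside the same transaction as the insert that uses the number.

```sql
-- Hypothetical counter table; one row per sequence name.
CREATE TABLE dbo.NumberSource
(
    Name      sysname NOT NULL PRIMARY KEY,
    LastValue bigint  NOT NULL
);
INSERT dbo.NumberSource (Name, LastValue) VALUES (N'Checkpoint', 0);
GO

-- Allocate the next number inside the caller's transaction.
SET XACT_ABORT ON;
BEGIN TRANSACTION;

DECLARE @next bigint;

-- Increment and read in one statement; the row's exclusive lock
-- is held until the transaction ends, so other allocators wait.
UPDATE dbo.NumberSource
SET @next = LastValue = LastValue + 1
WHERE Name = N'Checkpoint';

-- ... use @next in the application's insert here ...

COMMIT TRANSACTION;
```

Because the counter row stays locked until commit, numbers are committed in the order they are handed out, and a rollback gives the number back. That same lock is the performance impact mentioned above: concurrent writers queue behind one another.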
https://msdn.microsoft.com/en-us/library/ms186775(v=sql.105).aspx
https://msdn.microsoft.com/en-us/library/ms186775(v=sql.110).aspx