Will altering an nvarchar(max) column to nvarchar(50) lock the table?
There's no way for us to fully answer your question. Depends on way too many factors, including the table definition, other columns in the table, if the data is off page, the T-SQL generated by your ORM, and so on. The documentation is pretty good though:
WITH ( ONLINE = ON | OFF) Applies to: SQL Server 2016 through SQL Server 2017 and Azure SQL Database.
Allows many alter column actions to be performed while the table remains available. Default is OFF. Alter column can be performed online for column changes related to data type, column length or precision, nullability, sparseness, and collation.
Online alter column allows user created and auto statistics to reference the altered column for the duration of the ALTER COLUMN operation. This allows queries to perform as usual. At the end of the operation, auto-stats that reference the column are dropped and user-created stats are invalidated. The user must manually update user-generated statistics after the operation is completed. If the column is part of a filter expression for any statistics or indexes then you cannot perform an alter column operation.
While the online alter column operation is running, all operations that could take a dependency on the column (index, views, etc.) will block or fail with an appropriate error. This guarantees that online alter column will not fail because of dependencies introduced while the operation was running.
Online alter column has similar requirements, restrictions, and functionality as online index rebuild. This includes:
Online index rebuild is not supported when the table contains legacy LOB or filestream columns or when the table has a columnstore index. The same limitations apply for online alter column.
An existing column being altered requires twice the space allocation: one for the original column and one for the newly created hidden column.
The locking strategy during an alter column online operation follows the same locking pattern used for online index build.
In addition it's also pretty straightforward to test. I'm on SQL Server 2017 testing with the default read committed isolation level. First I'll create the table:
DROP TABLE IF EXISTS dbo.CONVERT_ME;
CREATE TABLE dbo.CONVERT_ME (
ID BIGINT NOT NULL,
OVERWEIGHT_COLUMN VARCHAR(MAX) NOT NULL,
PRIMARY KEY (ID)
);
INSERT INTO dbo.CONVERT_ME WITH (TABLOCK)
SELECT TOP (1500000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), REPLICATE('Z', 50)
FROM master..spt_values t1
CROSS JOIN master..spt_values t2
OPTION (MAXDOP 1);
Then I'll alter the column OFFLINE, which is the default:
SET STATISTICS IO, TIME ON;
BEGIN TRANSACTION;
ALTER TABLE dbo.CONVERT_ME
ALTER COLUMN OVERWEIGHT_COLUMN VARCHAR(50) NOT NULL WITH (ONLINE = OFF);
On my machine this takes about three seconds. The operation takes a schema modification lock for the entire transaction, which prevents any other lock from being taken on the table, including queries with NOLOCK.
Now I'll try it with WITH (ONLINE = ON). The operation is now eligible for parallelism, so it may run faster on your server. A schema modification lock is still taken on the object, but it occurs near the end instead of throughout the whole transaction. Here's an example snapshot of the lock table showing the schema modification locks held during most of the transaction:
Note that an object level schema modification lock has not yet been taken. Here is what it looks like after the operation has finished but before the transaction has committed:
A schema modification lock on the object, along with a few others, are taken briefly near the end.
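The screenshots of the lock table aren't reproduced here, but you can take equivalent snapshots yourself. A minimal sketch, run from a second session while the ALTER COLUMN transaction is open (sys.dm_tran_locks is a standard DMV; for OBJECT-type rows, resource_associated_entity_id equals the table's object_id):

```sql
-- Run in a separate session while the ALTER COLUMN transaction is open.
-- Watch for an object-level SCH-M (schema modification) lock appearing
-- near the end of the online operation.
SELECT request_session_id,
       resource_type,                   -- OBJECT, PAGE, KEY, ...
       request_mode,                    -- look for SCH-M here
       request_status,                  -- GRANT or WAIT
       resource_associated_entity_id    -- for OBJECT rows, the object_id
FROM sys.dm_tran_locks
WHERE resource_database_id = DB_ID()
ORDER BY request_session_id, resource_type, request_mode;
```

Filtering OBJECT rows on OBJECT_ID(N'dbo.CONVERT_ME') narrows the output to the table in question.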
So great, there should be no problem if you use WITH (ONLINE = ON), right? As I said before, the time the operation takes is going to depend on a lot of factors. Consider a different table which has many more data pages:
DROP TABLE IF EXISTS dbo.CONVERT_ME;
CREATE TABLE dbo.CONVERT_ME (
ID BIGINT NOT NULL,
OTHER_COLUMN VARCHAR(8000) NOT NULL,
OVERWEIGHT_COLUMN VARCHAR(MAX) NOT NULL,
PRIMARY KEY (ID)
);
INSERT INTO dbo.CONVERT_ME WITH (TABLOCK)
SELECT TOP (1500000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), REPLICATE('Z', 4200), REPLICATE('Z', 50)
FROM master..spt_values t1
CROSS JOIN master..spt_values t2
OPTION (MAXDOP 1);
On my machine the same ALTER TABLE now takes 39635 ms when run OFFLINE and 117877 ms when run ONLINE.
My advice to you is to read the documentation until you feel confident that you understand the locking considerations and to test with your real table's schema populated with real data before doing the switch in production.
PLEASE NOTE: I'm not sure about Azure SQL Database, but in on-premises SQL Server the ONLINE option is only available in Enterprise Edition. Attempting this in another edition will result in the following error:
Msg 1712, Level 16, State 1, Line XXXXX
Online index operations can only be performed in Enterprise edition of SQL Server.
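You can check which edition you're on before attempting the online operation (SERVERPROPERTY is a standard built-in):

```sql
-- Edition returns e.g. 'Enterprise Edition (64-bit)'.
-- EngineEdition: 2 = Standard, 3 = Enterprise, 5 = Azure SQL Database.
SELECT SERVERPROPERTY('Edition')       AS Edition,
       SERVERPROPERTY('EngineEdition') AS EngineEdition;
```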
It will lock the table; the question you've really asked is "for how long?", and the answer to that is (as always) "it depends". What else is running at the time? How fast can Azure (or any server) update the rows? The only way to know is to test against a server with the same configuration, running the same load you'll be running at the time, and even then that will only give you an educated guess.
As an alternative, I have had success before with adding a new column to the table (ALTER TABLE ... ADD new_column VARCHAR(50) NULL), then copying the data across. This could be a single UPDATE statement (although then you'll possibly hit locking problems again), or you could batch it; you could even write an app that processes one record at a time. Once everything is copied across, drop the old column and rename the new one.
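A sketch of that approach, using a batched copy to keep lock durations short. The column name NEW_COLUMN and the batch size are illustrative; sp_rename and the WHILE-loop batching pattern are standard T-SQL, but test against your own schema first:

```sql
-- 1. Add the replacement column (nullable, so the ADD is metadata-only).
ALTER TABLE dbo.CONVERT_ME ADD NEW_COLUMN VARCHAR(50) NULL;
GO

-- 2. Copy the data across in small batches. This assumes the source
--    column is NOT NULL, so NEW_COLUMN IS NULL identifies uncopied rows.
DECLARE @rows INT = 1;
WHILE @rows > 0
BEGIN
    UPDATE TOP (5000) dbo.CONVERT_ME
    SET NEW_COLUMN = CONVERT(VARCHAR(50), OVERWEIGHT_COLUMN)
    WHERE NEW_COLUMN IS NULL;

    SET @rows = @@ROWCOUNT;
END;
GO

-- 3. Swap the columns once everything is copied.
ALTER TABLE dbo.CONVERT_ME DROP COLUMN OVERWEIGHT_COLUMN;
EXEC sp_rename 'dbo.CONVERT_ME.NEW_COLUMN', 'OVERWEIGHT_COLUMN', 'COLUMN';
```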
The risk is that the data changes between copying it and dropping the old column. You could mitigate this with a trigger (I'm not sure about triggers in Azure) to keep the data in sync between the two columns.
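A sketch of such a trigger, assuming the replacement column described above is named NEW_COLUMN (a hypothetical name; adjust to your schema). It mirrors writes into the new column while the backfill runs:

```sql
-- Keeps NEW_COLUMN in sync with OVERWEIGHT_COLUMN for rows that are
-- inserted or updated while the copy is in progress. Direct recursion
-- does not occur unless RECURSIVE_TRIGGERS is ON (it is OFF by default).
CREATE TRIGGER dbo.TR_CONVERT_ME_SYNC
ON dbo.CONVERT_ME
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE t
    SET NEW_COLUMN = CONVERT(VARCHAR(50), i.OVERWEIGHT_COLUMN)
    FROM dbo.CONVERT_ME AS t
    JOIN inserted AS i ON i.ID = t.ID;
END;
```

Drop the trigger after the old column is dropped and the new one renamed.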