Storing NULL versus storing '' in a varchar column
Let's create three tables with a varchar column, two of them allowing NULL, one not.
CREATE TABLE dbo.x1(id int IDENTITY(1,1) PRIMARY KEY, field varchar(5) null);
CREATE TABLE dbo.x2(id int IDENTITY(1,1) PRIMARY KEY, field varchar(5) null);
CREATE TABLE dbo.x3(id int IDENTITY(1,1) PRIMARY KEY, field varchar(5) not null);
Populate them with 1,000,000 rows:
;WITH x(x) AS (SELECT 0 UNION ALL SELECT x+1 FROM x WHERE x < 1000000)
INSERT dbo.x1(field) SELECT NULL FROM x OPTION (MAXRECURSION 0);
INSERT dbo.x2(field) SELECT '' FROM dbo.x1;
INSERT dbo.x3(field) SELECT '' FROM dbo.x1;
Let's check the size:
SELECT COUNT(*)*8192/1024. FROM sys.dm_db_database_page_allocations(DB_ID(),
OBJECT_ID(N'dbo.x1'), 1, NULL, 'DETAILED');
SELECT COUNT(*)*8192/1024. FROM sys.dm_db_database_page_allocations(DB_ID(),
OBJECT_ID(N'dbo.x2'), 1, NULL, 'DETAILED');
SELECT COUNT(*)*8192/1024. FROM sys.dm_db_database_page_allocations(DB_ID(),
OBJECT_ID(N'dbo.x3'), 1, NULL, 'DETAILED');
Results:
12,928 KB
12,936 KB
12,936 KB
So it looks like for 1,000,000 rows, choosing NULL
over ''
saves a whopping 8 KB (and this isn't even reflected in sp_spaceused
, because that one page you saved is still reserved, just not allocated).
Repeated for a heap (again, have to do multiple tests since we're guessing about your actual table structure):
12,872 KB
12,872 KB
12,928 KB
So, negligible, as I suggested, even extrapolating over 120,000,000 rows, the biggest possible difference (once more, depending on your schema) would be 960KB on a proper table, and 6.7MB on a heap. If your server is so tight on disk space that 6.7MB is going to drive decisions, you might consider how much an additional disk would cost when compared to the time you're spending investigating this.
IMHO, there are far more important reasons between deciding to use NULLs or not to represent "no data." A good question with lots of opinions and commentary is here:
- Why shouldn't we allow NULLs?
See this article, which explains how SQL stores NULLs.
Basically, a variable width column (varchar) stores a bitmap that indicates null or not null. If it's null, then zero bytes are allocated for the varchar field and the bit gets flipped.
For fixed width columns (char), the entire field is still allocated, with no data stored in it. So a 10-yte char field will allocate 10 bytes, NULL or not.
That article does an insert with data, with NULL, and with empty string. It then polls the page size to see what's going on internally.
For both Null and Empty string, 0 bytes are allocated for varchar fields.
For at least the standard (FixedVar) record format it does not make any difference to how much space the table consumes (it might make a marginal difference to indexes as discussed later).
Both a null varchar
and an empty string are stored in exactly the same way. The only way they are distinguished is whether there was a 1
or 0
in the null bitmap. They both take zero length in the variable length column data section and both can also avoid taking up two bytes in the variable column offset array if they are not followed by any columns that do contain data.
One of the comments says
For the table where NULLs are not allowed, you'd want to define the column as NOT NULL, for a number of reasons, not least of which is that removes the requirement for a null bitmap for that columnn
This is not true for datapages. See Myth #6b: The null bitmap only contains bits for nullable columns.
There is a slight difference for indexes in that if all the columns participating in an index are not nullable the null bitmap is omitted.
However the difference is negligible and you should be choosing the option that gives you the desired semantics.