What are different ways to replace ISNULL() in a WHERE clause that uses only literal values?
Answer section
There are various ways to rewrite this using different T-SQL constructs. We'll look at the pros and cons and do an overall comparison below.
First up: Using OR
SELECT COUNT(*)
FROM dbo.Users AS u
WHERE u.Age < 18
OR u.Age IS NULL;
Using OR
gives us a more efficient Seek plan, which reads the exact number of rows we need, however it adds what the technical world calls a whole mess of malarkey
to the query plan.
Also note that the Seek is executed twice here, which really should be more obvious from the graphical operator:
Table 'Users'. Scan count 2, logical reads 8233, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 469 ms, elapsed time = 473 ms.
Second up: Using derived tables with UNION ALL
Our query can also be rewritten like this
SELECT SUM(Records)
FROM
(
SELECT COUNT(Id)
FROM dbo.Users AS u
WHERE u.Age < 18
UNION ALL
SELECT COUNT(Id)
FROM dbo.Users AS u
WHERE u.Age IS NULL
) x (Records);
This yields the same type of plan, with far less malarkey, and a more apparent degree of honesty about how many times the index was seeked (sought?) into.
It does the same amount of reads (8233) as the OR
query, but shaves about 100ms of CPU time off.
CPU time = 313 ms, elapsed time = 315 ms.
However, you have to be really careful here, because if this plan attempts to go parallel, the two separate COUNT
operations will be serialized, because they're each considered a global scalar aggregate. If we force a parallel plan using Trace Flag 8649, the problem becomes obvious.
SELECT SUM(Records)
FROM
(
SELECT COUNT(Id)
FROM dbo.Users AS u
WHERE u.Age < 18
UNION ALL
SELECT COUNT(Id)
FROM dbo.Users AS u
WHERE u.Age IS NULL
) x (Records)
OPTION(QUERYTRACEON 8649);
This can be avoided by changing our query slightly.
SELECT SUM(Records)
FROM
(
SELECT 1
FROM dbo.Users AS u
WHERE u.Age < 18
UNION ALL
SELECT 1
FROM dbo.Users AS u
WHERE u.Age IS NULL
) x (Records)
OPTION(QUERYTRACEON 8649);
Now both nodes performing a Seek are fully parallelized until we hit the concatenation operator.
For what it's worth, the fully parallel version has some good benefit. At the cost of about 100 more reads, and about 90ms of additional CPU time, the elapsed time shrinks to 93ms.
Table 'Users'. Scan count 12, logical reads 8317, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 500 ms, elapsed time = 93 ms.
What about CROSS APPLY?
No answer is complete without the magic of CROSS APPLY
!
Unfortunately, we run into more problems with COUNT
.
SELECT SUM(Records)
FROM dbo.Users AS u
CROSS APPLY
(
SELECT COUNT(Id)
FROM dbo.Users AS u2
WHERE u2.Id = u.Id
AND u2.Age < 18
UNION ALL
SELECT COUNT(Id)
FROM dbo.Users AS u2
WHERE u2.Id = u.Id
AND u2.Age IS NULL
) x (Records);
This plan is horrible. This is the kind of plan you end up with when you show up last to St. Patrick's Day. Though nicely parallel, for some reason it's scanning the PK/CX. Ew. The plan has a cost of 2198 query bucks.
Table 'Users'. Scan count 7, logical reads 31676233, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 29532 ms, elapsed time = 5828 ms.
Which is a weird choice, because if we force it to use the nonclustered index, the cost drops rather significantly to 1798 query bucks.
SELECT SUM(Records)
FROM dbo.Users AS u
CROSS APPLY
(
SELECT COUNT(Id)
FROM dbo.Users AS u2 WITH (INDEX(ix_Id_Age))
WHERE u2.Id = u.Id
AND u2.Age < 18
UNION ALL
SELECT COUNT(Id)
FROM dbo.Users AS u2 WITH (INDEX(ix_Id_Age))
WHERE u2.Id = u.Id
AND u2.Age IS NULL
) x (Records);
Hey, seeks! Check you out over there. Also note that with the magic of CROSS APPLY
, we don't need to do anything goofy to have a mostly fully parallel plan.
Table 'Users'. Scan count 5277838, logical reads 31685303, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 27625 ms, elapsed time = 4909 ms.
Cross apply does end up faring better without the COUNT
stuff in there.
SELECT SUM(Records)
FROM dbo.Users AS u
CROSS APPLY
(
SELECT 1
FROM dbo.Users AS u2
WHERE u2.Id = u.Id
AND u2.Age < 18
UNION ALL
SELECT 1
FROM dbo.Users AS u2
WHERE u2.Id = u.Id
AND u2.Age IS NULL
) x (Records);
The plan looks good, but the reads and CPU aren't an improvement.
Table 'Users'. Scan count 20, logical reads 17564, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 4844 ms, elapsed time = 863 ms.
Rewriting the cross apply to be a derived join results in the exact same everything. I'm not going to re-post the query plan and stats info -- they really didn't change.
SELECT COUNT(u.Id)
FROM dbo.Users AS u
JOIN
(
SELECT u.Id
FROM dbo.Users AS u
WHERE u.Age < 18
UNION ALL
SELECT u.Id
FROM dbo.Users AS u
WHERE u.Age IS NULL
) x ON x.Id = u.Id;
Relational Algebra: To be thorough, and to keep Joe Celko from haunting my dreams, we need to at least try some weird relational stuff. Here goes nothin'!
An attempt with INTERSECT
SELECT COUNT(*)
FROM dbo.Users AS u
WHERE NOT EXISTS ( SELECT u.Age WHERE u.Age >= 18
INTERSECT
SELECT u.Age WHERE u.Age IS NOT NULL );
Table 'Users'. Scan count 1, logical reads 9157, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 1094 ms, elapsed time = 1090 ms.
And here's an attempt with EXCEPT
SELECT COUNT(*)
FROM dbo.Users AS u
WHERE NOT EXISTS ( SELECT u.Age WHERE u.Age >= 18
EXCEPT
SELECT u.Age WHERE u.Age IS NULL);
Table 'Users'. Scan count 7, logical reads 9247, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 2126 ms, elapsed time = 376 ms.
There may be other ways to write these, but I'll leave that up to people who perhaps use EXCEPT
and INTERSECT
more often than I do.
If you really just need a count
I use COUNT
in my queries as a bit of shorthand (read: I'm too lazy to come up with more involved scenarios sometimes). If you just need a count, you can use a CASE
expression to do just about the same thing.
SELECT SUM(CASE WHEN u.Age < 18 THEN 1
WHEN u.Age IS NULL THEN 1
ELSE 0 END)
FROM dbo.Users AS u
SELECT SUM(CASE WHEN u.Age < 18 OR u.Age IS NULL THEN 1
ELSE 0 END)
FROM dbo.Users AS u
These both get the same plan and have the same CPU and read characteristics.
Table 'Users'. Scan count 1, logical reads 9157, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 719 ms, elapsed time = 719 ms.
The winner? In my tests, the forced parallel plan with SUM over a derived table performed the best. And yeah, many of these queries could have been assisted by adding a couple filtered indexes to account for both predicates, but I wanted to leave some experimentation to others.
SELECT SUM(Records)
FROM
(
SELECT 1
FROM dbo.Users AS u
WHERE u.Age < 18
UNION ALL
SELECT 1
FROM dbo.Users AS u
WHERE u.Age IS NULL
) x (Records)
OPTION(QUERYTRACEON 8649);
Thanks!
I wasn't game to restore a 110 GB database for just one table so I created my own data. The age distributions should match what's on Stack Overflow but obviously the table itself won't match. I don't think that it's too much of an issue because the queries are going to hit indexes anyway. I'm testing on a 4 CPU computer with SQL Server 2016 SP1. One thing to note is that for queries that finish this quickly it's important not to include the actual execution plan. That can slow things down quite a bit.
I started by going through some of the solutions in Erik's excellent answer. For this one:
SELECT SUM(Records)
FROM
(
SELECT COUNT(Id)
FROM dbo.Users AS u
WHERE u.Age < 18
UNION ALL
SELECT COUNT(Id)
FROM dbo.Users AS u
WHERE u.Age IS NULL
) x (Records);
I got the following results from sys.dm_exec_sessions over 10 trials (the query naturally went parallel for me):
╔══════════╦════════════════════╦═══════════════╗
║ cpu_time ║ total_elapsed_time ║ logical_reads ║
╠══════════╬════════════════════╬═══════════════╣
║ 3532 ║ 975 ║ 60830 ║
╚══════════╩════════════════════╩═══════════════╝
The query that worked better for Erik actually performed worse on my machine:
SELECT SUM(Records)
FROM
(
SELECT 1
FROM dbo.Users AS u
WHERE u.Age < 18
UNION ALL
SELECT 1
FROM dbo.Users AS u
WHERE u.Age IS NULL
) x (Records)
OPTION(QUERYTRACEON 8649);
Results from 10 trials:
╔══════════╦════════════════════╦═══════════════╗
║ cpu_time ║ total_elapsed_time ║ logical_reads ║
╠══════════╬════════════════════╬═══════════════╣
║ 5704 ║ 1636 ║ 60850 ║
╚══════════╩════════════════════╩═══════════════╝
I'm not immediately able to explain why it's that bad, but it's not clear why we want to force nearly every operator in the query plan to go parallel. In the original plan we have a serial zone that finds all rows with AGE < 18
. There are only a few thousand rows. On my machine I get 9 logical reads for that part of the query and 9 ms of reported CPU time and elapsed time. There's also a serial zone for the global aggregate for the rows with AGE IS NULL
but that only processes one row per DOP. On my machine this is just four rows.
My takeaway is that it's most important to optimize the part of the query that finds rows with a NULL
for Age
because there are millions of those rows. I wasn't able to create an index with less pages that covered the data than a simple page-compressed one on the column. I assume that there's a minimum index size per row or that a lot of the index space cannot be avoided with the tricks that I tried. So if we're stuck with about the same number of logical reads to get the data then the only way to make it faster is to make the query more parallel, but this needs to be done in a different way than Erik's query that used TF 8649. In the query above we have a ratio of 3.62 for CPU time to elapsed time which is pretty good. The ideal would be a ratio of 4.0 on my machine.
One possible area of improvement is to divide the work more evenly among threads. In the screenshot below we can see that one of my CPUs decided to take a little break:
Index scan is one of the few operators that can be implemented in parallel and we can't do anything about how the rows are distributed to threads. There's an element of chance to it as well but pretty consistently I saw one underworked thread. One way to work around this is to do parallelism the hard way: on the inner part of a nested loop join. Anything on the inner part of a nested loop will be implemented in a serial way but many serial threads can run concurrently. As long as we get a favorable parallel distribution method (such as round robin), we can control exactly how many rows are sent to each thread.
I'm running queries with DOP 4 so I need to evenly divide the NULL
rows in the table into four buckets. One way to do this is to create a bunch of indexes on computed columns:
ALTER TABLE dbo.Users
ADD Compute_bucket_0 AS (CASE WHEN Age IS NULL AND Id % 4 = 0 THEN 1 ELSE NULL END),
Compute_bucket_1 AS (CASE WHEN Age IS NULL AND Id % 4 = 1 THEN 1 ELSE NULL END),
Compute_bucket_2 AS (CASE WHEN Age IS NULL AND Id % 4 = 2 THEN 1 ELSE NULL END),
Compute_bucket_3 AS (CASE WHEN Age IS NULL AND Id % 4 = 3 THEN 1 ELSE NULL END);
CREATE INDEX IX_Compute_bucket_0 ON dbo.Users (Compute_bucket_0) WITH (DATA_COMPRESSION = PAGE);
CREATE INDEX IX_Compute_bucket_1 ON dbo.Users (Compute_bucket_1) WITH (DATA_COMPRESSION = PAGE);
CREATE INDEX IX_Compute_bucket_2 ON dbo.Users (Compute_bucket_2) WITH (DATA_COMPRESSION = PAGE);
CREATE INDEX IX_Compute_bucket_3 ON dbo.Users (Compute_bucket_3) WITH (DATA_COMPRESSION = PAGE);
I'm not quite sure why four separate indexes is a little faster than one index but that's one what I found in my testing.
To get a parallel nested loop plan I'm going to use the undocumented trace flag 8649. I'm also going to write the code a little strangely to encourage the optimizer not to process more rows than necessary. Below is one implementation which appears to work well:
SELECT SUM(t.cnt) + (SELECT COUNT(*) FROM dbo.Users AS u WHERE u.Age < 18)
FROM
(VALUES (0), (1), (2), (3)) v(x)
CROSS APPLY
(
SELECT COUNT(*) cnt
FROM dbo.Users
WHERE Compute_bucket_0 = CASE WHEN v.x = 0 THEN 1 ELSE NULL END
UNION ALL
SELECT COUNT(*) cnt
FROM dbo.Users
WHERE Compute_bucket_1 = CASE WHEN v.x = 1 THEN 1 ELSE NULL END
UNION ALL
SELECT COUNT(*) cnt
FROM dbo.Users
WHERE Compute_bucket_2 = CASE WHEN v.x = 2 THEN 1 ELSE NULL END
UNION ALL
SELECT COUNT(*) cnt
FROM dbo.Users
WHERE Compute_bucket_3 = CASE WHEN v.x = 3 THEN 1 ELSE NULL END
) t
OPTION (QUERYTRACEON 8649);
The results from ten trials:
╔══════════╦════════════════════╦═══════════════╗
║ cpu_time ║ total_elapsed_time ║ logical_reads ║
╠══════════╬════════════════════╬═══════════════╣
║ 3093 ║ 803 ║ 62008 ║
╚══════════╩════════════════════╩═══════════════╝
With that query we have a CPU to elapsed time ratio of 3.85! We shaved off 17 ms from the runtime and it only took 4 computed columns and indexes to do it! Each thread processes very close to the same number of rows overall because each index has very close to the same number of rows and each thread only scans one index:
On a final note we can also hit the easy button and add a nonclustered CCI to the Age
column:
CREATE NONCLUSTERED COLUMNSTORE INDEX X_NCCI ON dbo.Users (Age);
The following query finishes in 3 ms on my machine:
SELECT COUNT(*)
FROM dbo.Users AS u
WHERE u.Age < 18 OR u.Age IS NULL;
That's going to be tough to beat.
Although I don't have a local copy of the Stack Overflow database, I was able to try out a couple of queries. My thought was to get a count of users from a system catalog view (as opposed to directly getting a count of rows from the underlying table). Then get a count of rows that do (or maybe do not) match Erik's criteria, and do some simple math.
I used the Stack Exchange Data Explorer (Along with SET STATISTICS TIME ON;
and SET STATISTICS IO ON;
) to test the queries. For a point of reference, here are some queries and the CPU/IO statistics:
QUERY 1
--Erik's query From initial question.
SELECT COUNT(*)
FROM dbo.Users AS u
WHERE ISNULL(u.Age, 17) < 18;
SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms. (1 row(s) returned)
Table 'Users'. Scan count 17, logical reads 201567, physical reads 0, read-ahead reads 2740, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 1829 ms, elapsed time = 296 ms.
QUERY 2
--Erik's "OR" query.
SELECT COUNT(*)
FROM dbo.Users AS u
WHERE u.Age < 18
OR u.Age IS NULL;
SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms. (1 row(s) returned)
Table 'Users'. Scan count 17, logical reads 201567, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 2500 ms, elapsed time = 147 ms.
QUERY 3
--Erik's derived tables/UNION ALL query.
SELECT SUM(Records)
FROM
(
SELECT COUNT(Id)
FROM dbo.Users AS u
WHERE u.Age < 18
UNION ALL
SELECT COUNT(Id)
FROM dbo.Users AS u
WHERE u.Age IS NULL
) x (Records);
SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms. (1 row(s) returned)
Table 'Users'. Scan count 34, logical reads 403134, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 3156 ms, elapsed time = 215 ms.
1st Attempt
This was slower than all of Erik's queries I listed here...at least in terms of elapsed time.
SELECT SUM(p.Rows) -
(
SELECT COUNT(*)
FROM dbo.Users AS u
WHERE u.Age >= 18
)
FROM sys.objects o
JOIN sys.partitions p
ON p.object_id = o.object_id
WHERE p.index_id < 2
AND o.name = 'Users'
AND SCHEMA_NAME(o.schema_id) = 'dbo'
GROUP BY o.schema_id, o.name
SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms. (1 row(s) returned)
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'sysrowsets'. Scan count 2, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'sysschobjs'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'Users'. Scan count 1, logical reads 201567, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 593 ms, elapsed time = 598 ms.
2nd Attempt
Here I opted for a variable to store the total number of users (instead of a sub-query). The scan count increased from 1 to 17 compared to the 1st attempt. Logical reads stayed the same. However, elapsed time dropped considerably.
DECLARE @Total INT;
SELECT @Total = SUM(p.Rows)
FROM sys.objects o
JOIN sys.partitions p
ON p.object_id = o.object_id
WHERE p.index_id < 2
AND o.name = 'Users'
AND SCHEMA_NAME(o.schema_id) = 'dbo'
GROUP BY o.schema_id, o.name
SELECT @Total - COUNT(*)
FROM dbo.Users AS u
WHERE u.Age >= 18
SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms. Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'sysrowsets'. Scan count 2, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table 'sysschobjs'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 0 ms, elapsed time = 1 ms. (1 row(s) returned)
Table 'Users'. Scan count 17, logical reads 201567, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times: CPU time = 1471 ms, elapsed time = 98 ms.
Other Notes: DBCC TRACEON is not permitted on Stack Exchange Data Explorer, as noted below:
User 'STACKEXCHANGE\svc_sede' does not have permission to run DBCC TRACEON.