What is the most efficient way to get the minimum of multiple columns on SQL Server 2005?
I tested the performance of all 3 methods, and here's what I found:
- 1 record: No noticeable difference
- 10 records: No noticeable difference
- 1,000 records: No noticeable difference
- 10,000 records:
UNION
subquery was a little slower. TheCASE WHEN
query is a little faster than theUNPIVOT
one. - 100,000 records:
UNION
subquery is significantly slower, butUNPIVOT
query becomes a little faster than theCASE WHEN
query - 500,000 records:
UNION
subquery still significantly slower, butUNPIVOT
becomes much faster than theCASE WHEN
query
So the end results seems to be
With smaller record sets there doesn't seem to be enough of a difference to matter. Use whatever is easiest to read and maintain.
Once you start getting into larger record sets, the
UNION ALL
subquery begins to perform poorly compared to the other two methods.The
CASE
statement performs the best up until a certain point (in my case, around 100k rows), and which point theUNPIVOT
query becomes the best-performing query
The actual number at which one query becomes better than another will probably change as a result of your hardware, database schema, data, and current server load, so be sure to test with your own system if you're concerned about performance.
I also ran some tests using Mikael's answer; however, it was slower than all 3 of the other methods tried here for most recordset sizes. The only exception was it did better than a the UNION ALL
query for very large recordset sizes. I like the fact it shows the column name in addition to the smallest value though.
I'm not a dba, so I may not have optimized my tests and missed something. I was testing with the actual live data, so that may have affected the results. I tried to account for that by running each query a few different times, but you never know. I would definitely be interested if someone wrote up a clean test of this and shared their results.
Don't know about what is fastest but you could try something like this.
declare @T table
(
Col1 int,
Col2 int,
Col3 int,
Col4 int,
Col5 int,
Col6 int
)
insert into @T values(1, 2, 3, 4, 5, 6)
insert into @T values(2, 3, 1, 4, 5, 6)
select T4.ColName, T4.ColValue
from @T as T1
cross apply (
select T3.ColValue, T3.ColName
from (
select row_number() over(order by T2.ColValue) as rn,
T2.ColValue,
T2.ColName
from (
select T1.Col1, 'Col1' union all
select T1.Col2, 'Col2' union all
select T1.Col3, 'Col3' union all
select T1.Col4, 'Col4' union all
select T1.Col5, 'Col5' union all
select T1.Col6, 'Col6'
) as T2(ColValue, ColName)
) as T3
where T3.rn = 1
) as T4
Result:
ColName ColValue
------- -----------
Col1 1
Col3 1
If you are not interested in what column has the min value you can use this instead.
declare @T table
(
Id int,
Col1 int,
Col2 int,
Col3 int,
Col4 int,
Col5 int,
Col6 int
)
insert into @T
select 1, 3, 4, 0, 2, 1, 5 union all
select 2, 2, 6, 10, 5, 7, 9 union all
select 3, 1, 1, 2, 3, 4, 5 union all
select 4, 9, 5, 4, 6, 8, 9
select T.Id, (select min(T1.ColValue)
from (
select T.Col1 union all
select T.Col2 union all
select T.Col3 union all
select T.Col4 union all
select T.Col5 union all
select T.Col6
) as T1(ColValue)
) as ColValue
from @T as T
A simplified unpivot query.
select Id, min(ColValue) as ColValue
from @T
unpivot (ColValue for Col in (Col1, Col2, Col3, Col4, Col5, Col6)) as U
group by Id
Add a persisted computed column that uses a CASE
statement to do the logic you need.
The minimum value will then always be efficiently available when you need to do a join (or whatever else) based on that value.
The value will be recomputed every time any of the source values change (INSERT
/UPDATE
/MERGE
). I'm not saying this is necessarily the best solution for the workload, I merely offer it as a solution, just like the other answers. Only the OP can determine which is best for the workload.