How to Get Median Value From Row in SQL Server 2008 R2 Without Using Pivot and Unpivot Table
PIVOT and UNPIVOT are indeed not supported under compatibility level 80.
However, you can unpivot rows using a nested VALUES constructor. The resulting query in my case looks slightly unwieldy because of the double nesting, but it works in SQL Server 2008 with any supported compatibility level:
SELECT
id,
median =
(
SELECT
AVG(val)
FROM
(
SELECT
c = COUNT(*) OVER (),
rn = ROW_NUMBER() OVER (ORDER BY v.val ASC),
val = v.val
FROM
(
VALUES
(t.a), (t.b), (t.c), (t.d), (t.e), (t.f), (t.g),
(t.h), (t.i), (t.j), (t.k), (t.l), (t.m), (t.n),
(t.o), (t.p), (t.q), (t.r), (t.s), (t.t), (t.u)
) AS v (val)
WHERE
v.val IS NOT NULL
) AS derived
WHERE
rn IN ((c + 1) / 2, (c + 2) / 2)
)
FROM
temp AS t
;
The v.val IS NOT NULL
filtering is there to imitate the UNPIVOT behaviour more closely, because UNPIVOT automatically filters out NULL values.
An extra nesting was necessary because there was no other way to produce the count and row numbers and use them at the same nesting level.
So the innermost SELECT (SELECT ... FROM (VALUES ...)
) unpivots the row and provides the row count and row numbers, while the middle-tier level calculates the median.
It is possible to reduce nesting with the help of CROSS APPLY and grouping in the main query, like this:
SELECT
id,
median = AVG(x.val)
FROM
temp AS t
CROSS APPLY
(
SELECT
c = COUNT(*) OVER (),
rn = ROW_NUMBER() OVER (ORDER BY v.val ASC),
val = v.val
FROM
(
VALUES
(t.a), (t.b), (t.c), (t.d), (t.e), (t.f), (t.g),
(t.h), (t.i), (t.j), (t.k), (t.l), (t.m), (t.n),
(t.o), (t.p), (t.q), (t.r), (t.s), (t.t), (t.u)
) AS v (val)
WHERE
v.val IS NOT NULL
) AS x
WHERE
x.rn IN ((x.c + 1) / 2, (x.c + 2) / 2)
GROUP BY
t.id
;
For a live demonstration of both methods, please follow this dbfiddle.uk link.
A method that uses CROSS APPLY
(so I think it will work in a database with compatibility level 80). This will work correctly only if there are no nulls in these columns:
select t.id, z.val
from temp as t
cross apply
( select top (1) y.val
from
( select top (11) x.val
from
( values (a),(b),(c),(d),(e),(f),(g),(h),(i),(j),
(k),(l),(m),(n),(o),(p),(q),(r),(s),(t),(u)
) as x (val)
order by x.val
) as y
order by y.val desc
) as z ;
Tested in: dbfiddle.uk.
The most "inside" subquery (x
) does the unpivot from (21) columns to (21) rows:
( values (a),(b),(c),(d),(e),(f),(g),(h),(i),(j),
(k),(l),(m),(n),(o),(p),(q),(r),(s),(t),(u)
) as x (val)
Then the second one (y
) orders them by value and selects only half of them:
( select top (11) x.val
from
--
--
x (val)
order by x.val
) as y
and the third (z
) selects the last (of the 11 rows), to find the median.
A different method that takes care of nulls is the following. The self join and GROUP BY
is used to do a "rolling" count of values smaller and higher:
select t.id, w.val
from temp as t
cross apply
( select val = avg(val)
from
( select d.val,
cnt_less = count(case when y.val < d.val then 1 end),
cnt_more = count(case when y.val > d.val then 1 end),
cnt = count(y.val)
from
( select distinct val
from
( values (a),(b),(c),(d),(e),(f),(g),(h),(i),(j),
(k),(l),(m),(n),(o),(p),(q),(r),(s),(t),(u)
) as x (val)
where x.val is not null
) as d
cross join
( values (a),(b),(c),(d),(e),(f),(g),(h),(i),(j),
(k),(l),(m),(n),(o),(p),(q),(r),(s),(t),(u)
) as y (val)
group by d.val
) as z
where 2 * cnt_less <= cnt
and 2 * cnt_more <= cnt
) as w ;
It is an interesting question, so here is my solution
; with c (id, val) as
(
select id
, val = case columnname
when 'a' then a
when 'b' then b
when 'c' then c
when 'd' then d
when 'e' then e
when 'f' then f
when 'g' then g
when 'h' then h
when 'i' then i
when 'j' then j
when 'k' then k
when 'l' then l
when 'm' then m
when 'n' then n
when 'o' then o
when 'p' then p
when 'q' then q
when 'r' then r
when 's' then s
when 't' then t
when 'u' then u
end
from temp
cross join (
select columnname from (values ('a'), ('b'), ('c'), ('d'),('e')
, ('f'), ('g'), ('h'), ('i'),('j')
, ('k'), ('l'), ('m'), ('n'),('o')
, ('p'), ('q'), ('r'), ('s'),('t'), ('u')
) t(columnname)
) Col
)
, c2 as (
select id, rn = row_number() over (partition by id order by val), val, cnt=count(*) over (partition by id)
from c)
select id, median=avg(val) from c2
where rn in ((cnt+1)/2, (cnt+2)/2)
group by id
The first CTE c will unpivot the rows into columns, and the 2nd CTE c2 is to generate the ranking among ID groups so the outside the query can find out the median value