Average of last 4 runs of a product
Sample data:
CREATE TABLE dbo.Thing
(
Product integer NOT NULL,
TheDate date NOT NULL,
TheWeight decimal(5, 1) NOT NULL
);
INSERT dbo.Thing
(Product, TheDate, TheWeight)
VALUES
(900000, CONVERT(date, '20160101', 112), 20.0),
(900000, '20160303', 12.2),
(900000, '20160706', 15.0),
(900000, '20160707', 14.0),
(900000, '20160806', 3.0 ),
(900000, '20160808', 13.0 ),
(800000, '20160602', 14.0),
(800000, '20160603', 12.0),
(800000, '20160604', 12.0),
(800000, '20160605', 12.0),
(800000, '20160606', 12.0);
Solution:
The general idea here is to use the extended window aggregate functions available in SQL Server 2012 and later.
The only wrinkle is that AVG
does not return null over a window if it is smaller than the required four rows. To address that, we also calculate the number of rows found in the window using COUNT
. A simple CASE
expression can then be used to return a null if the window holds fewer than four rows:
SELECT
T.Product,
T.TheDate,
T.TheWeight,
[Average Weight] =
CASE
WHEN
4 > COUNT_BIG(*) OVER (
PARTITION BY T.Product
ORDER BY T.Product, T.TheDate
ROWS BETWEEN 4 PRECEDING
AND 1 PRECEDING
)
THEN NULL
ELSE
AVG(T.TheWeight) OVER (
PARTITION BY T.Product
ORDER BY T.Product, T.TheDate
ROWS BETWEEN 4 PRECEDING
AND 1 PRECEDING
)
END
FROM dbo.Thing AS T
ORDER BY
T.Product,
T.TheDate;
Run the query on Stack Exchange Data Explorer
Output:
More information:
Window Functions in SQL Server
Related question:
Date range rolling sum using window functions
This is a rolling average, which is a windowed function in SQL Server 2012 and newer. You could solve it like this:
SELECT Product, [Date], Weight,
(CASE WHEN _runningCount>=4
THEN _runningTotal/4.0
END) AS [Average weight]
FROM (
SELECT Product, [Date], Weight,
SUM(Weight) OVER (
PARTITION BY Product
ORDER BY [Date]
ROWS BETWEEN 4 PRECEDING AND 1 PRECEDING) AS _runningTotal,
SUM(1) OVER (
PARTITION BY Product
ORDER BY [Date]
ROWS BETWEEN 4 PRECEDING AND 1 PRECEDING) AS _runningCount
FROM theTable
) AS sub;
Here are the key points:
- The
OVER ()
clause describes that the aggregation (SUM in this case) takes place in a window. In our case, we want to aggregate the most recent previous four rows, for the current product (the partition), ordered by the date. SUM(1)
acts as a count.- For readability, I've put the two window functions in a subquery,
sub
. - Then, if
_runningCount
is 4 or more, we can divide the running total of the most recent four rows by 4, otherwise, returnNULL
.
If you want to return the running total at any given time (including the first few dates as well), you would change the CASE
to something like:
SELECT ...
_runningTotal/_runningCount AS [Average weight]