What is ROWS UNBOUNDED PRECEDING used for in Teradata?
It's the "frame" or "range" clause of window functions, which are part of the SQL standard and implemented in many databases, including Teradata.
A simple example would be to calculate the average amount in a frame of three days. I'm using PostgreSQL syntax for the example, but it will be the same for Teradata:
WITH data (t, a) AS (
VALUES(1, 1),
(2, 5),
(3, 3),
(4, 5),
(5, 4),
(6, 11)
)
SELECT t, a, avg(a) OVER (ORDER BY t ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
FROM data
ORDER BY t
... which yields:
t a avg
----------
1 1 3.00
2 5 3.00
3 3 4.33
4 5 4.00
5 4 6.67
6 11 7.50
As you can see, each average is calculated "over" an ordered frame consisting of the range between the previous row (1 preceding
) and the subsequent row (1 following
).
When you write ROWS UNBOUNDED PRECEDING
, then the frame's lower bound is simply infinite. This is useful when calculating sums (i.e. "running totals"), for instance:
WITH data (t, a) AS (
VALUES(1, 1),
(2, 5),
(3, 3),
(4, 5),
(5, 4),
(6, 11)
)
SELECT t, a, sum(a) OVER (ORDER BY t ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM data
ORDER BY t
yielding...
t a sum
---------
1 1 1
2 5 6
3 3 9
4 5 14
5 4 18
6 11 29
Here's another very good explanations of SQL window functions.
ROWS UNBOUNDED PRECEDING
is no Teradata-specific syntax, it's Standard SQL. Together with the ORDER BY
it defines the window on which the result is calculated.
Logically a Windowed Aggregate Function is newly calculated for each row within the PARTITION based on all ROWS between a starting row and an ending row.
Starting and ending rows might be fixed or relative to the current row based on the following keywords:
- CURRENT ROW, the current row
- UNBOUNDED PRECEDING, all rows before the current row -> fixed
- UNBOUNDED FOLLOWING, all rows after the current row -> fixed
- x PRECEDING, x rows before the current row -> relative
- y FOLLOWING, y rows after the current row -> relative
Possible kinds of calculation include:
- Both starting and ending row are fixed, the window consists of all rows of a partition, e.g. a Group Sum, i.e. aggregate plus detail rows
- One end is fixed, the other relative to current row, the number of rows increases or decreases, e.g. a Running Total, Remaining Sum
- Starting and ending row are relative to current row, the number of rows within a window is fixed, e.g. a Moving Average over n rows
So SUM(x) OVER (ORDER BY col ROWS UNBOUNDED PRECEDING)
results in a Cumulative Sum or Running Total
11 -> 11
2 -> 11 + 2 = 13
3 -> 13 + 3 (or 11+2+3) = 16
44 -> 16 + 44 (or 11+2+3+44) = 60