How to get minimum and maximum for grouped timestamps
Raster of 10-minute intervals
I suggest to group by a combination of "hour" and 10-minute interval:
SELECT hero
, min(timestamp) AS start_time
, CASE WHEN count(*) > 1 THEN max(timestamp) END AS end_time
FROM tbl
GROUP BY hero
, date_trunc('hour', timestamp)
, EXTRACT(MINUTE FROM timestamp)::int / 10
ORDER BY 1, 2; -- optional
Refer to the chapter Date/Time Functions and Operators in the manual.
EXTRACT(minute FROM timestamp)
extracts the minute part of the time The expression. After the cast to integer (::int
), integer division (/ 10
) effectively rounds to 10-minute intervals (0
- 5
).
The CASE
expression only adds an end_time
if more than one rows fall in the same 10-minute interval.
I advise not to use the "timestamp" as identifier. It's a reserved word in standard SQL and base data type in Postgres.
Groups defined by gaps of 10 or more minutes
If "groups" are defined by gaps of 10 minutes or more between rows of the same hero:
SELECT hero
, count(*) AS ct -- optional
, min(timestamp) AS start_time
, CASE WHEN count(*) > 1 THEN max(timestamp) END AS end_time
FROM (
SELECT hero, timestamp, count(step OR NULL) OVER (ORDER BY hero, timestamp) AS grp
FROM (
SELECT *
, lag(timestamp) OVER (PARTITION BY hero ORDER BY timestamp)
< timestamp - interval '10 min' AS step
FROM tbl
) sub1
) sub2
GROUP BY hero, grp;
Detailed explanation:
- Select longest continuous sequence
db<>fiddle here
Old sqlfiddle
If instead of rounding time intervals you would like to round integer numbers, let's say to "the nearest dozen", you would do the follwing operation:
number::integer / 12 * 12
Although dividing and multiplying by 12 seems to "just do nothing", it actually gets rid of the "fraction of a dozen" in your number. The trick lies in the fact that your division is an integer division, and the result is an integer number with no fractional part.
Try, for instance:
SELECT
1 / 12 * 12 AS n1,
2 / 12 * 12 AS n2,
11 / 12 * 12 AS n11,
12 / 12 * 12 AS n12,
13 / 12 * 12 AS n13,
23 / 12 * 12 AS n23,
24 / 12 * 12 AS n24,
25 / 12 * 12 AS n25,
35 / 12 * 12 AS n35 ;
This is a standard way to "rounding down" numbers to bin them, given a certain bin size (12, in this case).
You can do the same with times, as sooun as you convert timestamps to integers (or bigints). You do that by representing timestamps as "number of seconds since a certain date and time". The "epoch" of a timestamp represents the number of seconds elapsed since a certain time and date (1st Jan 1970, 00:00:00). PostgreSQL (and I think, SQL in general) uses "EXTRACT (EPOCH FROM ts)" to get this value.
If you want to round these epochs to any interval (in your case 10 minutes = 60 * 10 seconds), you can make an integer division of this epoch by 600 (so, you get rid of the decimals) and multipy back by this 600. This way, you get rid of the "fractional part" of your 10 minute intervals.
The following code will give you the second version of your possible solutions (with a couple of extra rows for Wonder Woman):
CREATE TABLE hero_stamps (hero text, ts timestamp without time zone) ;
INSERT INTO
hero_stamps (hero, ts)
VALUES
('Batman', '2016-12-08 12:00:00'),
('Batman', '2016-12-08 12:07:00'),
('Batman', '2016-12-08 13:00:00'),
('Batman', '2016-12-08 14:00:00'),
('Wonder Woman', '2016-12-08 10:15:00'),
('Wonder Woman', '2016-12-08 10:18:00'),
('Wonder Woman', '2016-12-08 10:25:00'),
('Wonder Woman', '2016-12-08 11:30:00') ;
SELECT
hero,
start_time,
(CASE WHEN start_time = end_time THEN NULL ELSE end_time END) AS end_time
FROM
(
SELECT
hero,
extract(epoch from ts)::bigint /* timestamp converted to epoch */
/ (60 * 10)::integer /* integer div */
* (60 * 10)::integer /* multiply back to have start time */
AS epoch_start_time_of_interval,
min(ts) AS start_time,
max(ts) AS end_time
FROM
hero_stamps
GROUP BY
hero, epoch_start_time_of_interval
) AS s0
ORDER BY
hero, start_time ;
The GROUP BY makes the "binning", the min(ts) and max(ts) gives you the maximum and minimum values, and the extra function in the outermost SELECT gives the nulls the way you want (and gets rid of the "epoch_start_time_of_interval" that you're not interested in).
This approach will work even if you have timestamps with the same "hour" and "minute" but different calendar days, because it does not ignore this information (it gets embedded in the "epoch").
A more compact version (giving the same results) would be
SELECT
hero,
min(ts) AS start_time,
(CASE WHEN count(*) = 1 THEN NULL ELSE max(ts) END) AS end_time
FROM
hero_stamps
GROUP BY
hero,
extract(epoch from ts)::bigint / 600
ORDER BY
1, 2 ;