How to get minimum and maximum for grouped timestamps

Raster of 10-minute intervals

I suggest to group by a combination of "hour" and 10-minute interval:

SELECT hero
     , min(timestamp) AS start_time
     , CASE WHEN count(*) > 1 THEN max(timestamp) END AS end_time
FROM   tbl
GROUP  BY hero
     , date_trunc('hour', timestamp)
     , EXTRACT(MINUTE FROM timestamp)::int / 10
ORDER  BY 1, 2;  -- optional

Refer to the chapter Date/Time Functions and Operators in the manual.

EXTRACT(minute FROM timestamp) extracts the minute part of the time The expression. After the cast to integer (::int), integer division (/ 10) effectively rounds to 10-minute intervals (0 - 5).

The CASE expression only adds an end_time if more than one rows fall in the same 10-minute interval.

I advise not to use the "timestamp" as identifier. It's a reserved word in standard SQL and base data type in Postgres.

Groups defined by gaps of 10 or more minutes

If "groups" are defined by gaps of 10 minutes or more between rows of the same hero:

SELECT hero
     , count(*) AS ct  -- optional
     , min(timestamp) AS start_time
     , CASE WHEN count(*) > 1 THEN max(timestamp) END AS end_time
FROM  (
   SELECT hero, timestamp, count(step OR NULL) OVER (ORDER BY hero, timestamp) AS grp
   FROM  (
      SELECT *
           , lag(timestamp) OVER (PARTITION BY hero ORDER BY timestamp)
           < timestamp - interval '10 min' AS step
      FROM   tbl
      ) sub1
   ) sub2
GROUP  BY hero, grp;

Detailed explanation:

Select longest continuous sequence

db<>fiddle here
_{Old sqlfiddle}

If instead of rounding time intervals you would like to round integer numbers, let's say to "the nearest dozen", you would do the follwing operation:

number::integer / 12 * 12

Although dividing and multiplying by 12 seems to "just do nothing", it actually gets rid of the "fraction of a dozen" in your number. The trick lies in the fact that your division is an integer division, and the result is an integer number with no fractional part.

Try, for instance:

SELECT
     1 / 12 * 12 AS n1,
     2 / 12 * 12 AS n2,
    11 / 12 * 12 AS n11,
    12 / 12 * 12 AS n12,
    13 / 12 * 12 AS n13,
    23 / 12 * 12 AS n23,
    24 / 12 * 12 AS n24,
    25 / 12 * 12 AS n25,
    35 / 12 * 12 AS n35 ;

This is a standard way to "rounding down" numbers to bin them, given a certain bin size (12, in this case).

You can do the same with times, as sooun as you convert timestamps to integers (or bigints). You do that by representing timestamps as "number of seconds since a certain date and time". The "epoch" of a timestamp represents the number of seconds elapsed since a certain time and date (1st Jan 1970, 00:00:00). PostgreSQL (and I think, SQL in general) uses "EXTRACT (EPOCH FROM ts)" to get this value.

If you want to round these epochs to any interval (in your case 10 minutes = 60 * 10 seconds), you can make an integer division of this epoch by 600 (so, you get rid of the decimals) and multipy back by this 600. This way, you get rid of the "fractional part" of your 10 minute intervals.

The following code will give you the second version of your possible solutions (with a couple of extra rows for Wonder Woman):

CREATE TABLE hero_stamps (hero text, ts timestamp without time zone) ;

INSERT INTO 
    hero_stamps (hero, ts)
VALUES
    ('Batman', '2016-12-08 12:00:00'),
    ('Batman', '2016-12-08 12:07:00'),
    ('Batman', '2016-12-08 13:00:00'),
    ('Batman', '2016-12-08 14:00:00'),
    ('Wonder Woman', '2016-12-08 10:15:00'),
    ('Wonder Woman', '2016-12-08 10:18:00'),
    ('Wonder Woman', '2016-12-08 10:25:00'),
    ('Wonder Woman', '2016-12-08 11:30:00') ;

SELECT
    hero, 
    start_time, 
    (CASE WHEN start_time = end_time THEN NULL ELSE end_time END) AS end_time
FROM
(
    SELECT
        hero, 
        extract(epoch from ts)::bigint   /* timestamp converted to epoch */
           / (60 * 10)::integer          /* integer div */
           * (60 * 10)::integer          /* multiply back to have start time */
        AS epoch_start_time_of_interval,
        min(ts) AS start_time, 
        max(ts) AS end_time 
    FROM
        hero_stamps
    GROUP BY
        hero, epoch_start_time_of_interval
) AS s0
ORDER BY
    hero, start_time ;

The GROUP BY makes the "binning", the min(ts) and max(ts) gives you the maximum and minimum values, and the extra function in the outermost SELECT gives the nulls the way you want (and gets rid of the "epoch_start_time_of_interval" that you're not interested in).

This approach will work even if you have timestamps with the same "hour" and "minute" but different calendar days, because it does not ignore this information (it gets embedded in the "epoch").

A more compact version (giving the same results) would be

SELECT
    hero, 
    min(ts) AS start_time, 
    (CASE WHEN count(*) = 1 THEN NULL ELSE max(ts) END) AS end_time 
FROM
    hero_stamps
GROUP BY
    hero, 
    extract(epoch from ts)::bigint / 600
ORDER BY
    1, 2 ;

How to get minimum and maximum for grouped timestamps

Raster of 10-minute intervals

Groups defined by gaps of 10 or more minutes

Tags:

Datetime

Postgresql

Group By

Gaps And Islands

Postgresql 9.5

Related

Recent Posts