Extract MIN and MAX values related to datetime values on Postgres 9+

There are several ways to do this. On a large table, an index on the columns used for ordering, filtering and joining (user_id, grade_date and grade) will play an important role. Performance must be tested with real data and the actual table and index design.
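
As a rough sketch of the kind of index meant here (the index name is hypothetical, and the column order is an assumption based on the per-user, by-date access pattern of the queries in this answer):

```sql
-- Hypothetical index name; columns ordered as (partition key, sort key, payload).
-- Including grade may allow index-only scans on PostgreSQL 9.2 and later.
CREATE INDEX data_user_date_grade_idx
    ON data (user_id, grade_date, grade);
```

Whether the planner actually uses it depends on the data distribution, so it is worth checking with EXPLAIN.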

Using a window function (`ROW_NUMBER()`):

SELECT f.user_id, f.grade AS first_grade, f.grade_date AS first_date
    , l.grade AS last_grade, l.grade_date AS last_date
FROM (
    SELECT user_id, grade, grade_date
        , ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY grade_date) as n
    FROM data
) f
INNER JOIN (
    SELECT user_id, grade, grade_date
        , ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY grade_date DESC) as n
    FROM data
) l
ON f.user_id = l.user_id 
    AND f.n = 1 AND l.n = 1;

ROW_NUMBER numbers each user's rows from 1 to N, once by ascending grade_date and once by descending grade_date. Keeping only n = 1 on each side yields the first and the last row per user.
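
To make the numbering concrete, here is a small self-contained sketch. The sample rows are invented for illustration and are not from the original question:

```sql
-- Invented sample rows, inlined as a CTE so no table is created.
WITH sample (user_id, grade, grade_date) AS (
    VALUES (1, 12, DATE '2015-01-01'),
           (1, 15, DATE '2015-02-01'),
           (1,  9, DATE '2015-03-01'),
           (2, 18, DATE '2015-01-15'),
           (2, 11, DATE '2015-04-10')
)
SELECT user_id, grade, grade_date
    , ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY grade_date)      AS n_asc
    , ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY grade_date DESC) AS n_desc
FROM sample
ORDER BY user_id, grade_date;
-- user 1: (12, n_asc 1, n_desc 3), (15, 2, 2), (9, 3, 1)
-- user 2: (18, n_asc 1, n_desc 2), (11, 2, 1)
```

The row with n_asc = 1 is the first grade and the row with n_desc = 1 is the last grade for each user.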

Using subqueries:

SELECT  user_id
    , ( SELECT grade FROM data
        WHERE  user_id = d.user_id
        ORDER BY grade_date LIMIT 1
    ) AS first_grade
    , ( SELECT grade_date FROM data
        WHERE  user_id = d.user_id
        ORDER BY grade_date LIMIT 1
    ) AS first_date
    , ( SELECT grade FROM data
        WHERE  user_id = d.user_id
        ORDER BY grade_date DESC LIMIT 1
    ) AS last_grade
    , ( SELECT grade_date FROM data
        WHERE  user_id = d.user_id
        ORDER BY grade_date DESC LIMIT 1
    ) AS last_date
FROM (SELECT DISTINCT user_id FROM data) d
;

Each subquery keeps only the first row of its ordering and returns a single value from it.
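
A possible variation, not part of the original answer: on PostgreSQL 9.3 or later, LATERAL lets one subquery return both grade and grade_date, so two subqueries do the work of four. A minimal sketch under that version assumption:

```sql
-- Requires PostgreSQL 9.3+ (LATERAL). Each lateral subquery returns exactly one row,
-- because every user_id in d comes from data itself.
SELECT d.user_id
    , f.grade AS first_grade, f.grade_date AS first_date
    , l.grade AS last_grade,  l.grade_date AS last_date
FROM (SELECT DISTINCT user_id FROM data) d
CROSS JOIN LATERAL (
    SELECT grade, grade_date FROM data
    WHERE  user_id = d.user_id
    ORDER BY grade_date LIMIT 1
) f
CROSS JOIN LATERAL (
    SELECT grade, grade_date FROM data
    WHERE  user_id = d.user_id
    ORDER BY grade_date DESC LIMIT 1
) l;
```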

Using MIN and MAX:

SELECT d.user_id, mn.grade AS first_grade, mn.grade_date AS first_date
    , mx.grade AS last_grade, mx.grade_date AS last_date
FROM (
    SELECT user_id, MIN(grade_date) as min_grade_date, MAX(grade_date) as max_grade_date
    FROM data
    GROUP BY user_id
) d
INNER JOIN data mn 
    ON mn.grade_date = d.min_grade_date AND mn.user_id = d.user_id 
INNER JOIN data mx 
    ON mx.grade_date = d.max_grade_date AND mx.user_id = d.user_id 
;

It may return duplicate rows if a user has more than one grade on their first or last date.

See SQL Fiddle.
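
On the duplicate-row caveat of the MIN/MAX variant: one way to check whether the data is affected is to look for users with several rows on their first or last date. This is only a sketch, using the same `data` table as above:

```sql
-- Lists (user_id, grade_date) pairs where the first or last date holds more than one row.
-- An empty result means the MIN/MAX join returns exactly one row per user.
SELECT d.user_id, d.grade_date, COUNT(*) AS rows_on_date
FROM data d
INNER JOIN (
    SELECT user_id, MIN(grade_date) AS min_grade_date, MAX(grade_date) AS max_grade_date
    FROM data
    GROUP BY user_id
) m
    ON m.user_id = d.user_id
    AND d.grade_date IN (m.min_grade_date, m.max_grade_date)
GROUP BY d.user_id, d.grade_date
HAVING COUNT(*) > 1;
```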

The most efficient way is probably using Windowed Aggregates (see @JulienVavasseur's answer).

This version just tries to avoid the join and to minimize the number of Sort and WindowAgg steps in the plan:

SELECT user_id, first_grade, first_date, last_grade, last_date
FROM
 (   
   SELECT user_id, grade as first_grade, grade_date as first_date
      ,last_value(grade)         -- return the grade of the last row
       OVER (PARTITION BY user_id 
             ORDER BY grade_date -- same ORDER BY in all three functions
             ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as last_grade
      ,last_value(grade_date)    -- could be a MAX OVER, too, but this results in an additional WindowAgg step
       OVER (PARTITION BY user_id 
             ORDER BY grade_date
             ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) as last_date
      ,ROW_NUMBER()              -- needed to return the 1st row
       OVER (PARTITION BY user_id
             ORDER BY grade_date) as rn
   FROM data
 ) as dt
WHERE rn = 1;

See fiddle
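
The explicit frame clause in the query above matters. With an ORDER BY and no frame, the default frame ends at the current row (and its peers), so last_value simply returns the current row's value. A small sketch with invented sample rows shows the difference:

```sql
-- Invented sample rows for a single user.
WITH sample (user_id, grade, grade_date) AS (
    VALUES (1, 12, DATE '2015-01-01'),
           (1, 15, DATE '2015-02-01'),
           (1,  9, DATE '2015-03-01')
)
SELECT grade_date, grade
    , last_value(grade) OVER (PARTITION BY user_id
                              ORDER BY grade_date) AS default_frame
    , last_value(grade) OVER (PARTITION BY user_id
                              ORDER BY grade_date
                              ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS full_frame
FROM sample
ORDER BY grade_date;
-- default_frame: 12, 15, 9  (the current row's own grade)
-- full_frame:     9,  9, 9  (the grade of the last row in the partition)
```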

After testing with 1 million rows, the best-performing approach for me was the MIN/MAX query from @Julien Vavasseur combined with an index on (user_id, grade, grade_date). Window functions are also worth considering when the table has a lot of rows.

drop table if exists t1;
create table t1 (user_id int, grade int, grade_date date);

-- 1 million rows; a/1000 is integer division, giving user_id values 0 to 1000
insert into t1
select round(a/1000), a, current_date + a
from generate_series(1, 1000000) a;

create index t1_idx1 on t1 using btree (user_id, grade, grade_date);

SELECT d.user_id, mn.grade as first_grade, mn.grade_date as first_date, 
 mx.grade as last_grade, mx.grade_date as last_date
FROM (
    SELECT user_id, MIN(grade_date) as min_grade_date, MAX(grade_date) as max_grade_date
    FROM t1
    GROUP BY user_id
) d
JOIN t1 mn  ON mn.grade_date = d.min_grade_date AND mn.user_id = d.user_id 
JOIN t1 mx  ON mx.grade_date = d.max_grade_date AND mx.user_id = d.user_id 
;

Before indexing: 1001 rows returned in 1400-1500 ms

After indexing: 1001 rows returned in 550-800 ms
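
To reproduce this kind of measurement, one option is to run the query under EXPLAIN, which executes it and reports the actual plan and timing. This is a sketch against the `t1` test table above; timings will differ between machines and runs:

```sql
-- Refresh planner statistics after loading data and creating the index.
ANALYZE t1;

-- ANALYZE runs the query for real; BUFFERS adds cache hit/read counts (PostgreSQL 9.0+).
EXPLAIN (ANALYZE, BUFFERS)
SELECT d.user_id, mn.grade AS first_grade, mn.grade_date AS first_date,
       mx.grade AS last_grade, mx.grade_date AS last_date
FROM (
    SELECT user_id, MIN(grade_date) AS min_grade_date, MAX(grade_date) AS max_grade_date
    FROM t1
    GROUP BY user_id
) d
JOIN t1 mn ON mn.grade_date = d.min_grade_date AND mn.user_id = d.user_id
JOIN t1 mx ON mx.grade_date = d.max_grade_date AND mx.user_id = d.user_id;
```

Running it before and after `create index` shows whether the index is actually used and how the plan changes.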

Tags:

Postgresql

Postgresql 9.4
