PostgreSQL query with max and min date plus associated id per row

You can calculate this in one step with window functions:

CREATE OR REPLACE VIEW daily_trans AS
SELECT DISTINCT
       trans_date
     , first_value(trans_time) OVER w AS first_time
     , first_value(id)         OVER w AS first_id
     , last_value(trans_time)  OVER w AS last_time
     , last_value(id)          OVER w AS last_id
     , calculate_status(min(trans_time) OVER w
                      , max(trans_time) OVER w) AS status 
FROM   trans
WINDOW w AS (PARTITION BY trans_date ORDER BY trans_time, id
             ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER  BY 1;

I added id as secondary ORDER column in the window clause, to make the sort order stable in case of identical times per day.

If you are not familiar with window functions, be sure to read this chapter of the manual.

ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING is a not so commonly used clause to adjust the window frame, because we want last_value() to always return the last value of the whole partition (the date), not the last value up to the current (including peers) according to the sort order.

How you to combine DISTINCT with window functions:

PostgreSQL: running count of rows for a query 'by minute'

Roughly, you want something like

Click to copy

select min_trans_time, min_trans.id as min_trans_time_id,
       max_trans_time, max_trans.id as max_trans_time_id
from (
    select   trans_date,
             max(trans_time) as max_trans_time, 
             min(trans_time) as min_trans_time,
    from     trans        
    group by trans_date) min_max_trans,
       trans as min_trans,
       trans as max_trans
where min_trans.trans_time = min_max_trans.min_trans_time
and   max_trans.trans_time = min_max_trans.max_trans_time

Which might not be stable if several transes share the same trans_time (that is, running the query on the same set of data might yield different results. One easy way of solving this, if it's a concern, is to pick the max or min id, for instance. Of course, this might bias the results :).

You can also use Window Functions if you are using PostgreSQL 8.4 or later; they will provide a clearer query (clear if you are familiar with window functions :), and they let you do stuff which is pretty hard to do with the above strategy (for instance, getting the 2nd-higher value instead of the maximum). However, in my short experience, they performed worse than the self-join approach I propose here. Erwin's answer contains a version of the query using window functions. It would be interesting to note which query executes better and/or can be optimized better (by adding indexes, etc.).

PostgreSQL query with max and min date plus associated id per row

Tags:

Postgresql

Aggregate

Window Functions

Related

Recent Posts