PostgreSQL query with max and min date plus associated id per row
You can calculate this in one step with window functions:
CREATE OR REPLACE VIEW daily_trans AS
SELECT DISTINCT
trans_date
, first_value(trans_time) OVER w AS first_time
, first_value(id) OVER w AS first_id
, last_value(trans_time) OVER w AS last_time
, last_value(id) OVER w AS last_id
, calculate_status(min(trans_time) OVER w
, max(trans_time) OVER w) AS status
FROM trans
WINDOW w AS (PARTITION BY trans_date ORDER BY trans_time, id
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY 1;
I added id
as secondary ORDER
column in the window clause, to make the sort order stable in case of identical times per day.
If you are not familiar with window functions, be sure to read this chapter of the manual.
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
is a not so commonly used clause to adjust the window frame, because we want last_value()
to always return the last value of the whole partition (the date), not the last value up to the current (including peers) according to the sort order.
How you to combine DISTINCT
with window functions:
- PostgreSQL: running count of rows for a query 'by minute'
Roughly, you want something like
select min_trans_time, min_trans.id as min_trans_time_id,
max_trans_time, max_trans.id as max_trans_time_id
from (
select trans_date,
max(trans_time) as max_trans_time,
min(trans_time) as min_trans_time,
from trans
group by trans_date) min_max_trans,
trans as min_trans,
trans as max_trans
where min_trans.trans_time = min_max_trans.min_trans_time
and max_trans.trans_time = min_max_trans.max_trans_time
Which might not be stable if several trans
es share the same trans_time
(that is, running the query on the same set of data might yield different results. One easy way of solving this, if it's a concern, is to pick the max or min id, for instance. Of course, this might bias the results :).
You can also use Window Functions if you are using PostgreSQL 8.4 or later; they will provide a clearer query (clear if you are familiar with window functions :), and they let you do stuff which is pretty hard to do with the above strategy (for instance, getting the 2nd-higher value instead of the maximum). However, in my short experience, they performed worse than the self-join approach I propose here. Erwin's answer contains a version of the query using window functions. It would be interesting to note which query executes better and/or can be optimized better (by adding indexes, etc.).