Get the distinct sum of a joined table column
To get the result without subquery, you have to resort to advanced window function trickery:
SELECT sum(count(*)) OVER () AS tickets_count
, sum(min(a.revenue)) OVER () AS atendees_revenue
FROM tickets t
JOIN attendees a ON a.id = t.attendee_id
GROUP BY t.attendee_id
LIMIT 1;
sqlfiddle
How does it work?
The key to understanding this is the sequence of events in the query:
aggregate functions -> window functions -> DISTINCT -> LIMIT
More details:
- Best way to get result count before LIMIT was applied
Step by step:
I
GROUP BY t.attendee_id
- which you would normally do in a subquery.Then I sum over the counts to get the total count of tickets. Not very efficient, but forced by your requirement. The aggregate function
count(*)
is wrapped in the window functionsum( ... ) OVER ()
to arrive at the not-so-common expression:sum(count(*)) OVER ()
.And sum the minimum revenue per attendee to get the sum without duplicates.
You could also use
max()
oravg()
instead ofmin()
to the same effect asrevenue
is guaranteed to be the same for every row per attendee.This could be simpler if
DISTINCT
was allowed in window functions, but PostgreSQL has not (yet) implemented this feature. Per documentation:Aggregate window functions, unlike normal aggregate functions, do not allow
DISTINCT
orORDER BY
to be used within the function argument list.Final step is to get a single row. This could be done with
DISTINCT
(SQL standard) since all rows are the same.LIMIT 1
will be faster, though. Or the SQL-standard formFETCH FIRST 1 ROWS ONLY
.
What about a simple division:
Select count(tickets.*) as tickets_count
, sum(attendees.revenue) / count(attendees.id) as atendees_revenue
from tickets LEFT OUTER JOIN attendees ON attendees.id =
tickets.attendee_id;
This should handle duplicates, triplicates, etcetera.