How to apply ORDER BY and LIMIT in combination with an aggregate function?
I am applying
order by p.name desc
in two places ... Is there a way to avoid that?
Yes. Aggregate with an ARRAY constructor in the lateral subquery directly:
SELECT c.identifier, p.persons
FROM class c
CROSS JOIN LATERAL (
SELECT ARRAY (
SELECT identifier
FROM person
WHERE class_identifier = c.identifier
ORDER BY name DESC
LIMIT 2
) AS persons
) p
ORDER BY c.identifier;
You also don't need GROUP BY
in the outer SELECT
this way. Shorter, cleaner, faster.
I replaced the LEFT JOIN
with a plain CROSS JOIN
since the ARRAY constructor always returns exactly 1 row. (Like you pointed out in a comment.)
db<>fiddle here.
Related:
- Preserve order of array elements after join
Order of rows in subqueries
To address your comment:
I learned that order of rows in a subquery is never guaranteed to be preserved in the outer query.
Well, no. While the SQL standard does not offer any guarantees, there are limited guarantees in Postgres. The manual:
This ordering is unspecified by default, but can be controlled by writing an
ORDER BY
clause within the aggregate call, as shown in Section 4.2.7. Alternatively, supplying the input values from a sorted subquery will usually work. For example:SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;
Beware that this approach can fail if the outer query level contains additional processing, such as a join, because that might cause the subquery's output to be reordered before the aggregate is computed.
If all you do in the next level is to aggregate rows, the order is positively guaranteed. Any yes, what we feed to the ARRAY constructor is a subquery, too. That's not the point. It would work with array_agg()
as well:
SELECT c.identifier, p.persons
FROM class c
CROSS JOIN LATERAL (
SELECT array_agg(identifier) AS persons
FROM (
SELECT identifier
FROM person
WHERE class_identifier = c.identifier
ORDER BY name DESC
LIMIT 2
) sub
) p
ORDER BY c.identifier;
But I expect the ARRAY constructor to be faster for the case. See:
- Why is array_agg() slower than the non-aggregate ARRAY() constructor?
- convert right side of join of many to many into array
Here's an alternative, but it is not any better than what you already have:
with enumeration (class_identifier, identifier, name, n) as (
select p.class_identifier, p.identifier, p.name
, row_number() over (partition by p.class_identifier
order by p.name desc)
from person as p
)
select c.identifier, array_agg(e.identifier order by e.n) as persons
from class as c
left join enumeration e
on c.identifier = e.class_identifier
where e.n <= 2
group by c.identifier
order by c.identifier;