In PostgreSQL, is there a type-safe first() aggregate function?

`DISTINCT ON()`

Just as a side note, this is precisely what DISTINCT ON() does (not to be confused with DISTINCT)

SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first. For example

So if you were to write,

SELECT myFirstAgg(z)
FROM foo
GROUP BY x,y;

It's effectively

SELECT DISTINCT ON(x,y) z
FROM foo;
-- ORDER BY z;

In that it takes the first z. There are two important differences,

You can also select other columns at no cost of further aggregation..

SELECT DISTINCT ON(x,y) z, k, r, t, v
FROM foo;
-- ORDER BY z, k, r, t, v;

Because there is no GROUP BY you can not use (real) aggregates with it.

CREATE TABLE foo AS
SELECT * FROM ( VALUES
  (1,2,3),
  (1,2,4),
  (1,2,5)
) AS t(x,y,z);

SELECT DISTINCT ON (x,y) z, sum(z)
FROM foo;

-- fails, as you should expect.
SELECT DISTINCT ON (x,y) z, sum(z)
FROM foo;

-- would not otherwise fail.
SELECT myFirstAgg(z), sum(z)
FROM foo
GROUP BY x,y;

Don't forget `ORDER BY`

Also, while I didn't bold it then I will now

Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first. For example

Always use an ORDER BY with DISTINCT ON

Using an Ordered-Set Aggregate Function

I imagine a lot of people are looking for first_value, Ordered-Set Aggregate Functions. Just wanted to throw that out there. It would look like this, if the function existed:

SELECT a, b, first_value() WITHIN GROUP (ORDER BY z)    
FROM foo
GROUP BY a,b;

But, alas you can do this.

SELECT a, b, percentile_disc(0) WITHIN GROUP (ORDER BY z)   
FROM foo
GROUP BY a,b;

Not a direct answer to your question but you should try the first_value window function. It works like this:

CREATE TABLE test (
    id SERIAL NOT NULL PRIMARY KEY,
    cat TEXT,
    value VARCHAR(2)
    date TIMESTAMP WITH TIME ZONE

);

Then, if you want the first item in each cat (category) you will query like that:

SELECT
    cat,
    first_value(date) OVER (PARTITION BY cat ORDER BY date)
FROM
    test;

or:

SELECT
    cat,
    first_value(date) OVER w
FROM
    test
WINDOW w AS (PARTITION BY cat ORDER BY date);

Yay, I've found out an easy way with your case by using some features in PostgreSQL 9.4+

Let's see this example:

select  (array_agg(val ORDER BY i))[1] as first_value_orderby_i,
    (array_agg(val ORDER BY i DESC))[1] as last_value_orderby_i,
    (array_agg(val))[1] as last_value_all,
    (array_agg(val))[array_length(array_agg(val),1)] as last_value_all
   FROM (
        SELECT i, random() as val
        FROM generate_series(1,100) s(i)
        ORDER BY random()
    ) tmp_tbl

I hope it will help you at your case.

In PostgreSQL, is there a type-safe first() aggregate function?

`DISTINCT ON()`

Don't forget `ORDER BY`

Using an Ordered-Set Aggregate Function

Tags:

Postgresql

Greatest N Per Group

Aggregate

Functions

Related

Recent Posts

In PostgreSQL, is there a type-safe first() aggregate function?

DISTINCT ON()

Don't forget ORDER BY

Using an Ordered-Set Aggregate Function

Tags:

Postgresql

Greatest N Per Group

Aggregate

Functions

Related

`DISTINCT ON()`

Don't forget `ORDER BY`