Select longest continuous sequence

Your description results in a table definition like this:

CREATE TABLE tbl (
   lap_id   serial PRIMARY KEY
 , lap_no   int NOT NULL
 , car_type enum NOT NULL
 , race_id  int NOT NULL  -- REFERENCES ...
 , UNIQUE(race_id, car_type, lap_no)
);

General solution for this class of problems

To get the longest sequence (1 result, longest of all, arbitrary pick if there are ties):

SELECT race_id, car_type, count(*) AS seq_len
FROM  (
   SELECT *, count(*) FILTER (WHERE step)
                      OVER (ORDER BY race_id, car_type, lap_no) AS grp
   FROM  (
      SELECT *, (lag(lap_no) OVER (PARTITION BY race_id, car_type ORDER BY lap_no) + 1)
                 IS DISTINCT FROM lap_no AS step
      FROM   tbl
      ) x
   ) y
GROUP  BY race_id, car_type, grp
ORDER  BY seq_len DESC
LIMIT  1;

count(*) FILTER (WHERE step) only counts TRUE (= step to next group), which results in a new number for every new group.

Related question on SO, one answer featuring a procedural solution with plpgsql:

GROUP BY and aggregate sequential numeric values

If the top requirement is performance, the plpgsql function is typically faster in this particular case because it can calculate the result in a single scan.

Faster for consecutive numbers

We can capitalize on the fact that consecutive lap_no define a sequence, for a much simpler and faster version:

SELECT race_id, car_type, count(*) AS seq_len
FROM  (
   SELECT race_id, car_type
        , row_number() OVER (PARTITION BY race_id, car_type ORDER BY lap_no) - lap_no AS grp
   FROM   tbl
   ) x
GROUP  BY race_id, car_type, grp
ORDER  BY seq_len DESC
LIMIT  1;

Consecutive laps end up in the same grp. Every missing lap results in a lower grp per partition.

This relies on (race_id, car_type, lap_no) being UNIQUE NOT NULL. NULL values or duplicates could break the logic.

Discussion of Jack's simpler alternative

@Jack's version effectively counts all laps (rows) where the previous lap_no in this race_id had the same car_type. That's simpler and faster and correct - as long as each car_type can only have one sequence per race_id.

But for a task that simple the query could be simpler, yet. It would follow logically that all lap_no per (car_type, race_id) must be in sequence, and we could just count the laps:

SELECT race_id, car_type, count(*) AS seq_len
FROM   tbl
GROUP  BY race_id, car_type
ORDER  BY seq_len DESC
LIMIT  1;

If, on the other hand, one car_type can have multiple separate sequences per race_id (and the question does not specify otherwise), Jack's version will fail.

Faster for a given race / car type

In reply to the comment / clarifications in the question: restricting the query to one given (race_id, car_type) will make it much faster, of course:

SELECT count(*) AS seq_len
FROM  (
   SELECT row_number() OVER (ORDER BY lap_no) - lap_no AS grp
   FROM   tbl
   WHERE  race_id = 1
   AND    car_type = 'red'
   ) x
GROUP  BY grp
ORDER  BY seq_len DESC
LIMIT  1;

db<>fiddle here
Old SQL Fiddle

Index

Key to top performance is a fitting index (except for the mentioned procedural solution working with a single sequential scan). A multicolumn index like this serves best:

CREATE INDEX tbl_mult_idx ON tbl (race_id, car_type, lap_no);

If your table has the UNIQUE constraint I assumed at the top, that is implemented with just this (unique) index internally, and you do not need to create another index.

create table tbl (lap_no int, car_type text, race_id int);

insert into tbl values (1,'red',1),(2,'red',1),(3,'red',1),(4,'red',1),
                       (1,'blue',1),(5,'red',1),(2,'blue',1),(1,'green',1);

select car_type, race_id, sum(case when lap_no=(prev+1) then 1 else 0 end)+1 seq_len
from ( select *, lag(lap_no) over (partition by car_type, race_id order by lap_no) prev 
       from tbl ) z
group by car_type, race_id
order by seq_len desc limit 1;

/*
|car_type|race_id|seq_len|
|:-------|------:|------:|
|red     |      1|      5|
*/

Select longest continuous sequence

General solution for this class of problems

Faster for consecutive numbers

Discussion of Jack's simpler alternative

Faster for a given race / car type

Index

Tags:

Postgresql

Window Functions

Gaps And Islands

Postgresql 9.0

Related

Recent Posts