Detect duplicate items in recursive CTE

The word dep in the second query (after union) is ambiguous. In fact it is interpreted as the column of rdeps, not as an alias of objectdependencies.

Click to copy

with recursive rdeps as (
  select dep
  from objectdependencies dep
  where dep.dependson = 4 -- starting point
  union all
  select dep -- this means r.dep
  from objectdependencies dep
  join rdeps r
    on (r.dep).id = dep.dependson
) select (dep).id from rdeps;

This is why the query creates an endless loop. You can correct this by changing the alias:

Click to copy

with recursive rdeps as (
  select dep
  from objectdependencies dep
  where dep.dependson = 4 -- starting point
  union all
  select objectdep
  from objectdependencies objectdep
  join rdeps r
    on (r.dep).id = objectdep.dependson
) select (dep).id from rdeps;

 id 
----
  1
  2
  3
  1
  2
  1
(6 rows)

Or better, just by using columns, like the good Lord intended:

Click to copy

with recursive rdeps as (
    select id, dependson
    from objectdependencies
    where dependson = 4
union all
    select d.id, d.dependson
    from objectdependencies d
    join rdeps r
    on r.id = d.dependson
) 
select *
from rdeps;

The first query in the question is all you can do in plain sql as there is no communication between different (paralel) branches generated by a recursive query. In a functional approach you can use a temporary table as a store common to all branches. The function may look like this:

Click to copy

create or replace function rec_function(int)
returns void language plpgsql as $$
declare
    i int;
begin
    for i in
        select id
        from objectdependencies
        where dependson = $1
    loop
        if not exists(
            select from temp_table 
            where id = i)
        then
            insert into temp_table values(i);
            perform rec_function(i);
        end if;
    end loop;
end $$;

Usage:

Click to copy

create temp table temp_table(id int);

select rec_function(4);

select *
from temp_table;

I know is't a rather old question, but I'm a little amazed that no one suggested removing all from the union to eliminate the duplicates early. It is a rather simple way to prevent duplicates in the result of a recursive CTE, but it does have its caveats — such a result must include only real fields, i.e. no calculated on the go depth, path or whatever.

Using the sample data from the question with this query (reformatted a little from the original):

Click to copy

with recursive
rdeps as (
  select dep.id, dep.id as dependson
  from objectdependencies as dep
  where dep.dependson = 4 -- starting point
  union
  select self.id, dep.dependson
  from rdeps as self 
  join objectdependencies as dep on dep.dependson = self.id
)
select dependson from rdeps;

I get precisely 1, 2, and 3.

Moreover, this solution prevents endless loop in case of a cycle in dependencies. However, it does not detect it, as it doesn't and can't show that there is a cycle, it just prevents an endless loop.

Detect duplicate items in recursive CTE

Tags:

Sql

Postgresql

Common Table Expression

Related

Recent Posts