"Merge" two rows in a Postgres table, with foreign keys

Assuming you just want to delete duplicates in books after merging duplicate authors.

BEGIN;
LOCK books, authors;

CREATE TEMP TABLE dupes ON COMMIT DROP AS (SELECT 2 AS dupe, 1 AS org);

DELETE FROM books b  -- delete duplicate books
USING  dupes d
WHERE  b.author_id = d.dupe
AND    EXISTS (
   SELECT 1
   FROM   books
   WHERE  title = b.title
   AND    author_id = d.org
   );

UPDATE books b  -- now we relink all remaining books
SET    author_id = d.org
FROM   dupes d   
WHERE  b.author_id = d.dupe;

DELETE FROM authors a  -- now we can delete all dupes
USING  dupes d
WHERE  a.id = d.dupe;

COMMIT;

The temp table could hold many rows to remove many dupes at once.

Repeat the first two steps for every tables referencing authors.id. If there are many I would create and execute the statements dynamically ...

I lock the tables explicitly to avoid concurrent disturbances.

Automation

A basic function could look like this:

CREATE OR REPLACE FUNCTION f_remove_dupe(_tbl text, _col text, _dupe int, _org int)
  RETURNS void  AS
$func$
DECLARE
   _ftbl text;
   _fcol text;
BEGIN
   FOR _ftbl, _fcol IN
      -- table and column name behind all referencing FKs
      SELECT c.conrelid::regclass::text, f.attname
      FROM   pg_attribute  a 
      JOIN   pg_constraint c ON a.attrelid = c.confrelid AND a.attnum = c.confkey[1]
      JOIN   pg_attribute  f ON f.attrelid = c.conrelid AND f.attnum = c.conkey[1]
      WHERE  a.attrelid = _tbl::regclass
      AND    a.attname  = _col
      AND    c.contype  = 'f'
   LOOP
      EXIT WHEN _ftbl IS NULL;  -- skip if not found
      EXECUTE format('
         UPDATE %1$s
         SET    %2$I = $2
         WHERE  %2$I = $1'
       , _ftbl, _fcol)
      USING _dupe, _org;
   END LOOP;

   EXECUTE format('
      DELETE FROM %I WHERE %I = $1'
    , _tbl, _col)
   USING _dupe;
END
$func$ LANGUAGE plpgsql;

Call:

SELECT f_remove_dupe('authors', 'id', 2, 1);

This simple version ...

  • ... only works for a single dupe.
  • ... ignores UNIQUE constraints in referencing tables.
  • ... assumes all FK constraints only use the one column, ignoring multi-column FKs
  • ... ignores possible interference from concurrent transactions.

Adapt to your requirements.

Related:

  • Find referenced field(s) of foreign key constraint
  • Can't execute dynamic DDL, argument is NULL
  • SQL injection in Postgres functions vs prepared queries