"Merge" two rows in a Postgres table, with foreign keys
Assuming you just want to delete duplicates in books
after merging duplicate authors.
BEGIN;
LOCK books, authors;
CREATE TEMP TABLE dupes ON COMMIT DROP AS (SELECT 2 AS dupe, 1 AS org);
DELETE FROM books b -- delete duplicate books
USING dupes d
WHERE b.author_id = d.dupe
AND EXISTS (
SELECT 1
FROM books
WHERE title = b.title
AND author_id = d.org
);
UPDATE books b -- now we relink all remaining books
SET author_id = d.org
FROM dupes d
WHERE b.author_id = d.dupe;
DELETE FROM authors a -- now we can delete all dupes
USING dupes d
WHERE a.id = d.dupe;
COMMIT;
The temp table could hold many rows to remove many dupes at once.
Repeat the first two steps for every tables referencing authors.id
. If there are many I would create and execute the statements dynamically ...
I lock the tables explicitly to avoid concurrent disturbances.
Automation
A basic function could look like this:
CREATE OR REPLACE FUNCTION f_remove_dupe(_tbl text, _col text, _dupe int, _org int)
RETURNS void AS
$func$
DECLARE
_ftbl text;
_fcol text;
BEGIN
FOR _ftbl, _fcol IN
-- table and column name behind all referencing FKs
SELECT c.conrelid::regclass::text, f.attname
FROM pg_attribute a
JOIN pg_constraint c ON a.attrelid = c.confrelid AND a.attnum = c.confkey[1]
JOIN pg_attribute f ON f.attrelid = c.conrelid AND f.attnum = c.conkey[1]
WHERE a.attrelid = _tbl::regclass
AND a.attname = _col
AND c.contype = 'f'
LOOP
EXIT WHEN _ftbl IS NULL; -- skip if not found
EXECUTE format('
UPDATE %1$s
SET %2$I = $2
WHERE %2$I = $1'
, _ftbl, _fcol)
USING _dupe, _org;
END LOOP;
EXECUTE format('
DELETE FROM %I WHERE %I = $1'
, _tbl, _col)
USING _dupe;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT f_remove_dupe('authors', 'id', 2, 1);
This simple version ...
- ... only works for a single dupe.
- ... ignores
UNIQUE
constraints in referencing tables. - ... assumes all FK constraints only use the one column, ignoring multi-column FKs
- ... ignores possible interference from concurrent transactions.
Adapt to your requirements.
Related:
- Find referenced field(s) of foreign key constraint
- Can't execute dynamic DDL, argument is NULL
- SQL injection in Postgres functions vs prepared queries