Return pre-UPDATE column values using SQL only
Problem
The manual explains:
The optional
RETURNING
clause causesUPDATE
to compute and return value(s) based on each row actually updated. Any expression using the table's columns, and/or columns of other tables mentioned inFROM
, can be computed. The new (post-update) values of the table's columns are used. The syntax of theRETURNING
list is identical to that of the output list ofSELECT
.
Bold emphasis mine. There is no way to access the old row in a RETURNING
clause. You can work around this restriction with a trigger or a separate SELECT
before the UPDATE
wrapped in a transaction or wrapped in a CTE as was commented.
However, what you are trying to achieve works perfectly fine if you join to another instance of the table in the FROM
clause:
Solution without concurrent writes
UPDATE tbl x
SET tbl_id = 23
, name = 'New Guy'
FROM tbl y -- using the FROM clause
WHERE x.tbl_id = y.tbl_id -- must be UNIQUE NOT NULL
AND x.tbl_id = 3
RETURNING y.tbl_id AS old_id, y.name AS old_name
, x.tbl_id , x.name;
Returns:
old_id | old_name | tbl_id | name
--------+----------+--------+---------
3 | Old Guy | 23 | New Guy
The column(s) used to self-join must be UNIQUE NOT NULL
. In the simple example, the WHERE
condition is on the same column tbl_id
, but that's just coincidence. Works for any conditions.
I tested this with PostgreSQL versions from 8.4 to 13.
It's different for INSERT
:
- INSERT INTO ... FROM SELECT ... RETURNING id mappings
Solutions with concurrent write load
There are various ways to avoid race conditions with concurrent write operations on the same rows. (Note that concurrent write operations on unrelated rows are no problem at all.) The simple, slow and sure (but expensive) method is to run the transaction with SERIALIZABLE
isolation level:
BEGIN ISOLATION LEVEL SERIALIZABLE;
UPDATE ... ;
COMMIT;
But that's probably overkill. And you need to be prepared to repeat the operation in case of a serialization failure.
Simpler and faster (and just as reliable with concurrent write load) is an explicit lock on the one row to be updated:
UPDATE tbl x
SET tbl_id = 24
, name = 'New Gal'
FROM (SELECT tbl_id, name FROM tbl WHERE tbl_id = 4 FOR UPDATE) y
WHERE x.tbl_id = y.tbl_id
RETURNING y.tbl_id AS old_id, y.name AS old_name
, x.tbl_id , x.name;
Note how the WHERE
condition moved to the subquery (again, can be anything), and only the self-join (on UNIQUE NOT NULL
column(s)) remains in the outer query. This guarantees that only rows locked by the inner SELECT
are processed. The WHERE
conditions might resolve to a different set of rows a moment later.
See:
- Atomic UPDATE .. SELECT in Postgres
db<>fiddle here
Old sqlfiddle
You can use a SELECT
subquery.
Example: Update a user's email RETURNING
the old value.
RETURNING
SubqueryUPDATE users SET email = '[email protected]' WHERE id = 1 RETURNING (SELECT email FROM users WHERE id = 1);
PostgreSQL WITH Query (Common Table Expressions)
WITH u AS ( SELECT email FROM users WHERE id = 1 ) UPDATE users SET email = '[email protected]' WHERE id = 1 RETURNING (SELECT email FROM u);
This has worked several times on my local database without fail, but I'm not sure if the
SELECT
inWITH
is guaranteed to consistently execute before theUPDATE
since "the sub-statements in WITH are executed concurrently with each other and with the main query."