Very slow DELETE in PostgreSQL, workaround?
I had a similar problem. As it turns out, those ON DELETE CASCADE
triggers were slowing things down quite a bit, because those cascaded deletions were awfully slow.
I solved the problem by creating indexes on the foreign key fields on the referencing tables, and I went from taking a bunch of hours for the deletion to a few seconds.
You have a few options. The best option is to run a batch delete so that triggers are not hit. Disable the triggers before deleting, then re-enable them. This saves you a very large amount of time. For example:
ALTER TABLE tablename DISABLE TRIGGER ALL;
DELETE ...;
ALTER TABLE tablename ENABLE TRIGGER ALL;
A major key here is you want to minimize the depth of subqueries. In this case you may want to set up temp tables to store relevant information so you can avoid deep subqueries on your delete.
The easiest method to solve the problem is to query detailed timing from the PostgreSQL: EXPLAIN
. For this you need to find at minimum a single query that does complete but takes longer than expected. Let's say that this line would look like
delete from mydata where id='897b4dde-6a0d-4159-91e6-88e84519e6b6';
Instead of really running that command you can do
begin;
explain (analyze,buffers,timing)
delete from mydata where id='897b4dde-6a0d-4159-91e6-88e84519e6b6';
rollback;
The rollback
at the end allows running this without really modifying the database. You still get the detailed timing of what took how much. After running that, you may find in the output that some trigger causes huge delays:
...
Trigger for constraint XYZ123: time=12311.292 calls=1
...
The time
is in ms (millisecond) so checking this contraint took about 12.3 seconds. You need to add a new INDEX
over the required columns so that this trigger can be computed effectively. For foreign key references the column that references to another table must be indexed (that is, the source column, not the target column). PostgreSQL does not automatically create such indexes for you and DELETE
is the only common query where you really really need that index. As a result, you may have accumulated years of data until you hit the case where DELETE
is too slow due missing an index.
Once you have fixed performance of that constraint (or some other thing that took overly long time), repeat the command in begin
/rollback
block so you can compare the new execution time to previous. Continue until you're happy with the single line delete response time (I got one query to go from 25.6 seconds to 15 ms or about 1700x faster simply by adding different indexes). Then you can proceed to complete your full delete without any hacks.
(Note that EXPLAIN
needs a query that can complete successfully. I once had a problem where PostgreSQL took overly long to figure out that one delete was going to violate a foreign key constraint and in that case EXPLAIN
cannot be used because it will not emit timing for failed queries. I don't know any easy to way to debug performance issues in such a case.)