Estimate end time for a long update

I know this question is old but I monitore all my updates in postgresql including updates inside transactions.. So I decided to help!

1- First create a sequence named myprogress for example.

create sequence myprogress; 

2- Make your update to hit the sequence - now you are free to start a transaction:

For example if you have this update:

update mytable set c=3 where a=0 and b=1; 

Just rewrite it to (plus this and condition):

update mytable set c=3 where ( a=0 and b=1 ) and  NEXTVAL('myprogress')!=0;

**You are ok to run this update inside a transaction, as sequences are affected globally. **

Note that this will not significantly affect performance.

3- Monitore your progress. Now you can just connect in another session (or a transaction) and select your sequence (sequences are affected globally , so you will see the value in the another sessions):

select last_value from myprogress; 

And you will see how many lines are affected by your query until now, with that you can estimate how many lines by second are hit by your update.

And in most of the cases how many time you will need to wait...

4- At end just restart the sequence:

alter sequence myprogress restart with 1; 

So you can use it again - But carefully, to do not trust this if two users use the same sequence. If you are in doubt better if you have your own progress sequence with permissions only for you.

You can use it for slow SELECTs, DELETEs, and for some INSERTs too !


You cannot reliably predict when it's going to be done.
But you can investigate causes for the unexpected slow progress.

Check for locks from concurrent transactions:
https://wiki.postgresql.org/wiki/Lock_Monitoring

Maybe there are too many concurrent connections?
https://wiki.postgresql.org/wiki/Number_Of_Database_Connections

Or stalled / stalling transactions / sessions blocking / locking resources? In Postgres 9.2 or later, check pg_stat_activity for suspicious activity:

SELECT * FROM pg_stat_activity;

In particular, check for:

  • state = 'idle': mostly harmless: an open session that doesn't do anything. You'll have some of these with connection poolers for instance.

  • state = 'idle in transaction': potential problem! An open transaction that has not committed and is still doing nothing.

  • waiting = TRUE: potential problem! An open transaction that is currently waiting on a lock.


Of course, your server may also be burdened with load outside the RDBMS. I would start with top in a Linux / Unix system ... but that's beyond the scope of this question.

Tags:

Postgresql