UPDATE vs INSERT performance

If you plan to perform a large processing (such as rating or billing for a cellular company), this question has a huge impact on system performance.

Performing large scale updates vs making many new tables and index has proven to reduce my company billing process form 26 hours to 1 hour!

I have tried it on 2 million records for 100,000 customer.
I first created the billing table and then every customer summary calls, I updated the billing table with the duration, price, discount.. a total of 10 fields.

In the second option I created 4 phases.
Each phase reads the previous table(s), creates index (after the table insert completed) and using: "insert into from select .." I have created the next table for the next phase.

Summary
Although the second alternative requires much more disk space (all views and temporary tables deleted at the end) there are 3 main advantages to this option:

  1. It was 4 time faster than option 1.
  2. In case there was a problem in the middle of the process I could start the process from the point it failed, as all the tables for the beginning of the phase were ready and the process could restart from this point. If the process fails implementing the first option, you will need to start the all the process all over again.
  3. This made the development and QA work much faster as they could work parallel.

I am not a database guru but here my two cents:

Personally I don't think you have much to do in this regard, even if INSERT would be faster (all to be proven), can you convert an update in an insert?! Frankly I don't think you can do it all the times.

During an INSERT you don't usually have to use WHERE to identify which row to update but depending on your indices on that table the operation can have some cost.

During an update if you do not change any column included in any indices you could have quick execution, if the where clause is easy and fast enough.

Nothing is written in stones and really I would imagine it depends on whole database setup, indices and so on.

Anyway, found this one as a reference:

Top 84 MySQL Performance Tips


You cannot compare an INSERT and an UPDATE in general. Give us an example (with schema definition) and we will explain which one costs more and why. Also, you can compere a concrete INSERT and an UPDATE by checking their plan and execution time.

Some rules of thumbs though:

  • if you only update only one field, which is not indexed and you only update one record and you use rowid/primary key to find that record then this UPDATE will cost less, than
  • an INSERT, which will also affect only one row, though this row will have many not null constrained, indexed fields; and all those indexes have to be maintained (e.g. add a new leaf)

The key resource here is disk access (IOPS to be precise) and we should evaluate which ones results in minimum of that.

Agree with others on how it is impossible to give a generic answer but some thoughts to lead you in the right direction , assume a simple key-value store and key is indexed. Insertion is inserting a new key and update is updating the value of an existing key.

If that is the case (a very common case) , update would be faster than insertion because update involves an indexed lookup and changing an existing value without touching the index. You can assume that is one disk read to get the data and possibly one disk write. On the other hand insertion would involve two disk writes one for index , one for data. But the another hidden cost is the btree node splitting and new node creation which would happen in background while insertion leading to more disk access on average.

Tags:

Mysql

Sql