What are the pros and cons of performing calculations in SQL vs. in your application?

It depends on a lot of factors - but most crucially:

  • complexity of calculations (prefer doing complex crunching on an app-server, since that scales out, rather than on a db server, which scales up)
  • volume of data (if you need to access/aggregate a lot of data, doing it at the db server will save bandwidth, and also disk IO if the aggregates can be computed from indexes alone)
  • convenience (SQL is not the best language for complex work: it is poor for procedural work and has lousy error-handling, but it is very good for set-based work)

As always, if you do bring the data back to the app-server, minimising the columns and rows will be to your advantage. Making sure the query is tuned and appropriately indexed will help either scenario.

Re your note:

"and then loop through the records"

Looping through records is almost always the wrong thing to do in SQL; writing a set-based operation is preferred.
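
To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database. The `orders` table, its columns and the 25% `TAX_RATE` are all made up for illustration; the point is only that the loop issues one UPDATE per record while the set-based version issues one statement in total.

```python
import sqlite3

# Hypothetical schema and data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, net REAL, gross REAL)")
conn.executemany("INSERT INTO orders (id, net) VALUES (?, ?)",
                 [(1, 100.0), (2, 250.0), (3, 40.0)])

TAX_RATE = 0.25  # assumed value, not from the original question


def update_row_by_row(conn):
    """Loop through the records: one UPDATE statement per row."""
    rows = conn.execute("SELECT id, net FROM orders").fetchall()
    for order_id, net in rows:
        conn.execute("UPDATE orders SET gross = ? WHERE id = ?",
                     (net * (1 + TAX_RATE), order_id))
    return len(rows)  # number of UPDATE statements sent


def update_set_based(conn):
    """Set-based: a single UPDATE covers every row."""
    conn.execute("UPDATE orders SET gross = net * (1 + ?)", (TAX_RATE,))
    return 1  # number of UPDATE statements sent


loop_statements = update_row_by_row(conn)
loop_result = conn.execute("SELECT gross FROM orders ORDER BY id").fetchall()

conn.execute("UPDATE orders SET gross = NULL")  # reset before the second run

set_statements = update_set_based(conn)
set_result = conn.execute("SELECT gross FROM orders ORDER BY id").fetchall()
```

Both versions leave the same values in `gross`; the loop just needs as many statements as there are rows to get there.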

As a general rule, I prefer to keep the database's job to a minimum: "store this data, fetch this data". However, there are always scenarios where an elegant query at the server can save a lot of bandwidth.

Also consider: if this is computationally expensive, can it be cached somewhere?
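
One possible shape for that, as a rough sketch: memoise the expensive call in the app with `functools.lru_cache`. The `sales` table and the `region_total` function are hypothetical, and an in-process cache is only one option (a shared cache or a materialised result in the database would be others).

```python
import sqlite3
from functools import lru_cache

# Hypothetical table standing in for something expensive to aggregate.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("north", 10), ("north", 15), ("south", 7)])


@lru_cache(maxsize=128)
def region_total(region):
    """Runs the query only on a cache miss; repeat calls reuse the result."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE region = ?",
        (region,),
    ).fetchone()
    return row[0]


first = region_total("north")   # hits the database
second = region_total("north")  # served from the cache
info = region_total.cache_info()
```

The catch is staleness: if `sales` changes, the cached totals are wrong until you call `region_total.cache_clear()` or otherwise expire them, so this only suits data that changes rarely relative to how often it is read.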

If you want an accurate answer to "which is better", code it both ways and compare (noting that a first draft of either is likely not 100% tuned). But factor typical usage into that: if, in reality, it is being called 5 times (separately) at once, then simulate that; don't just compare a single "1 of these vs 1 of those".
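
A bare-bones harness for such a comparison might look like the sketch below. It uses an in-memory SQLite table (`readings`, invented for this example) and `timeit`, so it measures neither network cost nor concurrent callers; against a real server you would run the same two functions over a real connection and from several clients at once.

```python
import sqlite3
import timeit

# Hypothetical data set: 10,000 integer readings.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (value INTEGER)")
conn.executemany("INSERT INTO readings VALUES (?)",
                 [(i,) for i in range(10_000)])


def total_in_sql():
    """Let the database aggregate; only one number comes back."""
    return conn.execute("SELECT SUM(value) FROM readings").fetchone()[0]


def total_in_app():
    """Fetch every row and add them up in the application."""
    return sum(value for (value,) in conn.execute("SELECT value FROM readings"))


# Check that both versions agree before timing them.
sql_total = total_in_sql()
app_total = total_in_app()

# Repeat each version several times to smooth out noise.
sql_seconds = timeit.timeit(total_in_sql, number=20)
app_seconds = timeit.timeit(total_in_app, number=20)
print(f"SQL: {sql_seconds:.4f}s  app: {app_seconds:.4f}s")
```

The absolute numbers mean little on their own; what matters is how the two compare under a load that resembles your real usage.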


Let me use a metaphor: if you want to buy a golden necklace in Paris, the goldsmith could sit in Cape Town or Paris; that is a matter of skill and taste. But you would never ship tons of gold ore from South Africa to France for that. The ore is processed at the mining site (or at least in the general area), and only the gold gets shipped. The same should be true for apps and databases.

As far as PostgreSQL is concerned, you can do almost anything on the server, quite efficiently. The RDBMS excels at complex queries. For procedural needs you can choose from a variety of server-side scripting languages: Tcl, Python, Perl and many more. Mostly I use PL/pgSQL, though.

The worst-case scenario would be to go to the server repeatedly for every single row of a larger set. (That would be like shipping one ton of ore at a time.)

Second in line is sending a cascade of queries, each depending on the one before, when all of it could be done in one query or procedure on the server. (That's like shipping the gold and each of the jewels on a separate ship, sequentially.)

Going back and forth between app and server is expensive, for both server and client. Try to cut down on that, and you will win - ergo: use server-side procedures and/or sophisticated SQL where necessary.
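
Here is a small sketch of the round-trip difference, written with Python's sqlite3 module only so that it runs anywhere (the same pattern applies to Postgres). The `customers` and `orders` tables are invented for the example, and with an in-memory database there is no real network hop, so the code simply counts statements instead of timing them.

```python
import sqlite3

# Hypothetical schema: customers and their orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 1, 10), (2, 1, 5), (3, 2, 7);
""")


def totals_chatty(conn):
    """One query for the customer list, then one more per customer."""
    round_trips = 1
    totals = {}
    customers = conn.execute("SELECT id, name FROM customers").fetchall()
    for customer_id, name in customers:
        row = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE customer_id = ?",
            (customer_id,),
        ).fetchone()
        round_trips += 1
        totals[name] = row[0]
    return totals, round_trips


def totals_single_query(conn):
    """The same result from a single JOIN + GROUP BY."""
    rows = conn.execute("""
        SELECT c.name, COALESCE(SUM(o.amount), 0)
        FROM customers AS c
        LEFT JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
    """).fetchall()
    return dict(rows), 1


chatty_totals, chatty_trips = totals_chatty(conn)
single_totals, single_trips = totals_single_query(conn)
```

The chatty version grows by one round trip for every extra customer, while the single query stays at one however large the set gets.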

We just finished a project where we packed almost all complex queries into Postgres functions. The app hands over parameters and gets the datasets it needs. Fast, clean, simple (for the app developer), I/O reduced to a minimum ... a shiny necklace with a low carbon footprint.


In this case you are probably slightly better off doing the calculation in SQL, as the database engine is likely to have more efficient decimal arithmetic routines than Java.

Generally, though, for row-level calculations there is not much difference.

Where it does make a difference is:

  • Aggregate calculations like SUM(), AVG(), MIN(), MAX(): here the database engine will be an order of magnitude faster than a Java implementation.
  • Anywhere the calculation is used to filter rows. Filtering at the DB is much more efficient than reading a row and then discarding it.
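
Both cases can be sketched in a few lines. The example below uses Python with an in-memory SQLite database rather than Java and your actual engine, and the `items` table and the threshold of 30 are invented, so treat it as an illustration of the shape of the queries rather than a benchmark.

```python
import sqlite3

# Hypothetical line items: unit price and quantity.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (price INTEGER, qty INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(10, 3), (4, 1), (25, 2), (2, 5)])

# Aggregates computed by the database: a single row comes back.
total, smallest, largest, average = conn.execute("""
    SELECT SUM(price * qty), MIN(price * qty), MAX(price * qty), AVG(price * qty)
    FROM items
""").fetchone()

# Calculation used as a filter: only matching rows leave the database.
filtered_in_db = conn.execute(
    "SELECT price, qty FROM items WHERE price * qty >= ? ORDER BY price",
    (30,),
).fetchall()

# App-side equivalent: read every row, then discard the ones not wanted.
all_rows = conn.execute("SELECT price, qty FROM items").fetchall()
filtered_in_app = sorted(row for row in all_rows if row[0] * row[1] >= 30)
```

The app-side filter gives the same rows, but only after transferring all four; on a large table that discarded traffic is where the cost goes.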