JDBC connection to very busy SQL 2000: selectMethod=cursor vs selectMethod=direct?

Briefly,

  1. selectMethod=cursor
    • theoretically requires more server-side resources than selectMethod=direct
    • only loads at most batch-size records into client memory at once, resulting in a more predictable client memory footprint
  2. selectMethod=direct
    • theoretically requires less server-side resources than selectMethod=cursor
    • will read the entire result set into client memory (unless the driver natively supports asynchronous result set retrieval) before the client application can iterate over it; this can reduce performance in two ways:
      1. reduced performance with large result sets if the client application is written in such a way as to stop processing after traversing only a fraction of the result set (with direct it has already paid the cost of retrieving data it will essentially throw away; with cursor the waste is limited to at most batch-size - 1 rows -- the early termination condition should probably be recoded in SQL anyway e.g. as SELECT TOP or window functions)
      2. reduced performance with large result sets because of potential garbage collection and/or out-of-memory issues associated with an increased memory footprint

In summary,

  • Can using selectMethod=cursor lower application performance? -- either method can lower performance, for different reasons. Past a certain resultset size, cursor may still be preferable. See below for when to use one or the other
  • Is selectMethod= an application-transparent setting on a JDBC connection? -- it is transparent, but it can still break their app if memory usage grows significantly enough to hog their client system (and, correspondingly, your server) or crash the client altogether
  • More generally, when should you use cursor vs direct? -- I personally use cursor when dealing with potentially large or otherwise unbounded result sets. The roundtrip overhead is then amortized given a large enough batch size, and my client memory footprint is predictable. I use direct when the size of the result set I expect is known to be inferior to whatever batch size I use with cursor, or bound in some way, or when memory is not an issue.