Connection Pool and thread pool setting in Java
You might want to think about a few more parameters:
1. Max concurrent request parameter for the database. Cloud providers have different limits of concurrent requests for different tiers, you might want to check yours.
2. When you say 50-200 ms, although it is difficult to say, are there 8 requests of 50ms and 2 requests of 200ms on an average or all of them pretty much the same? Why? Your doQuery might be limited by the query taking maximum time (which is 200ms), but the threads taking 50 ms will get released after it's task is done making them available for next set of requests.
3. What is the QPS you are expecting to receive?
Some calculations: If a single request takes 10 threads, and you have provisioned 100 connections with 100 concurrent query limit, assuming 200ms for each query, you can only handle 10 requests at a time. Maybe a little better than 10 if most queries take 50ms or so (but I wouldn't be optimistic).
Of course, some of these calculations goes for a toss if any of your queries takes >200ms (network latency or anything else) , in which case I recommend you have a circuit breaker, either at the connection end (if you are allowed to abort the query after a timeout) or at the API end.
Note : max connection limit is not the same as max concurrent query limit.
Suggestion: Since you need response under 500ms, You can also have a connectionTimeout of about 100-150ms on the pool. Worst case: 150ms connection timeout + 200ms query execution + 100ms for application processing < 500ms for your response. Works.