How does PgBouncer help speed up Django?
Besides saving the overhead of connecting and disconnecting on each request, a connection pooler can funnel a large number of client connections down to a small number of actual database connections. In PostgreSQL, the optimal number of active database connections is usually somewhere around ((2 * core_count) + effective_spindle_count). Above this number, both throughput and latency get worse. NOTE: Recent versions of PostgreSQL have improved concurrency, so in 2022 I would recommend something more like ((4 * core_count) + effective_spindle_count).
Sometimes people will say "I want to support 2000 users, with fast response time." It is pretty much guaranteed that if you try to do that with 2000 actual database connections, performance will be horrible. If you have a machine with four quad-core processors and the active data set is fully cached, you will see much better performance for those 2000 users by funneling the requests through about 35 database connections.
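As a rough sanity check, the arithmetic behind that number looks like this (the 16 cores come from the four quad-core processors above; the spindle count is an illustrative assumption, not a measurement):

```python
def pool_size(core_count: int, spindle_count: int, multiplier: int = 2) -> int:
    """Classic PostgreSQL pool-size heuristic: (multiplier * cores) + spindles."""
    return multiplier * core_count + spindle_count

# Four quad-core processors = 16 cores; assume ~3 effective spindles.
print(pool_size(16, 3))                # 35 -- the "about 35 connections" above
print(pool_size(16, 3, multiplier=4))  # 67 -- using the more recent 4x guidance
```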
To understand why that is true, this thought experiment should help. Consider a hypothetical database server machine with only one resource to share: a single core. This core will time-slice equally among all concurrent requests with no overhead. Let's say 100 requests all come in at the same moment, each of which needs one second of CPU time. The core works on all of them, time-slicing among them until they all finish 100 seconds later. Now consider what happens if you put a connection pool in front which will accept 100 client connections but make only one request at a time to the database server, putting any requests which arrive while the connection is busy into a queue. Now when 100 requests arrive at the same time, one client gets a response in 1 second, another gets a response in 2 seconds, and the last client gets a response in 100 seconds. Nobody had to wait longer for a response than in the time-sliced case, throughput is the same, but the average latency is 50.5 seconds rather than 100 seconds.
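A quick back-of-the-envelope script (not a real scheduler, just the arithmetic from the thought experiment) makes the latency difference concrete:

```python
N_REQUESTS = 100   # requests arriving at the same moment
CPU_SECONDS = 1.0  # CPU time each request needs

# Ideal time-slicing: everyone shares the core, everyone finishes together.
timeslice_latencies = [N_REQUESTS * CPU_SECONDS] * N_REQUESTS

# Pooled/queued: one request runs at a time; the i-th finishes after i seconds.
queued_latencies = [i * CPU_SECONDS for i in range(1, N_REQUESTS + 1)]

print(max(timeslice_latencies), sum(timeslice_latencies) / N_REQUESTS)  # 100.0 100.0
print(max(queued_latencies), sum(queued_latencies) / N_REQUESTS)        # 100.0 50.5
```

The worst-case latency is identical in both scenarios; only the queued version lets most clients finish far sooner.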
A real database server has more resources which can be used in parallel, but the same principle holds: once they are saturated, you only hurt things by adding more concurrent database requests. It is actually worse than the example, because with more tasks you have more task switches, increased contention for locks, L2 and L3 cache line contention, and many other issues which cut into both throughput and latency. On top of that, while a high work_mem setting can help a query in a number of ways, that setting is the per-plan-node limit for each connection, so with a large number of connections you need to keep it very small to avoid flushing the cache or even swapping, which leads to slower plans or hash tables spilling to disk.
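To see why work_mem forces this trade-off, here is a rough worst-case estimate (the work_mem value and the number of memory-hungry plan nodes per query are assumptions for illustration; real usage depends on the plans actually running):

```python
WORK_MEM_MB = 64     # per sort/hash plan node, per connection (assumed)
NODES_PER_QUERY = 4  # plan nodes that can each claim work_mem (assumed)

for connections in (35, 2000):
    worst_case_mb = connections * NODES_PER_QUERY * WORK_MEM_MB
    print(f"{connections:>5} connections -> up to {worst_case_mb / 1024:.0f} GB")
# 35 connections stay around 9 GB; 2000 connections could demand 500 GB.
```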
Some database products effectively build a connection pool into the server, but the PostgreSQL community has taken the position that since the best connection pooling is done closer to the client software, it will leave this to the users to manage. Most poolers have some way to limit the database connections to a hard number while allowing more concurrent client requests than that, queuing them as necessary. This is what you want, and it should be done on a transactional basis, not per statement or per connection.
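With PgBouncer, that translates into a pgbouncer.ini along these lines (a minimal fragment, not a complete config; the database name, addresses, and pool sizes are placeholders for your own environment):

```ini
[databases]
; Clients connect to PgBouncer as if it were the real database.
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
; Pool on a transactional basis, as described above.
pool_mode = transaction
; Client connections PgBouncer will accept and queue.
max_client_conn = 2000
; Actual connections to PostgreSQL per database/user pair.
default_pool_size = 35
```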
PgBouncer reduces the latency in establishing connections by serving as a proxy which maintains a connection pool. This may help speed up your application if you're opening many short-lived connections to Postgres. If you only have a small number of connections, you won't see much of a win.
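In a Django project, the only change is to point the database settings at PgBouncer instead of Postgres directly; a minimal sketch, assuming the names and port from the config fragment above:

```python
# settings.py -- connect through PgBouncer's listen_port, not Postgres itself.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mydb",
        "USER": "myuser",        # placeholder credentials
        "PASSWORD": "secret",    # use your real secret handling
        "HOST": "127.0.0.1",
        "PORT": "6432",          # PgBouncer, not 5432
        # PgBouncer owns the pool, so Django can close its connection freely.
        "CONN_MAX_AGE": 0,
        # Needed with pool_mode = transaction: server-side cursors can outlive
        # the transaction and land on a different server connection.
        "DISABLE_SERVER_SIDE_CURSORS": True,
    }
}
```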