Connection pooling explained

Most applications do not need many database connections, they need to reuse a few of them well. Opening a fresh connection for every request looks harmless in development and falls over in production, because a database connection carries real cost on the server and databases cap how many can exist at once. A connection pool keeps a small set of established connections and hands them out as work arrives. This post explains why a connection is expensive, what a pool does, the difference between a client side pool and a server side pooler, and how to size one without making it worse.

Why a connection is not free

Establishing a connection means a network round trip and authentication before any query runs, and each live connection then consumes resources on the server for its whole lifetime. How much, and of what, depends on the engine.

PostgreSQL: a process per connection

PostgreSQL uses a process per user model. The server listens on a port and spawns a new backend operating system process for every connection, and each client talks to exactly one backend. Processes are not cheap, so the max_connections setting both caps concurrency and sizes server resources: PostgreSQL allocates certain resources, including shared memory, in proportion to max_connections, and the value can only be changed at server start. The default is typically 100. This is why PostgreSQL in particular benefits from keeping the number of real connections low, and why pushing max_connections up to paper over a connection leak tends to hurt.

MySQL: a thread per connection

MySQL's default model runs one thread per client connection rather than a process. Threads are lighter than processes, but they are not free either: there are as many threads as connected clients, and thread creation and disposal becomes expensive while each thread needs server and kernel resources such as stack space. MySQL's max_connections defaults to 151, and the server actually permits one more than that, reserving the extra slot for an administrator with the CONNECTION_ADMIN privilege so you can still get in when the pool has exhausted the limit.

What a connection pool does

A pool opens a number of connections up front and keeps them alive. When the application needs to run a query it borrows a connection from the pool, uses it, and returns it, rather than opening and closing one. The connection setup cost is paid a handful of times at startup instead of on every request, and the number of real connections the database sees is bounded by the pool size no matter how many requests are in flight. That bound is the main prize: it protects the database from being swamped.

Client side pools versus server side poolers

Pooling can live in two places, and they solve overlapping but different problems.

A pool inside the application

A client side pool runs inside the application process. In the Java world HikariCP is the common choice, pooling JDBC connection objects within each instance of the app. This is simple and fast, but the bound is per instance: ten app instances with a pool of twenty each can still open two hundred connections to the database.

A pooler in front of the database

A server side pooler is a separate process that sits between the clients and the database and funnels many client connections down to a few database connections. PgBouncer is the standard example for PostgreSQL, describing itself as a lightweight connection pooler. Because it is central, it can hold the total number of real backends low across every app instance at once, which is exactly the pressure PostgreSQL's process per connection model is sensitive to.

PgBouncer offers three pooling modes, and the difference matters:

Session pooling: a server connection is tied to a client for as long as the client stays connected. The most compatible mode, and the least aggressive.
Transaction pooling: a server connection is assigned to a client only for the duration of a transaction, then returned to the pool. This packs far more clients onto few connections.
Statement pooling: like transaction pooling, but multi statement transactions are disallowed, effectively forcing autocommit.

Transaction pooling breaks some client expectations by design, so it can only be used if the application avoids features that rely on session state. Without extra configuration, things that do not work in transaction mode include SET and RESET, LISTEN, cursors held across transactions, PREPARE and DEALLOCATE, session level temporary tables, the LOAD statement, and session level advisory locks. If your code leans on any of those, use session pooling or change the code.

Sizing a pool

A bigger pool is not a faster one. HikariCP's guidance on pool sizing is blunt: you want a small pool saturated with threads waiting for a connection, not a large one. As a starting point it borrows a formula from the PostgreSQL community:

connections = (core_count * 2) + effective_spindle_count

For a four core machine with a single spinning disk that lands at roughly nine or ten connections, far fewer than most people guess. The spindle term is an input or output concurrency proxy that solid state storage complicates, so treat the formula as a heuristic and a place to begin, then measure your own workload. The recurring finding behind the advice is that throughput flattens and latency climbs well before the pool gets large, because the database can only do so much work in parallel regardless of how many connections are queued at it.

Stacking pools

The two layers compose. A common topology is a client side pool such as HikariCP inside each application instance, pointed at a server side pooler such as PgBouncer in transaction mode, pointed at a small number of real PostgreSQL backends. The application pool gives each instance fast local reuse, and the central pooler keeps the total backend count low across all instances. This is an architectural pattern rather than a single documented recommendation, so size both layers deliberately: the app pools should not collectively demand more than the pooler, and the pooler should not demand more backends than the database is configured for.

Conclusion

Connections cost real resources, a process each in PostgreSQL and a thread each in MySQL, and both engines cap how many can exist. A pool turns that cost into a fixed, small set of reused connections and shields the database from request spikes. Decide where the pool lives, in the application, in front of the database, or both, choose a PgBouncer mode your code can actually tolerate, and resist the urge to make the pool large. Small and busy beats large and idle, and the database will thank you for it.