ServerlessBase Blog

    Learn how database connection pooling improves application performance by reusing established connections instead of creating new ones for each request.

    Database Connection Pooling Explained

    You've probably seen this error message in your logs: FATAL: sorry, too many clients already. It happens when your application tries to connect to a database but hits the maximum allowed connections. For PostgreSQL, the default is 100 connections. For MySQL, it's often 151. If you have multiple application instances or a busy web application, you'll hit that limit quickly.

    Database connection pooling solves this problem by maintaining a pool of reusable connections instead of creating a new connection for every request. When your application needs to talk to the database, it borrows a connection from the pool, uses it, and returns it. This approach dramatically reduces connection overhead and prevents hitting connection limits.

    The Connection Problem

    Every database connection is expensive. When your application connects to a database, several things happen under the hood:

    1. Handshake: A TCP handshake establishes the network connection
    2. Protocol Negotiation: The database and client negotiate the protocol version (and possibly TLS)
    3. Authentication: The database verifies your credentials
    4. Authorization: The database checks your permissions
    5. Resource Allocation: The database reserves memory and resources for your session

    This process can take 50-200 milliseconds depending on your network and database configuration. If your application opens a fresh connection for each of 1000 requests per second, that's 50-200 seconds of cumulative setup time for every second of traffic.

    On top of that, each connection consumes memory in the database server. PostgreSQL allocates about 1-2 MB per connection for session memory. With 1000 connections, that's 1-2 GB of memory just for connection overhead.
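    The arithmetic above is easy to sanity-check. A quick back-of-envelope sketch, using the illustrative figures from the text (estimates, not measurements):

```javascript
// Back-of-envelope cost of connection-per-request at 1000 req/s,
// using the illustrative figures from the text above.
const requestsPerSecond = 1000;
const setupMsRange = [50, 200];        // per-connection setup time
const memoryMbPerConnection = 1.5;     // rough PostgreSQL session memory

// Cumulative setup time incurred for every second of traffic:
const setupSeconds = setupMsRange.map((ms) => (requestsPerSecond * ms) / 1000);
console.log(setupSeconds);             // [ 50, 200 ]

// Memory held open by 1000 simultaneous connections:
const memoryGb = (1000 * memoryMbPerConnection) / 1024;
console.log(memoryGb.toFixed(2));      // 1.46
```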

    How Connection Pooling Works

    Connection pooling maintains a pool of open database connections. When your application needs to query the database, it requests a connection from the pool. The pool either provides an existing idle connection or creates a new one if none are available.

    After the query completes, the connection isn't closed. Instead, it's returned to the pool and marked as available for future requests. This reuse eliminates the overhead of connection establishment for subsequent queries.

    Connection Pool Lifecycle

    Application Request → Check Pool → Get Connection → Execute Query → Return Connection → Pool Reuses Connection

    The pool typically has a minimum and maximum size. The minimum represents the number of connections that should always be available. The maximum is the upper limit of connections the pool will create.

    When all connections are in use and the pool is at its maximum size, new requests wait until a connection becomes available. The "too many clients" error, by contrast, appears when the combined connections from all your pools and application instances exceed the database's max_connections setting.
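    The behavior described above, reuse an idle connection, grow up to the maximum, then make new requests wait, can be sketched as a toy pool. SimplePool is purely illustrative, not a real library:

```javascript
// Toy pool, for illustration only (hypothetical class, not a library).
class SimplePool {
  constructor(max, createConn) {
    this.max = max;               // upper bound on connections
    this.createConn = createConn; // how to open a new connection
    this.total = 0;               // connections created so far
    this.idle = [];               // idle connections ready for reuse
    this.waiters = [];            // acquire() calls waiting for a release
  }

  async acquire() {
    if (this.idle.length > 0) return this.idle.pop(); // reuse: no setup cost
    if (this.total < this.max) {
      this.total++;
      return this.createConn();                       // grow up to max
    }
    // Pool exhausted: wait until another request releases a connection.
    return new Promise((resolve) => this.waiters.push(resolve));
  }

  release(conn) {
    const waiter = this.waiters.shift();
    if (waiter) waiter(conn);   // hand the connection straight to a waiter
    else this.idle.push(conn);  // otherwise keep it idle for later reuse
  }
}
```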

    Connection Pooling Implementation

    Most modern application frameworks include built-in connection pooling. Here's how it works in practice.

    Example: Node.js with pg-pool

    const { Pool } = require('pg');
     
    const pool = new Pool({
      user: 'app_user',
      host: 'localhost',
      database: 'myapp',
      password: 'secret',
      port: 5432,
      max: 20, // Maximum number of connections in the pool
      idleTimeoutMillis: 30000, // Close idle connections after 30 seconds
      connectionTimeoutMillis: 2000, // Return error if connection can't be established within 2 seconds
    });
     
    async function getUser(id) {
      const client = await pool.connect();
      try {
        const result = await client.query('SELECT * FROM users WHERE id = $1', [id]);
        return result.rows[0];
      } finally {
        client.release(); // Return connection to pool
      }
    }

    In this example, the pool maintains up to 20 connections. When getUser is called, it borrows a connection, executes the query, and returns the connection to the pool. The connection isn't closed; it's reused for the next request.
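    As an aside, node-postgres also has a shorthand for one-off statements: pool.query checks out a client, runs the query, and releases the client automatically. The sketch below takes the pool as a parameter purely to keep it self-contained; reserve the explicit connect/release pattern for transactions that span multiple statements:

```javascript
// pool.query handles checkout and release internally, so a
// single-statement read needs no explicit connect()/release().
async function getUser(pool, id) {
  const result = await pool.query('SELECT * FROM users WHERE id = $1', [id]);
  return result.rows[0];
}
```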

    Example: Python with SQLAlchemy

    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker
     
    engine = create_engine(
        'postgresql://app_user:secret@localhost/myapp',
        pool_size=10,
        max_overflow=20,
        pool_timeout=30,
        pool_recycle=3600,
    )
     
    Session = sessionmaker(bind=engine)
     
    def get_user(user_id):
        session = Session()
        try:
            user = session.query(User).filter_by(id=user_id).first()
            return user
        finally:
            session.close()  # Return connection to pool

    SQLAlchemy's engine manages a connection pool by default. The pool_size parameter sets the number of connections to keep in the pool, and max_overflow allows temporary connections beyond the pool size during peak loads.

    Connection Pool Configuration

    Choosing the right pool size is critical. Too small and you'll have connection contention. Too large and you'll waste database resources.

    Factors to Consider

    Application Concurrency: How many concurrent requests does your application handle? Remember that pools multiply: 10 application servers each running a pool of 20 connections can open up to 200 connections against the database.

    Database Capacity: How many connections can your database handle? PostgreSQL defaults to 100 connections. MySQL defaults to 151. Stay below these limits, and leave a few connections free for administrative access.

    Connection Overhead: How long does connection establishment take? The more expensive it is (say 100ms rather than 10ms), the more you gain from keeping connections warm in a pool rather than reconnecting on demand.

    Resource Usage: How much memory does each connection consume? PostgreSQL uses about 1-2 MB per connection. With 1000 connections, that's 1-2 GB of memory.

    Pool Size Guidelines

    Application Type                 Pool Size   Notes
    Single-threaded script           1-5         Minimal concurrency
    Small web app (10-50 req/s)      10-20       Low to moderate concurrency
    Medium web app (50-200 req/s)    20-50       High concurrency
    Large web app (200+ req/s)       50-100+     Very high concurrency

    These are starting points. You'll need to monitor your application and adjust based on actual usage patterns.
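    If you want a concrete first guess, one widely cited heuristic from the HikariCP project's pool-sizing guide is connections = (CPU cores x 2) + effective disk spindles. Treat it as a starting point to refine under load, not a rule:

```javascript
// Pool-size heuristic popularized by the HikariCP wiki:
// (cores * 2) + effective disk spindles.
// For SSD-backed databases the spindle term is usually taken as 1.
function suggestedPoolSize(cores, spindles = 1) {
  return cores * 2 + spindles;
}

console.log(suggestedPoolSize(4));      // 9  -- small 4-core database server
console.log(suggestedPoolSize(16, 2));  // 34 -- larger box, two disk arrays
```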

    Connection Pooling vs Connection Per Request

    To understand the value of pooling, compare it with the alternative: creating a new connection for every request.

    Connection Per Request

    const { Client } = require('pg');
     
    async function getUser(id) {
      const client = new Client({
        user: 'app_user',
        host: 'localhost',
        database: 'myapp',
        password: 'secret',
        port: 5432,
      });
      await client.connect(); // Full handshake on every call
     
      try {
        const result = await client.query('SELECT * FROM users WHERE id = $1', [id]);
        return result.rows[0];
      } finally {
        await client.end(); // Tear the connection down again
      }
    }

    This approach creates a new connection for every request. If your application handles 100 requests per second, that's 100 new connections per second. Over time, this will exhaust your database's connection limit.

    Connection Pooling

    const pool = new Pool({
      user: 'app_user',
      host: 'localhost',
      database: 'myapp',
      password: 'secret',
      port: 5432,
      max: 20,
    });
     
    async function getUser(id) {
      const client = await pool.connect();
      try {
        const result = await client.query('SELECT * FROM users WHERE id = $1', [id]);
        return result.rows[0];
      } finally {
        client.release(); // Return to pool instead of closing
      }
    }

    With pooling, the same 20 connections handle all 100 requests per second. The connections are reused, eliminating the overhead of connection establishment and preventing connection exhaustion.

    Common Connection Pooling Pitfalls

    1. Leaked Connections

    A common mistake is forgetting to release connections back to the pool. If an exception occurs after borrowing a connection but before releasing it, the connection is lost.

    async function getUser(id) {
      const client = await pool.connect();
      try {
        const result = await client.query('SELECT * FROM users WHERE id = $1', [id]);
        return result.rows[0]; // Leaked: the connection is never released
      } catch (error) {
        // Also leaked here: there is no finally block to release it
        throw error;
      }
    }

    The fix is to always release connections in a finally block:

    async function getUser(id) {
      const client = await pool.connect();
      try {
        const result = await client.query('SELECT * FROM users WHERE id = $1', [id]);
        return result.rows[0];
      } finally {
        client.release(); // Always release, even on error
      }
    }
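    You can go one step further and centralize the checkout so callers cannot forget the release. withClient below is a hypothetical helper, not part of pg:

```javascript
// Hypothetical helper (not a pg API): borrow a client, run the
// callback, and always release, so callers can't leak connections.
async function withClient(pool, fn) {
  const client = await pool.connect();
  try {
    return await fn(client);  // run the caller's work
  } finally {
    client.release();         // always runs, on success or error
  }
}

// Usage:
// const user = await withClient(pool, async (c) => {
//   const result = await c.query('SELECT * FROM users WHERE id = $1', [id]);
//   return result.rows[0];
// });
```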

    2. Connection Timeout Issues

    If your application frequently waits for a connection from the pool, you might need to increase the pool size or lengthen the connection timeout so queued requests don't fail prematurely.

    const pool = new Pool({
      max: 50, // Increase pool size
      connectionTimeoutMillis: 5000, // Increase timeout to 5 seconds
    });

    3. Connection Exhaustion

    If you see "too many clients" errors, your pool size exceeds the database's maximum connections setting. Reduce the pool size or increase the database's max_connections setting.

    -- PostgreSQL
    ALTER SYSTEM SET max_connections = 200;
    -- max_connections only takes effect after a server restart;
    -- pg_reload_conf() is not enough for this parameter.

    4. Long-Running Queries

    Long-running queries can hold connections for extended periods, starving other requests. An idle timeout only reclaims connections that are sitting unused, so pair it with a statement timeout that caps how long any single query can occupy a connection.

    const pool = new Pool({
      idleTimeoutMillis: 30000, // Close idle connections after 30 seconds
      statement_timeout: 10000, // Cancel queries running longer than 10 seconds
    });

    Connection Pooling in Different Databases

    PostgreSQL

    The most widely used pooler for PostgreSQL is PgBouncer. It's not built into PostgreSQL; it's a lightweight separate process that sits between your application and the database and manages connections on their behalf.

    # pgbouncer.ini
    [databases]
    myapp = host=localhost port=5432 dbname=myapp
     
    [pgbouncer]
    pool_mode = transaction
    max_client_conn = 1000
    default_pool_size = 20
    reserve_pool_size = 5
    reserve_pool_timeout = 3

    PgBouncer is particularly useful for high-traffic applications because it can accept thousands of client connections while opening only a small number of actual server connections to PostgreSQL.

    MySQL

    MySQL doesn't include a built-in connection pooler, and the plain Connector/J driver doesn't maintain a pool on its own. In the Java ecosystem the usual approach is a dedicated pooling library such as HikariCP:

    import com.zaxxer.hikari.HikariConfig;
    import com.zaxxer.hikari.HikariDataSource;
     
    HikariConfig config = new HikariConfig();
    config.setJdbcUrl("jdbc:mysql://localhost:3306/myapp");
    config.setUsername("app_user");
    config.setPassword("secret");
    config.setMaximumPoolSize(20);
     
    HikariDataSource dataSource = new HikariDataSource(config);

    Redis

    Redis connections can also be pooled, though it matters less than with SQL databases: node-redis pipelines commands over a single multiplexed connection, so one client is usually enough. A pool helps when you use blocking commands (BLPOP, for example) or many concurrent transactions. One option is to wrap the client in a generic pool library such as generic-pool:

    const { createClient } = require('redis');
    const genericPool = require('generic-pool');
     
    const redisPool = genericPool.createPool(
      {
        create: async () => {
          const client = createClient({ url: 'redis://localhost:6379' });
          await client.connect();
          return client;
        },
        destroy: (client) => client.quit(),
      },
      { min: 5, max: 50 }
    );

    Monitoring Connection Pool Health

    Monitoring your connection pool is essential for maintaining optimal performance.

    Key Metrics to Track

    Active Connections: Number of connections currently in use
    Idle Connections: Number of connections waiting to be reused
    Wait Time: How long requests wait for a connection
    Connection Errors: Number of connection failures
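    With node-postgres, the connection counts (plus the queue depth behind wait time) are exposed as live counters on the pool object: totalCount, idleCount, and waitingCount. A small snapshot helper makes them easy to log or export:

```javascript
// Snapshot pg's built-in pool counters for logs or a metrics endpoint.
function poolHealth(pool) {
  return {
    total: pool.totalCount,     // open connections, busy + idle
    idle: pool.idleCount,       // connections ready for reuse
    waiting: pool.waitingCount, // requests queued for a connection
  };
}

// e.g. setInterval(() => console.log(poolHealth(pool)), 10000);
```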

    Monitoring Tools

    Most application frameworks provide built-in monitoring. For example, in Node.js with pg-pool:

    pool.on('connect', () => console.log('New connection'));
    pool.on('acquire', (connection) => console.log('Connection acquired'));
    pool.on('release', (connection) => console.log('Connection released'));
    pool.on('error', (err) => console.error('Pool error:', err));

    For production applications, consider using a monitoring solution like Prometheus and Grafana to track connection pool metrics over time.

    Conclusion

    Database connection pooling is a fundamental optimization technique that every application developer should understand. By reusing connections instead of creating new ones for every request, you reduce overhead, prevent connection exhaustion, and improve application performance.

    The key takeaways are:

    1. Connections are expensive: Establishing a connection takes time and consumes database resources
    2. Pooling reuses connections: Connections are borrowed, used, and returned to the pool instead of being closed
    3. Choose the right pool size: Too small and you'll have contention; too large and you'll waste resources
    4. Monitor pool health: Track active and idle connection counts and request wait times to ensure optimal performance
    5. Handle errors properly: Always release connections in a finally block to prevent leaks

    If you're using a platform like ServerlessBase, connection pooling is handled automatically, so you can focus on building your application without worrying about connection management.

    Next Steps

    1. Review your application's connection pool configuration: Check if your pool size is appropriate for your workload
    2. Monitor connection pool metrics: Set up monitoring to track active, idle, and wait times
    3. Test under load: Simulate your expected traffic patterns to verify your pool configuration
    4. Consider PgBouncer for PostgreSQL: For high-traffic applications, PgBouncer can accept thousands of client connections while keeping only a small pool of server connections to PostgreSQL
