ServerlessBase Blog
  • Database Anti-Patterns to Avoid

    Common database design mistakes that hurt performance, scalability, and maintainability

    Database Anti-Patterns to Avoid

    You've probably inherited a database schema that makes you want to scream. Tables with cryptic names, columns that should be separate entities, indexes that slow down writes instead of speeding up reads. These aren't just bad design choices—they're anti-patterns that plague production systems and cost teams hours of debugging.

    This article covers the most common database anti-patterns you'll encounter, why they're problematic, and how to recognize them before they become technical debt.

    The Denormalization Trap

    Denormalization sounds appealing on paper. You join fewer tables, reduce query complexity, and improve read performance. But when taken to extremes, it creates a maintenance nightmare.

    The Problem

    Consider this scenario: you have an orders table with customer information embedded directly in it.

    CREATE TABLE orders (
      id INT PRIMARY KEY,
      customer_name VARCHAR(100),
      customer_email VARCHAR(100),
      order_date DATE,
      total DECIMAL(10, 2),
      -- more columns...
    );

    Every time a customer updates their email, you must update every single order they've ever placed. Every time you need to query customers by email, you're scanning the orders table instead of a dedicated customers table.

    The Better Approach

    Keep your data normalized and use proper indexing.

    CREATE TABLE customers (
      id INT PRIMARY KEY,
      name VARCHAR(100),
      email VARCHAR(100) UNIQUE,
      created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );
     
    CREATE TABLE orders (
      id INT PRIMARY KEY,
      customer_id INT,
      order_date DATE,
      total DECIMAL(10, 2),
      FOREIGN KEY (customer_id) REFERENCES customers(id)
    );

    Now you can update a customer's email in one place, and queries are faster because the database can use indexes on the dedicated tables.

    The "SELECT *" Anti-Pattern

    Selecting all columns with SELECT * is one of the most common mistakes in database queries. It seems convenient—you don't have to remember column names—but it creates significant problems.

    The Problem

    When you select all columns, you're pulling more data than you need. This wastes network bandwidth, increases memory usage, and can slow down query execution. In some databases, SELECT * can prevent the query optimizer from using certain indexes effectively.

    -- Bad: Pulling 50 columns when you only need 3
    SELECT * FROM users WHERE id = 123;

    The Better Approach

    Always specify the columns you need.

    -- Good: Only pulling what you need
    SELECT id, email, created_at FROM users WHERE id = 123;

    This is especially important when working with large tables or when the query is part of a larger application.

    The Monolithic Table Anti-Pattern

    Tables that try to do too much are a recipe for performance issues and maintenance headaches. A single table should represent one entity or concept.

    The Problem

    Imagine a single users table that stores everything: profile information, billing details, preferences, and audit logs.

    CREATE TABLE users (
      id INT PRIMARY KEY,
      username VARCHAR(50),
      email VARCHAR(100),
      password_hash VARCHAR(255),
      first_name VARCHAR(50),
      last_name VARCHAR(50),
      phone VARCHAR(20),
      address_line1 VARCHAR(100),
      address_line2 VARCHAR(100),
      city VARCHAR(50),
      state VARCHAR(50),
      zip_code VARCHAR(20),
      country VARCHAR(50),
      billing_address_line1 VARCHAR(100),
      billing_address_line2 VARCHAR(100),
      billing_city VARCHAR(50),
      billing_state VARCHAR(50),
      billing_zip_code VARCHAR(20),
      billing_country VARCHAR(50),
      preferences JSON,
      created_at TIMESTAMP,
      updated_at TIMESTAMP,
      last_login TIMESTAMP,
      -- more columns...
    );

    This table will be huge, slow to insert into, and difficult to query efficiently. Any change to the schema affects all users, and you can't easily partition or archive old data.

    The Better Approach

    Split data into separate tables based on its purpose.

    CREATE TABLE users (
      id INT PRIMARY KEY,
      username VARCHAR(50) UNIQUE,
      email VARCHAR(100) UNIQUE,
      created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );
     
    CREATE TABLE user_profiles (
      user_id INT PRIMARY KEY,
      first_name VARCHAR(50),
      last_name VARCHAR(50),
      phone VARCHAR(20),
      FOREIGN KEY (user_id) REFERENCES users(id)
    );
     
    CREATE TABLE user_addresses (
      id INT PRIMARY KEY,
      user_id INT,
      address_line1 VARCHAR(100),
      address_line2 VARCHAR(100),
      city VARCHAR(50),
      state VARCHAR(50),
      zip_code VARCHAR(20),
      country VARCHAR(50),
      is_billing BOOLEAN DEFAULT FALSE,
      FOREIGN KEY (user_id) REFERENCES users(id)
    );
     
    CREATE TABLE user_preferences (
      user_id INT PRIMARY KEY,
      preferences JSON,
      FOREIGN KEY (user_id) REFERENCES users(id)
    );

    Now each table is smaller, more focused, and easier to manage. You can index what matters, archive old addresses, and make schema changes without affecting unrelated data.

    The Missing Primary Key Anti-Pattern

    Not every table needs a primary key, but most do. Tables without primary keys are a red flag for database design issues.

    The Problem

    Tables that rely on natural keys or composite keys that aren't stable can cause significant problems.

    CREATE TABLE orders (
      order_date DATE,
      customer_id INT,
      product_id INT,
      quantity INT,
      -- No primary key!
    );

    Without a primary key, you can't reliably identify rows, update specific records, or create proper foreign key relationships. Duplicate orders become a real possibility.

    The Better Approach

    Always use a surrogate primary key.

    CREATE TABLE orders (
      id INT PRIMARY KEY AUTO_INCREMENT,
      order_date DATE,
      customer_id INT,
      product_id INT,
      quantity INT,
      FOREIGN KEY (customer_id) REFERENCES customers(id),
      FOREIGN KEY (product_id) REFERENCES products(id)
    );

    The surrogate key (id) is simple, stable, and doesn't change. It makes joins easier and ensures data integrity.

    The Over-Indexing Anti-Pattern

    Indexes are great for read performance, but they come with costs. Every index adds write overhead and consumes storage space.

    The Problem

    Creating indexes on every column you might query creates a maintenance burden and slows down insert/update operations.

    CREATE TABLE products (
      id INT PRIMARY KEY,
      name VARCHAR(100),
      description TEXT,
      price DECIMAL(10, 2),
      category_id INT,
      created_at TIMESTAMP,
      -- 20 indexes on various columns...
    );

    Every time you insert a product, the database must update all 20 indexes. For high-traffic tables, this can become a bottleneck.

    The Better Approach

    Index only the columns you actually query.

    CREATE TABLE products (
      id INT PRIMARY KEY,
      name VARCHAR(100),
      description TEXT,
      price DECIMAL(10, 2),
      category_id INT,
      created_at TIMESTAMP,
      INDEX idx_category (category_id),
      INDEX idx_price (price),
      INDEX idx_name (name)
    );

    Now you have indexes where they matter, and you're not paying the overhead for unused indexes.

    The N+1 Query Anti-Pattern

    This is a classic ORM anti-pattern that plagues many applications. It happens when you execute multiple queries in a loop instead of using a single query with joins.

    The Problem

    // Bad: N+1 queries
    const users = await db.query('SELECT * FROM users');
    for (const user of users) {
      const posts = await db.query(
        'SELECT * FROM posts WHERE user_id = ?',
        [user.id]
      );
      user.posts = posts;
    }

    If you have 100 users, this executes 101 queries. The database must parse, optimize, and execute each query separately.

    The Better Approach

    Use a single query with joins.

    // Good: Single query
    const users = await db.query(`
      SELECT users.*, posts.id as post_id, posts.title
      FROM users
      LEFT JOIN posts ON users.id = posts.user_id
    `);

    Now you have one query that returns all the data in a single round trip.

    The Ignoring Transaction Isolation Anti-Pattern

    Transactions are fundamental to database consistency, but they're often misused. Not understanding isolation levels can lead to data corruption or concurrency issues.

    The Problem

    Running transactions without specifying isolation levels can lead to unexpected behavior.

    -- Bad: Using default isolation level without understanding it
    BEGIN;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    UPDATE accounts SET balance = balance + 100 WHERE id = 2;
    COMMIT;

    If another transaction reads the accounts between these updates, it might see an inconsistent state.

    The Better Approach

    Use appropriate isolation levels for your needs.

    -- Good: Using READ COMMITTED isolation
    BEGIN TRANSACTION ISOLATION LEVEL READ COMMITTED;
    UPDATE accounts SET balance = balance - 100 WHERE id = 1;
    UPDATE accounts SET balance = balance + 100 WHERE id = 2;
    COMMIT;

    For critical operations, consider using SERIALIZABLE isolation or implementing application-level locking.

    The No Index on Foreign Keys Anti-Pattern

    Foreign keys are relationships between tables, but they don't automatically create indexes. This can make joins slow and queries inefficient.

    The Problem

    CREATE TABLE orders (
      id INT PRIMARY KEY,
      customer_id INT,
      FOREIGN KEY (customer_id) REFERENCES customers(id)
    );
     
    CREATE TABLE order_items (
      id INT PRIMARY KEY,
      order_id INT,
      product_id INT,
      FOREIGN KEY (order_id) REFERENCES orders(id)
    );

    Without indexes on the foreign key columns, queries that join these tables will scan entire tables instead of using indexes.

    The Better Approach

    Always index foreign key columns.

    CREATE TABLE orders (
      id INT PRIMARY KEY,
      customer_id INT,
      FOREIGN KEY (customer_id) REFERENCES customers(id),
      INDEX idx_customer_id (customer_id)
    );
     
    CREATE TABLE order_items (
      id INT PRIMARY KEY,
      order_id INT,
      product_id INT,
      FOREIGN KEY (order_id) REFERENCES orders(id),
      INDEX idx_order_id (order_id),
      INDEX idx_product_id (product_id)
    );

    Now joins are fast, and referential integrity checks are efficient.

    The Hardcoded Values Anti-Pattern

    Storing configuration values directly in your database schema or queries makes your application inflexible and harder to maintain.

    The Problem

    -- Bad: Hardcoded values in queries
    SELECT * FROM products WHERE category = 'electronics' AND price < 1000;

    If you need to change the category name or price threshold, you must update the query in multiple places.

    The Better Approach

    Use application-level configuration.

    const config = {
      ELECTRONICS_CATEGORY: 'electronics',
      MAX_PRICE_THRESHOLD: 1000
    };
     
    const products = await db.query(
      'SELECT * FROM products WHERE category = ? AND price < ?',
      [config.ELECTRONICS_CATEGORY, config.MAX_PRICE_THRESHOLD]
    );

    Now you can change these values in one place and they're applied everywhere.

    The Ignoring Data Types Anti-Pattern

    Using inappropriate data types wastes storage, slows down queries, and can lead to data integrity issues.

    The Problem

    -- Bad: Using VARCHAR for numeric data
    CREATE TABLE products (
      id INT PRIMARY KEY,
      price VARCHAR(20)  -- Should be DECIMAL or NUMERIC
    );

    Storing numeric data as strings prevents mathematical operations and can lead to sorting and comparison issues.

    The Better Approach

    Use the appropriate data types.

    -- Good: Using proper numeric types
    CREATE TABLE products (
      id INT PRIMARY KEY,
      price DECIMAL(10, 2)  -- Stores currency with 2 decimal places
    );

    For dates, use DATE, TIMESTAMP, or DATETIME instead of strings. For booleans, use BOOLEAN or TINYINT(1) instead of INT or VARCHAR.

    The No Query Plan Analysis Anti-Pattern

    Writing queries without understanding how they execute is like driving blindfolded. You might get where you're going, but you're likely taking inefficient routes.

    The Problem

    -- Bad: Query without understanding execution
    SELECT * FROM orders WHERE customer_id = 123;

    Is this query using an index? Is it scanning the entire table? How many rows does it return? Without this information, you can't optimize effectively.

    The Better Approach

    Analyze query execution plans.

    EXPLAIN ANALYZE
    SELECT * FROM orders WHERE customer_id = 123;

    This shows you exactly how the database plans to execute the query, which indexes it uses, and how many rows it will scan.

    The Ignoring Database Constraints Anti-Pattern

    Constraints enforce data integrity, but they're often overlooked in favor of application-level validation.

    The Problem

    -- Bad: No constraints, relying on application code
    CREATE TABLE users (
      id INT PRIMARY KEY,
      email VARCHAR(100),
      age INT
    );

    Without constraints, you can have duplicate emails, negative ages, or other invalid data that your application code must catch.

    The Better Approach

    Use database constraints.

    -- Good: Using constraints for data integrity
    CREATE TABLE users (
      id INT PRIMARY KEY,
      email VARCHAR(100) UNIQUE NOT NULL,
      age INT CHECK (age >= 0 AND age <= 150),
      created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );

    Database constraints are enforced at the database level, which is more reliable than application-level validation.

    The One-Size-Fits-All Schema Anti-Pattern

    Using the same database schema across all environments or applications ignores the different needs of development, staging, and production.

    The Problem

    -- Bad: Same schema for dev, staging, and production
    CREATE TABLE users (
      id INT PRIMARY KEY,
      -- Same schema everywhere, regardless of needs
    );

    Development databases might need more verbose logging, staging might need test data, and production needs optimized indexes and constraints.

    The Better Approach

    Use environment-specific schemas.

    -- Development: More detailed logging
    CREATE TABLE users (
      id INT PRIMARY KEY,
      email VARCHAR(100),
      created_at TIMESTAMP,
      created_by VARCHAR(50),  -- Only in dev
      last_modified TIMESTAMP,
      last_modified_by VARCHAR(50)  -- Only in dev
    );
     
    -- Production: Optimized for performance
    CREATE TABLE users (
      id INT PRIMARY KEY,
      email VARCHAR(100) UNIQUE NOT NULL,
      created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    );

    Or use database migrations to apply different schemas based on the environment.

    The No Backup Strategy Anti-Pattern

    This is the anti-pattern that gets you fired. Not having a backup strategy means you have no recovery plan for data loss, corruption, or disasters.

    The Problem

    You've never tested your backups. You don't know if they're complete or usable. And when disaster strikes, you're scrambling with no recovery plan.

    The Better Approach

    Implement a comprehensive backup strategy.

    • Regular backups: Schedule automated backups at appropriate intervals
    • Multiple backup types: Full, incremental, and differential backups
    • Backup testing: Regularly test restoring backups to ensure they're valid
    • Off-site storage: Keep backups in a separate location from the database
    • Retention policy: Define how long to keep backups based on your needs

    Platforms like ServerlessBase can help automate backup management and monitoring, so you don't have to worry about it.

    Conclusion

    Database anti-patterns are easy to fall into, especially when you're under time pressure or working with legacy code. The key is to recognize them early and refactor before they become entrenched.

    The most common anti-patterns are:

    1. Denormalization - Use proper normalization and let the database handle joins efficiently
    2. **SELECT *** - Always specify the columns you need
    3. Monolithic tables - Split data into focused, single-purpose tables
    4. Missing primary keys - Use surrogate keys for stability
    5. Over-indexing - Index only what you query
    6. N+1 queries - Use joins instead of loops
    7. Ignoring isolation levels - Understand and use appropriate transaction isolation
    8. No foreign key indexes - Index all foreign key columns
    9. Hardcoded values - Use configuration instead of magic numbers
    10. Wrong data types - Use appropriate types for your data
    11. No query analysis - Always check execution plans
    12. No constraints - Let the database enforce data integrity
    13. One-size-fits-all schemas - Use environment-specific schemas
    14. No backup strategy - Implement and test backups regularly

    The next time you're designing a database schema or reviewing existing code, ask yourself: "Is this a pattern or an anti-pattern?" If it feels like it's making things harder than they need to be, it probably is.

    Remember: good database design isn't about being clever—it's about being maintainable, performant, and reliable. Your future self (and your team) will thank you.

    Leave comment