ServerlessBase Blog

    A comprehensive guide to MongoDB indexes and query optimization techniques for improving database performance

    Understanding MongoDB Indexes and Query Optimization

    You've probably experienced the frustration of a slow database query. You write a simple query, expect it to return results instantly, and then watch the cursor spin for several seconds—or worse, minutes. This is where MongoDB indexes become critical. Without proper indexing, your database performs full collection scans, reading every document to find matching records. With the right indexes, the same query can complete in milliseconds.

    MongoDB indexes work much like the index at the back of a book: they maintain a sorted lookup structure that lets the database jump directly to the relevant data instead of scanning everything. This article covers how indexes work, when to use them, and practical techniques for optimizing your MongoDB queries.

    How MongoDB Indexes Work

    MongoDB uses a B-tree data structure for its indexes. When you create an index on a field, MongoDB builds a balanced tree whose entries pair each indexed value with a reference to the document that contains it. Because the entries are kept in sorted order, the structure handles both equality checks and range queries efficiently.

    Consider this simple example. Without an index on the email field, MongoDB must scan every document in the collection to find users with a specific email address. With an index on email, MongoDB can navigate directly to the matching document using the B-tree structure.

    // Create an index on the email field
    db.users.createIndex({ email: 1 })
     
    // Verify the index exists
    db.users.getIndexes()

    The output lists every index on the collection, including the default _id index, with its name and key pattern. Options such as unique appear only when they were set at creation time. The 1 in the key pattern indicates ascending order; you can use -1 for descending order.
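
    To make the difference between a scan and an index lookup concrete, here is a minimal sketch in plain JavaScript. It is not MongoDB's implementation: a sorted array with binary search stands in for the B-tree, and the users array, emailIndex, and both function names are made up for illustration.

```javascript
// Hypothetical data: 100,000 user documents, stored in no particular order.
const users = [];
for (let i = 0; i < 100000; i++) {
  users.push({ _id: i, email: `user${(i * 7919) % 100000}@example.com` });
}

// Full collection scan: check documents one by one until a match is found.
function collectionScan(docs, email) {
  let examined = 0;
  for (const doc of docs) {
    examined++;
    if (doc.email === email) return { doc, examined };
  }
  return { doc: null, examined };
}

// Simplified "index": [key, position] pairs kept sorted by key.
const emailIndex = users
  .map((doc, position) => [doc.email, position])
  .sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));

// Index lookup: binary search over the sorted keys.
function indexLookup(index, docs, email) {
  let low = 0;
  let high = index.length - 1;
  let examined = 0;
  while (low <= high) {
    const mid = (low + high) >> 1;
    examined++;
    const key = index[mid][0];
    if (key === email) return { doc: docs[index[mid][1]], examined };
    if (key < email) low = mid + 1;
    else high = mid - 1;
  }
  return { doc: null, examined };
}
```

    With 100,000 entries the binary search needs at most 17 key comparisons, while the scan may have to examine all 100,000 documents, which is what happens when the email does not exist.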

    Index Types and Use Cases

    MongoDB supports several index types, each optimized for different query patterns. Understanding which type to use prevents performance issues and unnecessary storage overhead.

    Single Field Indexes

    Single field indexes are the most basic type. They index a single field and are useful for equality and range queries on that field.

    // Create a single field index
    db.products.createIndex({ category: 1 })
     
    // Query using the index
    db.products.find({ category: "electronics" })

    MongoDB automatically creates a unique single field index on the _id field of every collection. To enforce uniqueness on any other field, create the index yourself with the unique: true option.

    Compound Indexes

    Compound indexes index multiple fields together. The order of fields in the index definition matters significantly. MongoDB follows the leftmost prefix rule: a query can use a compound index when it filters on the first field of the index, or on a leading run of its fields. The order in which fields appear in the query document does not matter, because the query planner matches conditions to index fields by name.

    // Create a compound index on category and price
    db.products.createIndex({ category: 1, price: 1 })
     
    // This query uses the index efficiently
    db.products.find({ category: "electronics", price: { $lt: 100 } })
     
    // This query cannot use the index efficiently: it omits the leading field
    db.products.find({ price: { $lt: 100 } })

    The second query cannot use the index efficiently because it does not filter on category, the leftmost field in the index. Writing the first query as { price: { $lt: 100 }, category: "electronics" } makes no difference: it constrains the same fields and uses the same index.
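
    One way to picture why the leading field matters is to look at how entries are ordered inside a compound index. The sketch below is plain JavaScript, not MongoDB internals; the products array and the helper names are hypothetical. It sorts entries the way a { category: 1, price: 1 } index would and checks whether the entries matching a filter sit next to each other.

```javascript
// Hypothetical products used only for this illustration.
const products = [
  { category: "books", price: 15 },
  { category: "electronics", price: 250 },
  { category: "toys", price: 40 },
  { category: "electronics", price: 60 },
  { category: "books", price: 120 },
  { category: "electronics", price: 90 },
];

// Entries of a { category: 1, price: 1 } index: sorted by category, then price.
const compoundIndex = products
  .map((p) => ({ category: p.category, price: p.price }))
  .sort((a, b) =>
    a.category === b.category
      ? a.price - b.price
      : a.category < b.category ? -1 : 1
  );

// Positions in the index whose entries satisfy a predicate.
function matchingPositions(index, predicate) {
  return index.map((e, i) => (predicate(e) ? i : -1)).filter((i) => i >= 0);
}

// True when the positions form one unbroken run (a single range scan).
function isContiguous(positions) {
  return positions.every((p, i) => i === 0 || p === positions[i - 1] + 1);
}

const byCategoryAndPrice = matchingPositions(
  compoundIndex,
  (e) => e.category === "electronics" && e.price < 100
);
const byPriceOnly = matchingPositions(compoundIndex, (e) => e.price < 100);
```

    The combined filter maps to one contiguous run of entries (positions 2 and 3), so a single bounded range scan finds it. The price-only filter matches scattered positions (0, 2, 3, 5), which is why an index that leads with category gives it little help.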

    Multikey Indexes

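    Multikey indexes index every value in an array field. You don't request one explicitly: MongoDB makes an index multikey automatically when the indexed field holds an array. This is particularly useful for querying array contents.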

    // Create a multikey index on tags
    db.articles.createIndex({ tags: 1 })
     
    // Query using the multikey index
    db.articles.find({ tags: "mongodb" })

    MongoDB creates a separate index entry for each array element, allowing efficient queries on array contents.
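
    The "one entry per array element" idea can be sketched in a few lines of plain JavaScript. This is a simplified model rather than MongoDB's storage format; the articles array and buildMultikeyEntries are hypothetical names, and the sketch keeps one entry per distinct value per document.

```javascript
// Hypothetical articles with an array-valued tags field.
const articles = [
  { _id: 1, tags: ["mongodb", "indexes"] },
  { _id: 2, tags: ["performance"] },
  { _id: 3, tags: ["mongodb", "performance", "scaling"] },
];

// Emit one [key, _id] entry per distinct array element, sorted by key then _id.
function buildMultikeyEntries(docs, field) {
  const entries = [];
  for (const doc of docs) {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of new Set(keys)) entries.push([key, doc._id]);
  }
  return entries.sort((a, b) =>
    a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : a[1] - b[1]
  );
}

const tagEntries = buildMultikeyEntries(articles, "tags");

// Documents whose tags array contains "mongodb".
const mongodbIds = tagEntries
  .filter(([key]) => key === "mongodb")
  .map(([, id]) => id);
```

    Three documents produce six index entries here, which also hints at the cost: large arrays multiply the number of entries an index has to store and maintain.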

    Text Indexes

    Text indexes support full-text search functionality. They tokenize text fields and create inverted indexes for fast text search.

    // Create a text index
    db.articles.createIndex({ content: "text" })
     
    // Perform a text search
    db.articles.find(
      { $text: { $search: "mongodb optimization" } },
      { score: { $meta: "textScore" } }
    )

    Text indexes are useful for search features, but they have higher storage overhead and don't support range queries. A collection can have at most one text index, although that index can cover several fields.
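
    As a rough illustration of tokenizing and inverted indexes, the plain JavaScript sketch below maps each term to the documents containing it. It is deliberately simplified: real text indexes also apply language-specific stemming and stop-word removal, which this sketch skips. The posts array and the function names are hypothetical.

```javascript
// Hypothetical documents used only for this illustration.
const posts = [
  { _id: 1, content: "MongoDB query optimization basics" },
  { _id: 2, content: "Optimization tips for SQL" },
  { _id: 3, content: "Scaling MongoDB clusters" },
  { _id: 4, content: "Deploying web servers" },
];

// Lowercase and split on anything that is not a letter or digit.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Map each term to the set of document ids that contain it.
function buildInvertedIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    for (const term of tokenize(doc[field])) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(doc._id);
    }
  }
  return index;
}

// Ids of documents containing ANY of the search terms, similar to how
// a multi-word $search string is treated as a logical OR of its terms.
function searchAny(index, query) {
  const ids = new Set();
  for (const term of tokenize(query)) {
    for (const id of index.get(term) ?? []) ids.add(id);
  }
  return [...ids].sort((a, b) => a - b);
}

const contentIndex = buildInvertedIndex(posts, "content");
const hits = searchAny(contentIndex, "mongodb optimization");
```

    Searching for "mongodb optimization" returns documents 1, 2, and 3, since each contains at least one of the two terms. Document 4 is never touched, because neither term points to it.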

    Geospatial Indexes

    Geospatial indexes enable location-based queries. MongoDB offers 2dsphere indexes for spherical (Earth-like) geometry and 2d indexes for points on a flat plane.

    // Create a 2dsphere index for geospatial queries
    db.locations.createIndex({ location: "2dsphere" })
     
    // Query within a 10 km radius ($centerSphere expects radians: distance / Earth's radius of about 6371 km)
    db.locations.find({
      location: {
        $geoWithin: {
          $centerSphere: [[-73.97, 40.77], 10 / 6371]
        }
      }
    })
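
    The radius passed to $centerSphere is an angle in radians, which is why the query divides kilometers by the Earth's radius. The plain JavaScript sketch below reproduces that conversion and checks a point against the circle with the haversine formula. It is an approximation for illustration, not the server's geometry code, and the function names are hypothetical.

```javascript
const EARTH_RADIUS_KM = 6371; // mean Earth radius, matching the query above

// Convert a distance in kilometers to an angle in radians.
function kmToRadians(km) {
  return km / EARTH_RADIUS_KM;
}

// Great-circle angle in radians between two [longitude, latitude] points (haversine).
function angularDistance([lon1, lat1], [lon2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * Math.asin(Math.sqrt(a));
}

// True when the point lies inside the spherical cap around center.
function withinCenterSphere(point, center, radiusRadians) {
  return angularDistance(point, center) <= radiusRadians;
}

const center = [-73.97, 40.77];
const nearby = withinCenterSphere([-73.98, 40.76], center, kmToRadians(10));
const farAway = withinCenterSphere([-74.5, 40.77], center, kmToRadians(10));
```

    The first point is a little over a kilometer from the center and falls inside the circle; the second is roughly 45 km to the west and falls outside it. Note that coordinates are given as [longitude, latitude], in that order.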

    Index Selection and Query Analysis

    Choosing the right indexes requires understanding your query patterns. MongoDB provides tools to analyze query performance and determine which indexes are being used.

    Using Explain()

    The explain() method shows how MongoDB executes a query and which indexes it uses.

    // Analyze query execution
    db.users.find({ email: "user@example.com" }).explain("executionStats")

    The execution stats include:

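    • executionTimeMillis: Total query execution time in milliseconds
    • nReturned: Number of documents the query returned
    • totalDocsExamined: Number of documents scanned
    • totalKeysExamined: Number of index keys examined
    • stage: The stage in the winning plan (e.g., IXSCAN for an index scan, COLLSCAN for a full collection scan)

    A well-chosen index keeps totalDocsExamined close to nReturned, the number of matching documents. When totalDocsExamined is much higher than nReturned, MongoDB is reading documents it then discards, which usually points to a missing or poorly matched index.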
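
    The comparison described above can be turned into a small helper. This is a sketch in plain JavaScript that assumes an object with the nReturned, totalDocsExamined, and totalKeysExamined fields from an executionStats section. The function name, the sample numbers, and the threshold of 10 are hypothetical choices for illustration, not MongoDB rules.

```javascript
// Summarize the execution statistics that matter most for index tuning.
function summarizeExecutionStats(stats) {
  const { nReturned, totalDocsExamined, totalKeysExamined } = stats;
  // Documents read per document returned; 1 is ideal, larger means wasted reads.
  const docsPerResult =
    nReturned === 0 ? totalDocsExamined : totalDocsExamined / nReturned;
  return {
    docsPerResult,
    // Heuristic only: a full collection scan examines no index keys.
    examinedIndexKeys: totalKeysExamined > 0,
    // Arbitrary illustrative threshold.
    needsAttention: docsPerResult > 10,
  };
}

// Made-up numbers for a query without and with a suitable index.
const withoutIndex = summarizeExecutionStats({
  nReturned: 1,
  totalDocsExamined: 50000,
  totalKeysExamined: 0,
});
const withIndex = summarizeExecutionStats({
  nReturned: 1,
  totalDocsExamined: 1,
  totalKeysExamined: 1,
});
```

    In the first case MongoDB reads 50,000 documents to return one; in the second it reads exactly the document it returns. A ratio near 1 is the goal, whatever threshold you choose for flagging queries.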

    Index Usage Patterns

    The examples below assume the compound index { category: 1, price: 1 } on products. MongoDB uses a compound index most efficiently when equality conditions apply to its leading fields and a range condition applies to the field that follows them. A query that skips the leading field cannot use the index effectively:

    // Uses the index through its category prefix
    db.products.find({ category: "electronics" })
     
    // Uses both index fields: equality on category, then a range on price
    db.products.find({ category: "electronics", price: { $gt: 50 } })
     
    // Cannot use the index efficiently: no condition on category
    db.products.find({ price: { $gt: 50 } })

    The third query cannot use the index efficiently because it has no condition on category, the leftmost index field. Reordering the fields inside a query document changes nothing; only the set of fields being filtered matters.

    Index Optimization Strategies

    Optimizing MongoDB indexes involves balancing query performance with storage overhead and write performance.

    Index Cardinality

    Index cardinality refers to the number of unique values in a field. High cardinality fields (many unique values) make effective indexes. Low cardinality fields (few unique values) create less effective indexes.

    // High cardinality - good for indexing
    db.users.createIndex({ email: 1 })
     
    // Low cardinality - less effective index
    db.users.createIndex({ status: 1 })

    Fields with few unique values (like status flags) benefit less from indexes and may actually slow down writes due to index maintenance overhead.
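
    Cardinality is easy to estimate from a sample of documents. The plain JavaScript sketch below uses a made-up sample of 1,000 users; fieldCardinality and avgDocsPerValue are hypothetical helper names.

```javascript
// Hypothetical sample: unique emails, but only three possible status values.
const statuses = ["active", "inactive", "pending"];
const sampleUsers = Array.from({ length: 1000 }, (_, i) => ({
  email: `user${i}@example.com`,
  status: statuses[i % statuses.length],
}));

// Number of distinct values a field takes across the sample.
function fieldCardinality(docs, field) {
  return new Set(docs.map((doc) => doc[field])).size;
}

// Average number of documents sharing one value of the field.
function avgDocsPerValue(docs, field) {
  return docs.length / fieldCardinality(docs, field);
}

const emailCardinality = fieldCardinality(sampleUsers, "email");
const statusCardinality = fieldCardinality(sampleUsers, "status");
```

    An equality match on email narrows the sample to a single document, while a match on status still leaves about a third of the collection to read.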

    Index Selectivity

    Index selectivity measures how well an index filters documents. High selectivity indexes return few matching documents, making them more effective.

    // High selectivity - filters many documents
    db.users.createIndex({ email: 1 })
     
    // Low selectivity - returns many documents
    db.users.createIndex({ age: 1 })

    For low selectivity fields, consider compound indexes with higher selectivity fields to improve query performance.
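
    The effect of adding a second field can be measured on sample data. In the plain JavaScript sketch below, matchFraction reports the share of documents a filter matches, so a smaller number means a more selective filter. The people array, the city values, and the function name are hypothetical.

```javascript
// Hypothetical sample: ages cycle through 20..29, cities through four values.
const cities = ["Berlin", "Oslo", "Lima", "Pune"];
const people = Array.from({ length: 1000 }, (_, i) => ({
  age: 20 + (i % 10),
  city: cities[i % cities.length],
}));

// Fraction of documents that satisfy the predicate (lower = more selective).
function matchFraction(docs, predicate) {
  return docs.filter(predicate).length / docs.length;
}

const ageOnly = matchFraction(people, (p) => p.age === 25);
const ageAndCity = matchFraction(
  people,
  (p) => p.age === 25 && p.city === "Oslo"
);
```

    In this sample, filtering on age alone matches 10% of the documents, and adding the city condition halves that to 5%. A compound index covering both fields lets MongoDB narrow the search by both conditions instead of fetching every 25-year-old first.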

    Index Prefixes

    Compound indexes work best when queries use the leftmost index fields. This is known as the index prefix rule.

    // Create a compound index
    db.orders.createIndex({ customer_id: 1, order_date: 1, total: 1 })
     
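    // This query uses the index efficiently (prefix: customer_id, order_date)
    db.orders.find({ customer_id: 123, order_date: { $gte: ISODate("2026-01-01") } })
     
    // This query does NOT use the index efficiently (customer_id is missing)
    db.orders.find({ order_date: { $gte: ISODate("2026-01-01") }, total: { $gt: 100 } })

    The second query cannot use the index efficiently because it omits customer_id, the leftmost field, so its conditions do not form a prefix of the index. Dates are compared as ISODate values here; a string such as "2026-01-01" does not match fields stored as dates.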
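
    The prefix rule can be expressed as a short check. The plain JavaScript sketch below is a simplification of what the query planner considers; indexPrefixes and usablePrefixLength are hypothetical helpers that look only at which fields a query filters on.

```javascript
// All prefixes of a compound key pattern, e.g. [a], [a, b], [a, b, c].
function indexPrefixes(keyPattern) {
  const fields = Object.keys(keyPattern);
  return fields.map((_, i) => fields.slice(0, i + 1));
}

// Number of leading index fields that the query constrains.
// The order of fields inside the query object is irrelevant.
function usablePrefixLength(keyPattern, query) {
  const queried = new Set(Object.keys(query));
  let length = 0;
  for (const field of Object.keys(keyPattern)) {
    if (!queried.has(field)) break;
    length++;
  }
  return length;
}

const ordersIndex = { customer_id: 1, order_date: 1, total: 1 };

const customerAndDate = usablePrefixLength(ordersIndex, {
  customer_id: 123,
  order_date: { $gte: "placeholder" },
});
const dateAndTotal = usablePrefixLength(ordersIndex, {
  order_date: { $gte: "placeholder" },
  total: { $gt: 100 },
});
const customerAndTotal = usablePrefixLength(ordersIndex, {
  total: { $gt: 100 },
  customer_id: 123,
});
```

    The first query covers a two-field prefix, the second covers none, and the third covers only customer_id because order_date sits between the two fields it filters on. A result of 0 means the index offers no efficient entry point for that query.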

    Practical Indexing Walkthrough

    Let's walk through a practical example of optimizing a MongoDB collection for an e-commerce application.

    Scenario

    You have an orders collection with the following structure:

    {
      _id: ObjectId("..."),
      customer_id: NumberInt(123),
      order_date: ISODate("2026-01-15T10:30:00Z"),
      total: NumberDecimal("99.99"),
      status: "shipped",
      items: [
        { product_id: 456, quantity: 2, price: NumberDecimal("49.99") }
      ]
    }

    You need to support these queries:

    1. Get all orders for a specific customer
    2. Get orders for a customer within a date range
    3. Get orders for a customer with total above a threshold
    4. Get recent orders across all customers

    Initial Index Strategy

    Start by creating indexes for the most common queries:

    // Index for customer-specific queries
    db.orders.createIndex({ customer_id: 1, order_date: 1 })
     
    // Index for total-based queries
    db.orders.createIndex({ total: 1 })
     
    // Index for status-based queries
    db.orders.createIndex({ status: 1 })

    Query Analysis

    Analyze each query to verify index usage:

    // Query 1: Customer-specific orders
    db.orders.find({ customer_id: 123 }).explain("executionStats")
     
    // Query 2: Customer orders in date range
    db.orders.find({
      customer_id: 123,
      order_date: { $gte: ISODate("2026-01-01") }
    }).explain("executionStats")
     
    // Query 3: Customer orders above a threshold
    db.orders.find({
      customer_id: 123,
      total: { $gt: NumberDecimal("100") }
    }).explain("executionStats")
     
    // Query 4: Recent orders
    db.orders.find({}).sort({ order_date: -1 }).limit(10).explain("executionStats")
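
    When reading explain output for several queries, it helps to reduce each winning plan to its list of stages. The plain JavaScript sketch below walks a plan tree through its inputStage and inputStages fields. The two sample plans are abbreviated, hypothetical shapes; real explain output carries more fields and can differ between server versions.

```javascript
// Collect stage names from a winningPlan-shaped tree, outermost stage first.
function collectStages(plan) {
  if (!plan) return [];
  const children =
    plan.inputStages ?? (plan.inputStage ? [plan.inputStage] : []);
  return [plan.stage, ...children.flatMap(collectStages)];
}

// Flag the two patterns that most often signal a missing index.
function classifyPlan(plan) {
  const stages = collectStages(plan);
  return {
    stages,
    fullScan: stages.includes("COLLSCAN"),
    inMemorySort: stages.includes("SORT"),
  };
}

// Abbreviated, hypothetical plan for a query served by a compound index.
const indexedPlan = {
  stage: "FETCH",
  inputStage: { stage: "IXSCAN", indexName: "customer_id_1_order_date_1" },
};

// Abbreviated, hypothetical plan for a sort with no supporting index.
const unindexedSortPlan = {
  stage: "SORT",
  inputStage: { stage: "COLLSCAN" },
};

const indexedResult = classifyPlan(indexedPlan);
const sortResult = classifyPlan(unindexedSortPlan);
```

    A COLLSCAN under a SORT stage means MongoDB read the whole collection and sorted it in memory, which is the typical sign that a query sorting on order_date lacks an index that starts with that field.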

    Optimization Refinements

    After analyzing the queries, you might discover that some indexes are redundant or not being used effectively. Consider these refinements:

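    // Drop the standalone total index: query 3 always filters on customer_id as well
    db.orders.dropIndex("total_1")
     
    // Compound index for status filters sorted by newest first
    db.orders.createIndex({ status: 1, order_date: -1 })
     
    // Support query 3: equality on customer_id, then a range on total
    db.orders.createIndex({ customer_id: 1, total: 1 })
     
    // Support query 4: an index on order_date lets the sort avoid scanning every order
    db.orders.createIndex({ order_date: -1 })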
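
    Redundant indexes can often be spotted mechanically: an index whose key pattern is a leading prefix of another index is usually covered by the longer one. The plain JavaScript sketch below applies that check to an array shaped like getIndexes() output. The customer_id_1 entry is a hypothetical extra index added for the example, and the check ignores options such as unique, sparse, or TTL settings, which can make a prefix index worth keeping.

```javascript
// True when shorter is a leading prefix of longer (same fields, same directions).
function isPrefixOf(shorter, longer) {
  const a = Object.entries(shorter);
  const b = Object.entries(longer);
  return (
    a.length < b.length &&
    a.every(([field, dir], i) => b[i][0] === field && b[i][1] === dir)
  );
}

// Names of indexes whose key pattern is a prefix of some other index.
function findRedundantIndexes(indexes) {
  return indexes
    .filter((idx) => indexes.some((other) => isPrefixOf(idx.key, other.key)))
    .map((idx) => idx.name);
}

// Hypothetical index list, shaped like the output of getIndexes().
const orderIndexes = [
  { key: { _id: 1 }, name: "_id_" },
  { key: { customer_id: 1 }, name: "customer_id_1" },
  { key: { customer_id: 1, order_date: 1 }, name: "customer_id_1_order_date_1" },
  { key: { total: 1 }, name: "total_1" },
];

const redundant = findRedundantIndexes(orderIndexes);
```

    Here only customer_id_1 is reported, because every query it can serve is also served by customer_id_1_order_date_1. Treat the result as a list of candidates to review, not as a list of indexes to drop blindly.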

    Monitoring Index Usage

    Regularly monitor index usage to identify unused or underutilized indexes:

    // Check index usage statistics
    db.orders.aggregate([
      {
        $indexStats: {}
      }
    ])
     
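    // Identify indexes with no recorded use over at least 7 days of tracking.
    // accesses.ops counts operations that used the index since accesses.since,
    // which resets when mongod restarts or the index is rebuilt.
    db.orders.aggregate([
      { $indexStats: {} },
      {
        $match: {
          "accesses.ops": 0,
          "accesses.since": { $lte: new Date(Date.now() - 7 * 24 * 60 * 60 * 1000) }
        }
      }
    ])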
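
    Raw counters are hard to compare when indexes have been tracked for different lengths of time, so it can help to normalize them into a rate. The plain JavaScript sketch below assumes an array of documents carrying the name, accesses.ops, and accesses.since fields that $indexStats reports, with ops as a plain number. The sample values and the usageRates helper are hypothetical.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Operations per day for each index, least used first.
function usageRates(stats, now) {
  return stats
    .map((s) => {
      // Treat anything tracked for less than a day as one day to avoid inflated rates.
      const days = Math.max((now - s.accesses.since) / DAY_MS, 1);
      return { name: s.name, opsPerDay: s.accesses.ops / days };
    })
    .sort((a, b) => a.opsPerDay - b.opsPerDay);
}

// Hypothetical statistics, all tracked for exactly 10 days.
const now = new Date("2026-02-01T00:00:00Z");
const since = new Date("2026-01-22T00:00:00Z");
const sampleStats = [
  { name: "customer_id_1_order_date_1", accesses: { ops: 5000, since } },
  { name: "status_1", accesses: { ops: 0, since } },
  { name: "total_1", accesses: { ops: 20, since } },
];

const rates = usageRates(sampleStats, now);
```

    In this sample, status_1 has never been used and total_1 averages two operations per day, so both are candidates for review, while the compound index clearly earns its keep at 500 operations per day.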

    Index Maintenance and Best Practices

    Proper index maintenance ensures optimal database performance over time.

    Index Cleanup

    Remove indexes that are no longer needed:

    // Drop unused indexes
    db.orders.dropIndex("total_1")
     
    // List all indexes
    db.orders.getIndexes()

    Index Rebuilding

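    Rebuilding indexes is rarely necessary with the default WiredTiger storage engine, but it can reclaim space after a large number of deletes:

    // Rebuild all indexes on the collection.
    // reIndex() is deprecated in recent MongoDB releases and runs only on standalone instances.
    db.orders.reIndex()
     
    // Alternative: drop and recreate indexes.
    // dropIndexes() removes every index except _id, so queries run unindexed until the rebuild finishes.
    db.orders.dropIndexes()
    db.orders.createIndex({ customer_id: 1, order_date: 1 })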

    Index Size Monitoring

    Monitor index size to prevent excessive storage usage:

    // Check index sizes in bytes, per index and in total
    db.orders.aggregate([
      { $collStats: { storageStats: {} } },
      {
        $project: {
          indexSizes: "$storageStats.indexSizes",
          totalIndexSize: "$storageStats.totalIndexSize"
        }
      }
    ])
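
    Once you have per-index sizes, ranking them shows where the storage goes. The plain JavaScript sketch below assumes an object that maps index names to sizes in bytes, the shape in which collection storage statistics report index sizes. The sample numbers and the rankIndexSizes helper are hypothetical.

```javascript
// Rank indexes by size and compute each one's share of total index storage.
function rankIndexSizes(indexSizes) {
  const total = Object.values(indexSizes).reduce((sum, b) => sum + b, 0);
  return Object.entries(indexSizes)
    .map(([name, bytes]) => ({
      name,
      bytes,
      share: total === 0 ? 0 : bytes / total,
    }))
    .sort((a, b) => b.bytes - a.bytes);
}

// Hypothetical sizes in bytes.
const ranked = rankIndexSizes({
  _id_: 200,
  customer_id_1_order_date_1: 600,
  status_1: 200,
});
```

    Combining this ranking with usage statistics is a reasonable way to prioritize cleanup: a large index that is rarely used costs the most and returns the least.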

    Common Indexing Pitfalls

    Avoid these common mistakes when working with MongoDB indexes:

    Over-indexing

    Creating too many indexes slows down write operations and increases storage requirements. Only create indexes for queries that actually benefit from them.
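
    A rough model makes the write cost visible: every insert has to add at least one entry to every index on the collection, and more when an indexed field holds an array. The plain JavaScript sketch below counts those entries. It is a simplification that ignores dotted field paths and special index types, and the sample document, key patterns, and function names are hypothetical.

```javascript
// Number of index entries one document contributes to a single index.
function entriesForIndex(doc, keyPattern) {
  let entries = 1;
  for (const field of Object.keys(keyPattern)) {
    const value = doc[field];
    if (Array.isArray(value)) {
      // One entry per distinct element; an empty array still counts as one entry.
      entries = Math.max(entries, new Set(value).size);
    }
  }
  return entries;
}

// Total index entries written when the document is inserted.
function entriesPerWrite(doc, keyPatterns) {
  return keyPatterns.reduce((sum, kp) => sum + entriesForIndex(doc, kp), 0);
}

// Hypothetical document and index set.
const newOrder = { _id: 1, customer_id: 7, status: "new", tags: ["a", "b", "c"] };

const minimal = entriesPerWrite(newOrder, [{ _id: 1 }]);
const heavilyIndexed = entriesPerWrite(newOrder, [
  { _id: 1 },
  { customer_id: 1 },
  { status: 1 },
  { tags: 1 },
  { customer_id: 1, tags: 1 },
]);
```

    With only the _id index, the insert writes one index entry; with five indexes, two of them multikey, the same insert writes nine. Updates to indexed fields and deletes pay a similar price.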

    Ignoring Index Prefixes

    A compound index only works efficiently for queries that filter on its leftmost fields. Always consider query patterns when designing compound indexes.

    Low Cardinality Fields

    Indexing a field with few unique values (like a status flag) on its own provides minimal benefit and adds write overhead. Such fields are more useful as part of a compound index alongside a more selective field.

    Ignoring Index Usage

    Regularly analyze query execution plans to ensure indexes are being used effectively. Unused indexes waste storage and slow down writes.

    Conclusion

    MongoDB indexes are essential for query performance, but they require careful design and maintenance. The key principles are:

    1. Understand your query patterns before creating indexes
    2. Use the right index type for your use case
    3. Consider index cardinality and selectivity when choosing fields to index
    4. Monitor index usage and remove unused indexes
    5. Balance read performance with write performance and storage overhead

    Platforms like ServerlessBase simplify database management by providing automated monitoring and optimization tools, but understanding the fundamentals of indexing remains essential for effective database performance tuning.

    The next step is to analyze your actual query patterns and create indexes that address your specific use cases. Start with the most common queries, measure their performance, and iteratively refine your index strategy based on real-world usage data.
