How to Read and Understand Query Execution Plans
You've written a query that runs in seconds, but after adding a few more conditions or joining another table, it suddenly takes minutes. The database isn't slow — your query is asking for more work than necessary. The tool that tells you exactly what the database is doing is the execution plan.
An execution plan is a roadmap of how the database engine will execute your query. It shows the order of operations, which indexes are used, whether tables are scanned or joined, and where the bottlenecks are. Reading it correctly is a core skill for anyone working with databases.
What an Execution Plan Actually Shows
Think of an execution plan as a recipe. It tells you the steps the database will take to get your result. Each step has a cost — how much work it will do. The database optimizer tries to find the path with the lowest total cost, but it doesn't always guess right.
The plan shows several key pieces of information:
- Operation type: Scan, join, filter, aggregate, sort, etc.
- Table access method: Full table scan, index scan, index-only scan
- Index used: Which index (if any) is consulted
- Estimated rows: How many rows the database expects to process
- Estimated cost: Relative cost of the operation
- Actual rows: How many rows were actually processed (important for validation)
The most important thing to understand is that the database doesn't always pick the fastest path. It makes educated guesses based on statistics it collected about your data. If those statistics are outdated or your data distribution is skewed, the optimizer can make poor choices.
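In PostgreSQL, those statistics are refreshed with the ANALYZE command; a minimal sketch, with a hypothetical table and column name:

```sql
-- Refresh planner statistics for one table (name is illustrative)
ANALYZE users;

-- For a skewed column, increase the sampling detail first,
-- then re-collect statistics
ALTER TABLE users ALTER COLUMN country SET STATISTICS 500;
ANALYZE users;
```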
Reading the Plan Structure
Execution plans are hierarchical. The top-level operation is what you asked for, and it contains child operations that do the actual work. PostgreSQL's EXPLAIN output shows this hierarchy with indentation.
Running EXPLAIN ANALYZE on a query prints the full plan together with actual execution times and row counts. The ANALYZE flag is critical: without it, you only see the planner's estimates, not what actually happened.
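For example, with a hypothetical orders table (costs, times, and row counts are illustrative and will differ on your data):

```sql
EXPLAIN ANALYZE
SELECT count(*) FROM orders WHERE created_at >= '2024-01-01';

-- Aggregate  (cost=2266.00..2266.01 rows=1 width=8)
--            (actual time=14.2..14.2 rows=1 loops=1)
--   ->  Seq Scan on orders  (cost=0.00..2041.00 rows=90000 width=0)
--                           (actual time=0.01..9.8 rows=91234 loops=1)
--         Filter: (created_at >= '2024-01-01'::date)
-- Planning Time: 0.2 ms
-- Execution Time: 14.3 ms
```

The indented Seq Scan is a child of the Aggregate node: the scan feeds rows upward, and the aggregate summarizes them.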
The plan structure follows a logical flow:
- From clause: Which tables are accessed and in what order
- Join method: How tables are combined (nested loop, hash join, merge join)
- Filters: Where rows are eliminated early to reduce work
- Sorts and aggregates: Operations that reorder or summarize data
- Final result: the rows returned to the client
Each level of the plan has a cost estimate. Costs are cumulative: a node's cost includes the cost of its children, so the top-level number reflects the whole plan. Lower cost is better, but remember these are relative estimates in arbitrary units, not absolute performance measurements.
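A single PostgreSQL plan line carries most of these numbers. Annotated (values illustrative):

```sql
-- Seq Scan on orders  (cost=0.00..1693.00 rows=50000 width=40)
--                     (actual time=0.012..9.803 rows=49874 loops=1)
--
-- cost=0.00..1693.00   startup cost .. total cost (arbitrary units)
-- rows=50000           estimated row count; compare with actual rows=49874
-- width=40             estimated average row size in bytes
-- loops=1              how many times this node was executed
```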
Common Plan Patterns and What They Mean
Full Table Scan
A full table scan reads every row in the table. This is slow for large tables but can be fast for small ones or when you need to examine all rows anyway.
In PostgreSQL this appears as a Seq Scan node, for sequential scan. If you see one on a large table where you filter on a non-indexed column, that scan is likely your bottleneck. Adding an appropriate index, or rewriting the query so an existing one applies, is usually the fix.
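A hedged sketch of what that looks like (table, column, and numbers are invented):

```sql
EXPLAIN SELECT * FROM orders WHERE status = 'shipped';

-- Seq Scan on orders  (cost=0.00..2041.00 rows=9800 width=40)
--   Filter: (status = 'shipped'::text)
```

Every row is read, and the Filter line shows rows being discarded after the fact rather than skipped via an index.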
Index Scan
An index scan uses an index to find specific rows, then reads the actual table data for those rows. This is much faster than a full scan when you're filtering on indexed columns.
The plan names the index it uses (for example, users_email_idx). If you see an index scan but the query is still slow, the index may not be selective enough: it matches many rows, and fetching each of those rows from the table becomes the bottleneck.
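For instance, with a hypothetical users table and an index on email:

```sql
EXPLAIN SELECT * FROM users WHERE email = 'a@example.com';

-- Index Scan using users_email_idx on users
--   (cost=0.42..8.44 rows=1 width=72)
--   Index Cond: (email = 'a@example.com'::text)
```

The Index Cond line shows the filter being applied inside the index itself, so only matching rows are fetched from the table.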
Index Only Scan
An index-only scan is the best case. The index contains all the data needed, so the database never touches the actual table. This is possible when all selected columns are in the index.
Look for "Index Only Scan" in the output. If you see this, the access path is about as good as it gets. If you expect one but aren't getting it, check that the index is truly covering (includes every column the query reads).
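A sketch of building a covering index with INCLUDE and the plan it enables (names are hypothetical):

```sql
CREATE INDEX users_email_name_idx ON users (email) INCLUDE (name);

EXPLAIN ANALYZE SELECT name FROM users WHERE email = 'a@example.com';

-- Index Only Scan using users_email_name_idx on users
--   Index Cond: (email = 'a@example.com'::text)
--   Heap Fetches: 0
```

In PostgreSQL, Heap Fetches: 0 confirms the table was never touched; a nonzero value means the visibility map is stale and some rows still required table lookups.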
Nested Loop Join
A nested loop join combines rows from two tables one at a time. It's efficient when one table is small and the other is large, or when there's a good index on the join column.
The outer input is scanned once, and for each of its rows the inner input is probed; in a users-to-orders join, each user row triggers a lookup in orders. If the outer input returns many rows, this becomes expensive. Look for nested loops with large outer row counts; they're often the culprit.
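A sketch of the good case, where the outer side yields one row and the inner side has an index (names and schema are invented):

```sql
EXPLAIN SELECT u.name, o.total
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.id = 42;

-- Nested Loop
--   ->  Index Scan using users_pkey on users u
--         Index Cond: (id = 42)
--   ->  Index Scan using orders_user_id_idx on orders o
--         Index Cond: (user_id = 42)
```

One outer row means the inner index scan runs exactly once; with thousands of outer rows, it would run thousands of times.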
Hash Join
A hash join builds a hash table from the smaller input, then probes it with the larger input. This is generally faster than nested loops for large datasets.
Hash joins are usually a good sign, but watch the hash-building cost. In PostgreSQL, a hash table that outgrows work_mem spills to disk in batches, which shows up as degraded performance.
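The shape to recognize (hypothetical tables; the smaller input sits under the Hash node):

```sql
EXPLAIN SELECT u.name, o.total
FROM users u
JOIN orders o ON o.user_id = u.id;

-- Hash Join
--   Hash Cond: (o.user_id = u.id)
--   ->  Seq Scan on orders o
--   ->  Hash
--         ->  Seq Scan on users u
```

With EXPLAIN ANALYZE, a "Batches" value greater than 1 on the Hash node indicates the hash table spilled to disk.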
Sort and Merge Join
Sort operations reorder data, which can be expensive. A merge join is efficient when both inputs are already sorted, but if the database has to sort them first, you're paying that cost.
If you see a sort operation, ask yourself: do I really need this sorted data? If not, removing the ORDER BY clause or changing the query logic can eliminate the sort.
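A merge join forced to sort both inputs looks like this (tables a and b are placeholders):

```sql
EXPLAIN SELECT * FROM a JOIN b ON a.k = b.k;

-- Merge Join
--   Merge Cond: (a.k = b.k)
--   ->  Sort
--         Sort Key: a.k
--         ->  Seq Scan on a
--   ->  Sort
--         Sort Key: b.k
--         ->  Seq Scan on b
```

Indexes on a.k and b.k would let the planner replace each Sort with an ordered index scan.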
Practical Example: Analyzing a Slow Query
Let's walk through a real example. You have a query that joins three tables and filters by date, and it's taking too long.
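As a concrete sketch, assume hypothetical tables users, orders, and order_items (all numbers are invented for illustration):

```sql
EXPLAIN ANALYZE
SELECT u.name, o.id, oi.product_id
FROM users u
JOIN orders o       ON o.user_id = u.id
JOIN order_items oi ON oi.order_id = o.id
WHERE o.created_at >= '2024-01-01'
  AND oi.product_id = 7;

-- Hash Join
--   Hash Cond: (oi.order_id = o.id)
--   ->  Seq Scan on order_items oi  (rows=10000)
--         Filter: (product_id = 7)
--   ->  Hash
--         ->  Hash Join
--               Hash Cond: (o.user_id = u.id)
--               ->  Seq Scan on orders o  (rows=10000)
--                     Filter: (created_at >= '2024-01-01'::date)
--               ->  Seq Scan on users u  (rows=10000)
```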
Here's what a plan like this reveals:
- Three sequential scans: The database is scanning all three tables sequentially. This is expensive.
- No indexes used: None of the join columns have indexes being used.
- Large row estimates: Each table scan is estimated to return 10,000 rows.
- Filters applied after scans: The date and product_id filters are applied after the scans, meaning the database is scanning all rows and then discarding most of them.
The fix is to add indexes on the join columns and the filter columns.
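Assuming hypothetical tables orders(user_id, created_at) and order_items(order_id, product_id), a sketch:

```sql
-- Join columns
CREATE INDEX orders_user_id_idx        ON orders (user_id);
CREATE INDEX order_items_order_id_idx  ON order_items (order_id);

-- Filter columns
CREATE INDEX orders_created_at_idx     ON orders (created_at);
CREATE INDEX order_items_product_id_idx ON order_items (product_id);
```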
After adding these indexes and re-running EXPLAIN ANALYZE, the plan changes.
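Illustratively, the reworked plan might look like this (index names match the sketch above only if you created them with those names; row counts are invented):

```sql
-- Hash Join
--   Hash Cond: (oi.order_id = o.id)
--   ->  Index Scan using order_items_product_id_idx on order_items oi  (rows=1000)
--         Index Cond: (product_id = 7)
--   ->  Hash
--         ->  Hash Join
--               Hash Cond: (o.user_id = u.id)
--               ->  Index Scan using orders_created_at_idx on orders o  (rows=1000)
--                     Index Cond: (created_at >= '2024-01-01'::date)
--               ->  Seq Scan on users u
```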
Now we have:
- Index scans: The date and product_id filters use indexes.
- Reduced row estimates: The scans now return 1,000 rows instead of 10,000.
- Still a hash join: The join is still a hash join, which is acceptable, but the reduced row count makes it much faster.
Common Mistakes and How to Avoid Them
Relying Only on Estimates
The estimated rows and costs are just guesses. Always use EXPLAIN ANALYZE to see what actually happened. If the estimated rows are far from the actual rows, your statistics are likely outdated; run ANALYZE (or let autovacuum catch up) to refresh them.
Ignoring Index Usage
If you have an index on a column you're filtering on, the query should use it. If it doesn't, check:
- Is the index actually being used? Look for "Index Scan" or "Index Only Scan" in the plan.
- Is the index covering all needed columns? If not, consider a covering index (in PostgreSQL, one created with INCLUDE).
- Is the index selective enough? If the filter matches a large fraction of the table, the planner may correctly prefer a sequential scan.
Misunderstanding Join Types
Different join types have different performance characteristics:
- Nested loop: Good for small datasets or when one table is small
- Hash join: Good for large datasets, but uses memory
- Merge join: Good when both inputs are already sorted, but requires sorting if not
You don't pick the join method directly; the optimizer does. But you can influence it with appropriate indexes, accurate statistics, and an understanding of your data size and distribution.
Forgetting About Sort Operations
Sorts are expensive. If you see a sort in your plan, ask:
- Do I need this sorted data?
- Can I change the query to avoid the sort?
- Is there an index that can provide the data in sorted order?
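That last option is often the cheapest fix. A sketch, with hypothetical names:

```sql
CREATE INDEX orders_created_at_idx ON orders (created_at);

EXPLAIN SELECT * FROM orders ORDER BY created_at LIMIT 10;

-- Limit
--   ->  Index Scan using orders_created_at_idx on orders
```

Because the index already stores rows in created_at order, the plan has no Sort node at all.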
Tools and Resources
PostgreSQL provides several tools for analyzing execution plans:
- EXPLAIN: Shows the plan without running the query
- EXPLAIN ANALYZE: Shows the plan with actual execution times
- EXPLAIN (ANALYZE, BUFFERS): Shows buffer usage (important for I/O analysis)
- EXPLAIN (ANALYZE, VERBOSE): Shows detailed information including column names
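The BUFFERS option is the quickest way to see whether a query is hitting cache or disk (output illustrative):

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE created_at >= '2024-01-01';

-- Seq Scan on orders  (actual time=0.02..11.4 rows=91234 loops=1)
--   Filter: (created_at >= '2024-01-01'::date)
--   Buffers: shared hit=1120 read=380
-- shared hit = pages found in cache; read = pages fetched from disk
```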
For MySQL, use EXPLAIN or EXPLAIN FORMAT=JSON to get structured output.
For SQL Server, there is no EXPLAIN keyword; use SET SHOWPLAN_TEXT ON (estimates only), SET STATISTICS PROFILE ON (actuals), or the graphical execution plan in SQL Server Management Studio.
Conclusion
Reading execution plans is a skill that pays dividends. Every slow query you optimize saves you time and improves your application's performance. Start by understanding the basic plan structure, then learn to recognize common patterns and their implications.
Remember that the database optimizer is smart, but it needs good statistics and appropriate indexes to make good decisions. Regularly update statistics, add indexes on frequently filtered columns, and always verify your plans with EXPLAIN ANALYZE.
If you're managing multiple databases and complex deployments, platforms like ServerlessBase can help you monitor performance and optimize your database configurations automatically, reducing the time you spend on manual tuning.
The next time a query runs slowly, don't guess — read the execution plan. It will tell you exactly what's happening under the hood and point you to the right fix.