Understanding EXPLAIN
Read the plan, find the seq-scan, add the index.
What an EXPLAIN tree means, the operations to recognise, and the canonical "why is this slow" diagnoses.
The plan tree.
A query plan is a tree of operations. Leaves are scans (read from a table or index); inner nodes are joins, sorts, aggregates, filters. Postgres prints it bottom-up; MySQL prints it as a table. Each node lists its estimated rows, estimated cost, and (with EXPLAIN ANALYZE) actual rows and actual time. The discrepancy between estimate and reality is often where the bug lives.
The four scan types.
Seq Scan: read every row of the table. Fine on tiny tables, disastrous on large ones. Index Scan: walk an index to find matching rows, then look each up in the table. Index Only Scan: the index alone has all the columns the query needs; no table lookup. Bitmap Heap Scan: build a bitmap of matching rows from one or more indexes, then read them in disk-order. The fastest scan depends on selectivity — a query returning 5 % of a table favours index scan; 50 % favours seq scan.
The three join types.
Nested Loop: for each row from the outer, scan the inner. Good when one side is small and the inner has a useful index. Hash Join: build a hash table from one side, probe with rows from the other. Good for medium-large joins with no useful index. Merge Join: both sides sorted, walk both in parallel. Best when both sides come pre-sorted (often from indexes). The planner picks; sometimes incorrectly because the estimates are wrong.
A worked plan.
Query: SELECT * FROM orders WHERE user_id = 42 AND created_at > '2026-01-01'. Plan: Index Scan using orders_user_id_idx (cost 0.43..8.45, rows 3) Filter: created_at > '2026-01-01'. Reading: the index gave us all orders for user 42 (3 rows), then a filter dropped any older than 2026. With a composite index on (user_id, created_at), the plan becomes Index Scan with both conditions matched inside the index — faster. The diagnosis is "missing composite index", the fix is one CREATE INDEX.
orders by user + date
single → composite index
Filter outside index → filter inside index.
Index Scan + Filter → Index Scan only
= One less heap visit per row
Common diagnoses.
Seq Scan on a large table → missing index. Estimated rows wildly off → run ANALYZE to refresh statistics, consider extended statistics on correlated columns. Nested Loop with high inner cost → planner should have picked Hash Join, possibly because join-collapse-limit is low or estimates are wrong. Sort node spilling to disk → work_mem is too small or query result is too large; add an index that provides ordering. The catalogue of patterns is small enough to memorise.
EXPLAIN ANALYZE vs EXPLAIN.
EXPLAIN shows the planner's plan and estimates without running the query. EXPLAIN ANALYZE runs the query and adds actual rows and actual time. For diagnosis, ANALYZE is the one you want. Caveats: it actually executes the query (don't run on UPDATE/DELETE in production), and the first run hits cold cache so the numbers are pessimistic. Run twice; report the second.