Database performance often comes down to proper indexing. A well-placed index can transform a slow query from seconds to milliseconds, while poorly designed indexes can degrade write performance and waste storage.
Understanding Index Fundamentals
Indexes are data structures that databases use to quickly locate rows without scanning entire tables. Think of them as a book's index—instead of reading every page to find a topic, you look it up and jump directly to the right page.
When to Add an Index
Create indexes for columns that appear in:
- WHERE clauses - Filtering conditions
- JOIN conditions - Columns used to connect tables
- ORDER BY clauses - Sorting operations
- GROUP BY clauses - Aggregation operations
Index Types Explained
B-Tree Indexes (Default)
B-Tree indexes are the workhorses of relational databases, handling most query patterns efficiently.
-- PostgreSQL B-Tree index (default type) CREATE INDEX idx_users_email ON users(email); -- MySQL equivalent CREATE INDEX idx_users_email ON users(email); -- Query that benefits from this index SELECT * FROM users WHERE email = 'user@example.com';
Best for:
- Equality comparisons (
=) - Range queries (
<,>,BETWEEN) - Pattern matching (
LIKE 'prefix%') - Sorting (
ORDER BY)
Composite Indexes
Composite indexes span multiple columns and follow left-to-right matching rules.
-- Index on last_name, first_name CREATE INDEX idx_users_name ON users(last_name, first_name); -- These queries use the index: SELECT * FROM users WHERE last_name = 'Smith'; SELECT * FROM users WHERE last_name = 'Smith' AND first_name = 'John'; -- This query does NOT use the index (missing left-most column): SELECT * FROM users WHERE first_name = 'John';
Column order matters:
- Put the most selective column first
- Place columns used in equality before range conditions
- Consider query frequency when ordering
Partial Indexes (PostgreSQL)
Partial indexes only index rows matching a WHERE condition, reducing index size and improving performance.
-- Index only active users CREATE INDEX idx_active_users ON users(email) WHERE status = 'active'; -- Index only recent orders CREATE INDEX idx_recent_orders ON orders(created_at) WHERE created_at > '2025-01-01'; -- Query that uses the partial index SELECT * FROM users WHERE status = 'active' AND email = 'user@example.com';
Benefits:
- Smaller index size
- Faster writes
- More efficient for specific query patterns
Full-Text Search Indexes
Full-text indexes enable efficient text searching across large content fields.
-- PostgreSQL full-text search CREATE INDEX idx_posts_content_fts ON posts USING gin(to_tsvector('english', content)); -- Search query SELECT * FROM posts WHERE to_tsvector('english', content) @@ to_tsquery('database & performance'); -- MySQL full-text index CREATE FULLTEXT INDEX idx_posts_content ON posts(content); -- MySQL search query SELECT * FROM posts WHERE MATCH(content) AGAINST('database performance' IN NATURAL LANGUAGE MODE);
Index Optimization Patterns
Covering Indexes
A covering index includes all columns needed by a query, allowing the database to satisfy the query entirely from the index without accessing the table.
-- Create covering index CREATE INDEX idx_orders_user_covering ON orders(user_id, created_at, total_amount); -- Query satisfied entirely from index SELECT created_at, total_amount FROM orders WHERE user_id = 123 ORDER BY created_at DESC;
Index on Expressions (PostgreSQL)
Index computed values for queries that filter on expressions.
-- Index on lowercase email for case-insensitive searches CREATE INDEX idx_users_email_lower ON users(LOWER(email)); -- Query using the expression index SELECT * FROM users WHERE LOWER(email) = 'user@example.com'; -- Index on extracted date part CREATE INDEX idx_orders_date ON orders(DATE(created_at)); SELECT * FROM orders WHERE DATE(created_at) = '2026-02-15';
Performance Monitoring
Identify Missing Indexes
-- PostgreSQL: Find tables with sequential scans SELECT schemaname, tablename, seq_scan, seq_tup_read, idx_scan, seq_tup_read / seq_scan as avg_seq_read FROM pg_stat_user_tables WHERE seq_scan > 0 ORDER BY seq_tup_read DESC LIMIT 20;
-- MySQL: Check table statistics SELECT table_schema, table_name, table_rows, data_length, index_length, ROUND(index_length / data_length, 2) as index_ratio FROM information_schema.tables WHERE table_schema NOT IN ('mysql', 'information_schema', 'performance_schema') ORDER BY data_length DESC;
Identify Unused Indexes
-- PostgreSQL: Find unused indexes SELECT schemaname, tablename, indexname, idx_scan, pg_size_pretty(pg_relation_size(indexrelid)) as index_size FROM pg_stat_user_indexes WHERE idx_scan = 0 AND indexrelname NOT LIKE 'pg_toast%' ORDER BY pg_relation_size(indexrelid) DESC;
Index Maintenance Best Practices
1. Monitor Index Bloat
-- PostgreSQL: Check index bloat SELECT schemaname, tablename, indexname, pg_size_pretty(pg_relation_size(indexrelid)) as size, idx_scan, idx_tup_read, idx_tup_fetch FROM pg_stat_user_indexes ORDER BY pg_relation_size(indexrelid) DESC;
2. Reindex Periodically
-- PostgreSQL: Rebuild single index REINDEX INDEX idx_users_email; -- Rebuild all indexes on a table REINDEX TABLE users; -- MySQL: Rebuild indexes ALTER TABLE users ENGINE=InnoDB;
3. Consider Index Size vs. Benefit
A good rule of thumb:
- Indexes should be used frequently (daily)
- Read performance gain should outweigh write cost
- Index size should be reasonable relative to table size
Common Pitfalls
Over-indexing:
- Each index slows down INSERT, UPDATE, and DELETE operations
- Indexes consume storage space
- More indexes mean more maintenance overhead
Wrong column order in composite indexes:
- Remember left-to-right matching
- Place high-selectivity columns first
Indexing low-cardinality columns:
- Columns with few unique values (boolean, status) benefit less from indexing
- Consider partial indexes for these cases
Conclusion
Effective indexing requires balancing read and write performance. Monitor your query patterns, identify slow queries, and add indexes strategically. Remember: every index has a cost, so measure the impact and remove unused indexes regularly. With proper indexing strategy, you can achieve sub-millisecond query times even on tables with millions of rows.