Database Indexing Strategies for High-Performance Applications

Database performance often comes down to proper indexing. A well-placed index can transform a slow query from seconds to milliseconds, while poorly designed indexes can degrade write performance and waste storage.

Understanding Index Fundamentals

Indexes are data structures that databases use to quickly locate rows without scanning entire tables. Think of them as a book's index—instead of reading every page to find a topic, you look it up and jump directly to the right page.

When to Add an Index

Create indexes for columns that appear in:

WHERE clauses - Filtering conditions
JOIN conditions - Columns used to connect tables
ORDER BY clauses - Sorting operations
GROUP BY clauses - Aggregation operations

Index Types Explained

B-Tree Indexes (Default)

B-Tree indexes are the workhorses of relational databases, handling most query patterns efficiently.

-- PostgreSQL B-Tree index (default type)
CREATE INDEX idx_users_email ON users(email);

-- MySQL equivalent
CREATE INDEX idx_users_email ON users(email);

-- Query that benefits from this index
SELECT * FROM users WHERE email = 'user@example.com';

Best for:

Equality comparisons (=)
Range queries (<, >, BETWEEN)
Pattern matching (LIKE 'prefix%')
Sorting (ORDER BY)

Composite Indexes

Composite indexes span multiple columns and follow left-to-right matching rules.

-- Index on last_name, first_name
CREATE INDEX idx_users_name ON users(last_name, first_name);

-- These queries use the index:
SELECT * FROM users WHERE last_name = 'Smith';
SELECT * FROM users WHERE last_name = 'Smith' AND first_name = 'John';

-- This query does NOT use the index (missing left-most column):
SELECT * FROM users WHERE first_name = 'John';

Column order matters:

Put the most selective column first
Place columns used in equality before range conditions
Consider query frequency when ordering

Partial Indexes (PostgreSQL)

Partial indexes only index rows matching a WHERE condition, reducing index size and improving performance.

-- Index only active users
CREATE INDEX idx_active_users 
ON users(email) 
WHERE status = 'active';

-- Index only recent orders
CREATE INDEX idx_recent_orders 
ON orders(created_at) 
WHERE created_at > '2025-01-01';

-- Query that uses the partial index
SELECT * FROM users 
WHERE status = 'active' AND email = 'user@example.com';

Benefits:

Smaller index size
Faster writes
More efficient for specific query patterns

Full-Text Search Indexes

Full-text indexes enable efficient text searching across large content fields.

-- PostgreSQL full-text search
CREATE INDEX idx_posts_content_fts 
ON posts 
USING gin(to_tsvector('english', content));

-- Search query
SELECT * FROM posts 
WHERE to_tsvector('english', content) @@ to_tsquery('database & performance');

-- MySQL full-text index
CREATE FULLTEXT INDEX idx_posts_content ON posts(content);

-- MySQL search query
SELECT * FROM posts 
WHERE MATCH(content) AGAINST('database performance' IN NATURAL LANGUAGE MODE);

Index Optimization Patterns

Covering Indexes

A covering index includes all columns needed by a query, allowing the database to satisfy the query entirely from the index without accessing the table.

-- Create covering index
CREATE INDEX idx_orders_user_covering 
ON orders(user_id, created_at, total_amount);

-- Query satisfied entirely from index
SELECT created_at, total_amount 
FROM orders 
WHERE user_id = 123 
ORDER BY created_at DESC;

Index on Expressions (PostgreSQL)

Index computed values for queries that filter on expressions.

-- Index on lowercase email for case-insensitive searches
CREATE INDEX idx_users_email_lower ON users(LOWER(email));

-- Query using the expression index
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';

-- Index on extracted date part
CREATE INDEX idx_orders_date ON orders(DATE(created_at));

SELECT * FROM orders WHERE DATE(created_at) = '2026-02-15';

Performance Monitoring

Identify Missing Indexes

-- PostgreSQL: Find tables with sequential scans
SELECT 
  schemaname,
  tablename,
  seq_scan,
  seq_tup_read,
  idx_scan,
  seq_tup_read / seq_scan as avg_seq_read
FROM pg_stat_user_tables
WHERE seq_scan > 0
ORDER BY seq_tup_read DESC
LIMIT 20;

-- MySQL: Check table statistics
SELECT 
  table_schema,
  table_name,
  table_rows,
  data_length,
  index_length,
  ROUND(index_length / data_length, 2) as index_ratio
FROM information_schema.tables
WHERE table_schema NOT IN ('mysql', 'information_schema', 'performance_schema')
ORDER BY data_length DESC;

Identify Unused Indexes

-- PostgreSQL: Find unused indexes
SELECT 
  schemaname,
  tablename,
  indexname,
  idx_scan,
  pg_size_pretty(pg_relation_size(indexrelid)) as index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND indexrelname NOT LIKE 'pg_toast%'
ORDER BY pg_relation_size(indexrelid) DESC;

Index Maintenance Best Practices

1. Monitor Index Bloat

-- PostgreSQL: Check index bloat
SELECT 
  schemaname,
  tablename,
  indexname,
  pg_size_pretty(pg_relation_size(indexrelid)) as size,
  idx_scan,
  idx_tup_read,
  idx_tup_fetch
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;

2. Reindex Periodically

-- PostgreSQL: Rebuild single index
REINDEX INDEX idx_users_email;

-- Rebuild all indexes on a table
REINDEX TABLE users;

-- MySQL: Rebuild indexes
ALTER TABLE users ENGINE=InnoDB;

3. Consider Index Size vs. Benefit

A good rule of thumb:

Indexes should be used frequently (daily)
Read performance gain should outweigh write cost
Index size should be reasonable relative to table size

Common Pitfalls

Over-indexing:

Each index slows down INSERT, UPDATE, and DELETE operations
Indexes consume storage space
More indexes mean more maintenance overhead

Wrong column order in composite indexes:

Remember left-to-right matching
Place high-selectivity columns first

Indexing low-cardinality columns:

Columns with few unique values (boolean, status) benefit less from indexing
Consider partial indexes for these cases

Conclusion

Effective indexing requires balancing read and write performance. Monitor your query patterns, identify slow queries, and add indexes strategically. Remember: every index has a cost, so measure the impact and remove unused indexes regularly. With proper indexing strategy, you can achieve sub-millisecond query times even on tables with millions of rows.