SQL Query Optimization

Category: Database October 1, 2025 Source: awesome-copilot

Analyze and optimize SQL queries for better performance, with execution plans and indexing recommendations.

SQLDatabasePerformanceOptimization
# SQL Query Optimization

Analyze SQL queries and provide comprehensive optimization recommendations to improve performance, reduce resource usage, and follow best practices.

## Analysis Areas

### 1. Query Performance Analysis

**Execution Plan Review**
- Analyze query execution plan
- Identify sequential scans (table scans)
- Look for missing indexes
- Check for expensive operations (sorts, nested loops)
- Identify Cartesian products

**Performance Metrics**
- Execution time
- Number of rows scanned vs returned
- CPU usage
- Memory consumption
- I/O operations

### 2. Query Structure Optimization

**SELECT Statement Optimization**
- Avoid `SELECT *` - specify only needed columns
- Use column aliases for clarity
- Eliminate duplicate columns
- Consider computed columns for complex expressions

**JOIN Optimization**
- Use appropriate JOIN types (INNER, LEFT, RIGHT)
- Ensure JOIN conditions are on indexed columns
- Consider JOIN order (join smaller tables first)
- Avoid implicit joins (use explicit JOIN syntax)
- Replace subqueries with JOINs when appropriate

**WHERE Clause Optimization**
- Put most selective filters first
- Use indexed columns in WHERE conditions
- Avoid functions on indexed columns (prevents index usage)
- Use `IN` instead of multiple `OR` conditions
- Use `EXISTS` instead of `COUNT` for existence checks
- Avoid `NOT IN` with nullable columns

**Subquery Optimization**
- Convert correlated subqueries to JOINs when possible
- Use `EXISTS` instead of `IN` for large datasets
- Consider CTEs (Common Table Expressions) for readability
- Use window functions instead of subqueries when applicable

### 3. Indexing Strategy

**Index Recommendations**
- Identify missing indexes for frequently queried columns
- Recommend composite indexes for multi-column filters
- Consider covering indexes for query optimization
- Suggest partial indexes for filtered queries
- Identify unused or redundant indexes to remove

**Index Design Guidelines**
- Index columns used in WHERE, JOIN, and ORDER BY
- Consider index column order (most selective first)
- Balance index benefits vs write performance cost
- Use appropriate index types (B-tree, Hash, GiST, GIN)

### 4. Advanced Optimizations

**Aggregation Optimization**
- Use indexed columns in GROUP BY
- Consider materialized views for complex aggregations
- Use window functions efficiently
- Optimize HAVING clauses

**Pagination Optimization**
- Use keyset pagination instead of OFFSET for large datasets
- Index ORDER BY columns
- Avoid counting total rows if not needed

**Data Type Optimization**
- Use appropriate data types (avoid VARCHAR for numeric data)
- Consider ENUM types for fixed value sets
- Use appropriate size constraints

### 5. Database-Specific Features

**PostgreSQL**
- Use EXPLAIN ANALYZE for detailed execution plans
- Consider VACUUM and ANALYZE for statistics
- Use partial indexes and expression indexes
- Leverage CTEs with MATERIALIZED hint

**MySQL**
- Use EXPLAIN to analyze queries
- Consider query cache (MySQL 5.x)
- Use STRAIGHT_JOIN to control join order
- Optimize InnoDB buffer pool settings

**SQL Server**
- Use execution plan analysis tools
- Consider query store for performance tracking
- Use included columns in nonclustered indexes
- Optimize tempdb configuration

## Optimization Workflow

### Step 1: Gather Information
```sql
-- Get current query execution plan
EXPLAIN ANALYZE
[your query here]

Step 2: Identify Issues

  • Long execution time
  • High row counts scanned
  • Sequential scans on large tables
  • Expensive sorts or aggregations
  • Suboptimal join methods

Step 3: Apply Optimizations

  • Add or modify indexes
  • Rewrite query structure
  • Adjust join order
  • Add query hints if needed
  • Update statistics

Step 4: Validate Improvements

-- Compare before and after execution plans
-- Measure execution time reduction
-- Verify query results remain correct

Output Format

For each query analyzed, provide:

  1. Current Performance

    • Execution time
    • Rows scanned/returned
    • Key issues identified
  2. Recommended Changes

    • Specific code modifications
    • Index recommendations with DDL
    • Configuration suggestions
  3. Expected Improvement

    • Estimated performance gain
    • Impact on write operations
    • Resource tradeoffs
  4. Implementation Priority

    • High: Immediate performance impact
    • Medium: Moderate improvement
    • Low: Minor optimization

Best Practices Checklist

  • Use appropriate indexes on all JOIN and WHERE columns
  • Avoid functions on indexed columns in WHERE clauses
  • Select only needed columns, not SELECT *
  • Use explicit JOINs instead of implicit joins
  • Consider query result caching for frequently-run queries
  • Regularly update database statistics (ANALYZE)
  • Monitor and tune database configuration parameters
  • Use connection pooling for high-traffic applications
  • Implement query timeouts to prevent runaway queries
  • Profile queries in production-like environment
  • Document complex queries with comments
  • Use prepared statements to prevent SQL injection