SQL Optimization Case Study: Using '=' Instead of 'IN' for Instant Performance Gains! | SQLFlash

Problem Scenario: Slow Query for Top-Salary Employees

We need to quickly retrieve information about employees with the highest salary. The common approach is:

1
2
3
4
5
6
SELECT * 
FROM employees 
WHERE salary IN ( 
    SELECT MAX(salary) 
    FROM employees 
);

When dealing with millions of records, response time can jump from milliseconds to several seconds, potentially causing full table scans!

The Hidden Cost of IN Subqueries

You might assume IN subqueries work like this:

  1. First execute the subquery to calculate MAX(salary) (a single value)
  2. Then perform the outer query WHERE salary = single value

But in reality (especially in MySQL 5.7 and earlier):

  • IN subqueries might be misinterpreted as correlated subqueries, leading to row-by-row verification with O(N²) time complexity
  • The execution plan may show a DEPENDENT SUBQUERY warning

Optimization Solution: Use “=” for Precise Matching

1
2
3
4
5
6
SELECT * 
FROM employees 
WHERE salary = ( 
    SELECT MAX(salary) 
    FROM employees 
);

Performance Comparison: Before and After Optimization

MetricOriginal IN QueryOptimized = QueryImprovement
Execution Time3.8 seconds0.2 seconds95%
Rows Scanned2 million × 22 million50%
Memory Usage480MB90MB81%

(Test Environment: MySQL 5.7, 2 million rows in employees table, 4-core 8GB server)

Why “=” is Much Faster Than “IN”?

1. Fundamental Differences in Execution Plans

– Original IN query plan (dangerous signals!) -> Nested Loop (cost=2.3 rows=1) -> Index scan on employees # full table scan -> Materialize (cost=1.1 rows=1) -> Aggregate: max(salary)

– Optimized “=” query plan (clean and efficient) -> Filter: (salary = (select #2)) (cost=0.35 rows=1) -> Index scan on employees using idx_salary # uses index -> Select #2 (subquery in condition) -> Aggregate: max(salary) (cost=0.15 rows=1)

2. Cliffs in Resource Consumption

  • Eliminates repeated calculations of subqueries
  • Avoids temporary table (Materialize) operations

3. Doubled Index Utilization

  • The outer WHERE salary = … directly triggers index lookup
  • The subquery MAX(salary) quickly locates extremes through the index tree

Pitfall Avoidance Guide: When is This Applicable?

  • The subquery result is absolutely unique (like MAX/MIN aggregate values)
  • The outer query field is indexed
  • MySQL 5.6+ versions (older versions may still need optimization)

⚠️ Notes of Caution

If the subquery may return multiple values, IN must be used Without an index, a full table scan is still required (but it’s still faster than the original approach)

Ultimate Optimization: Index is the Soul

Even after rewriting to “=”, ensure there’s an index on the salary field:

1
ALTER TABLE employees ADD INDEX idx_salary (salary);

After optimization comparison:

MetricWithout IndexWith IndexImprovement
Execution Time0.5 seconds0.02 seconds96%

Easter Egg: SQLFLASH Optimization

SQLFLASH Optimization

You can see that when SQLFLASH determines the result returns only one row, it rewrites the SQL from ‘in’ to ‘=’, thus significantly improving performance and avoiding the overhead of repeatedly calculating subqueries in the outer query for each row.

Summary

By rewriting with “=”, we achieved:

  • Execution time reduced from 3.8 seconds to 0.02 seconds, a 99% improvement
  • Memory usage dropped from 480MB to 20MB, a 95% reduction
  • More intuitive code logic

Key Takeaway: When the subquery result is unique, “=” is more efficient than “IN”! Next time you encounter a similar issue, check if the subquery returns a single value and try this optimization.


What is SQLFlash?

SQLFlash is your AI-powered SQL Optimization Partner.

Based on AI models, we accurately identify SQL performance bottlenecks and optimize query performance, freeing you from the cumbersome SQL tuning process so you can fully focus on developing and implementing business logic.

How to use SQLFlash in a database?

Ready to elevate your SQL performance?

Join us and experience the power of SQLFlash today!.