Optimize MySQL OR Conditions: | SQLFlash

1. Introduction

In enterprise-level application systems, complex multi-layer nested queries often become performance bottlenecks. While derived tables (subqueries) can provide a logically clear SQL structure, unnecessary nesting levels increase the overhead of SQL parsing, optimization, and execution. This solution is based on actual financial query scenarios in SAP systems. By building test data with 20,000 rows, comparing execution plans, and analyzing performance metrics, it validates the optimization effect of subquery flattening techniques in eliminating derived tables. The test results show that by simplifying the query structure, while the execution plan cost remained unchanged, the actual execution time still achieved a performance improvement of 37.57%, providing developers with a practical SQL refactoring solution.

2. Test Environment Setup

2.1 Table Structure Design

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
CREATE TABLE SAPR3.BKPF (
    MANDT VARCHAR(3),
    BUKRS VARCHAR(4),
    BELNR VARCHAR(10),
    GJAHR VARCHAR(4),
    BLDAT VARCHAR(8),
    PRIMARY KEY (MANDT, BUKRS, BELNR, GJAHR)
) ENGINE=InnoDB;

CREATE TABLE SAPR3.BSIS (
    MANDT VARCHAR(3),
    BUKRS VARCHAR(4),
    BELNR VARCHAR(10),
    GJAHR VARCHAR(4),
    BUZEI VARCHAR(3),
    HKONT VARCHAR(10),
    DMBTR DECIMAL(15,2),
    WAERS VARCHAR(5),
    MONAT VARCHAR(2),
    BLDAT VARCHAR(8),
    PRIMARY KEY (MANDT, BUKRS, BELNR, GJAHR, BUZEI)
) ENGINE=InnoDB;

2. Test Data Generation

  • Generate 10,000 records of BKPF test data
  • Generate 10,000 records of BSIS test data
  • Total test data volume: 20,000 rows

3. SQL Optimization

3.1 Original SQL

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
SELECT 
    t.age, 
    CASE 
        WHEN t.age BETWEEN 0 AND 3 THEN '0-3 Days' 
        WHEN t.age BETWEEN 4 AND 7 THEN '3-7 Days' 
        ELSE 'Above 7 Days' 
    END AS age_bucket, 
    t.amount_local, 
    t.hkont AS gl_account, 
    t.bukrs AS company_code 
FROM (
    SELECT 
        b.BUKRS, 
        b.HKONT, 
        b.DMBTR AS amount_local, 
        DATEDIFF(CURDATE(), STR_TO_DATE(k.BLDAT, '%Y%m%d')) AS age 
    FROM (
        SELECT BELNR, GJAHR, BUKRS, HKONT, DMBTR 
        FROM SAPR3.BSIS 
        WHERE GJAHR = '2025' 
          AND BUKRS = '0100' 
          AND HKONT = '0000895200' 
          AND MANDT = '500' 
          AND BUZEI = '002' 
          AND WAERS = 'INR' 
          AND MONAT = '01' 
          AND BLDAT = '20240401'
    ) b 
    JOIN SAPR3.BKPF k 
      ON b.BELNR = k.BELNR 
     AND b.GJAHR = k.GJAHR 
     AND b.BUKRS = k.BUKRS
) t 
ORDER BY t.age;

3.2 Optimized SQL

We used SQLFlash to rewrite the query, and the resulting optimized SQL is as follows:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
SELECT 
    DATEDIFF(CURDATE(), STR_TO_DATE(k.BLDAT, '%Y%m%d')) AS age, 
    CASE 
        WHEN DATEDIFF(CURDATE(), STR_TO_DATE(k.BLDAT, '%Y%m%d')) BETWEEN 0 AND 3 THEN '0-3 Days' 
        WHEN DATEDIFF(CURDATE(), STR_TO_DATE(k.BLDAT, '%Y%m%d')) BETWEEN 4 AND 7 THEN '3-7 Days' 
        ELSE 'Above 7 Days' 
    END AS age_bucket, 
    b.DMBTR AS amount_local, 
    b.HKONT AS gl_account, 
    b.BUKRS AS company_code 
FROM (
    SELECT BELNR, GJAHR, BUKRS, HKONT, DMBTR 
    FROM SAPR3.BSIS 
    WHERE GJAHR = '2025' 
      AND BUKRS = '0100' 
      AND HKONT = '0000895200' 
      AND MANDT = '500' 
      AND BUZEI = '002' 
      AND WAERS = 'INR' 
      AND MONAT = '01' 
      AND BLDAT = '20240401'
) b 
JOIN SAPR3.BKPF k 
  ON b.BELNR = k.BELNR 
 AND b.GJAHR = k.GJAHR 
 AND b.BUKRS = k.BUKRS 
ORDER BY DATEDIFF(CURDATE(), STR_TO_DATE(k.BLDAT, '%Y%m%d'));

View detailed report

4. Performance Analysis

4.1 SQLFlash Analysis

According to the analysis provided by SQLFlash, this rewrite utilizes subquery flattening technology. The core optimization lies in eliminating the nested structure of the outer derived table (alias t). The original query used a two-layer derived table structure: the inner layer calculated the age field, and the outer layer added the CASE expression and field renaming. While this structure is logically clear, it increases the depth and complexity of the SQL parse tree .

The rewritten query flattens the structure into a single layer. It directly calculates age and age_bucket within the SELECT list, eliminating the intermediate layer of the derived table t. This reduces the materialization overhead of intermediate result sets, avoids indirect access through alias references, and decreases the complexity for the query optimizer when handling multi-level nesting .

Although the optimizer ultimately generated the same execution plan, the reduced syntactic complexity lowers the total overhead of SQL parsing, optimization, and execution. This leads to a significant improvement in actual execution time .

3.1 Original Query Plan

  • Execution Cost: 1064.25
  • Execution Time: Average 0.0433 seconds (Minimum 0.0405s, Maximum 0.0462s)
  • Result Rows: 0

3.2 Optimized Query Plan

  • Execution Cost: 1064.25
  • Execution Time: Average 0.0270 seconds (Minimum 0.0255s, Maximum 0.0298s)
  • Result Rows: 0

3.3 Performance Metrics Comparison

好的,这是根据您提供的图片信息翻译并整理的 Markdown 格式表格。

Performance Comparison

MetricOriginal QueryOptimized Query
SQL StructureTwo-layer derived table nestingSingle-layer derived table
Execution Time0.0433s0.0270s
Performance Gain37.57%
Query HierarchyOuter wrapper + inner calculationDirect calculation
Execution Plan Cost1064.251064.25
Rows Scanned (BKPF)10,00010,000
Rows Scanned (BSIS)1 (eq_ref)1 (eq_ref)
Access TypeALL + eq_refALL + eq_ref
Number of Derived Tables2 layers (b and t)1 layer (only b)
Intermediate Result MaterializationRequiredReduced

3.4 Optimization Principle Analysis

The core of the performance improvement lies in “eliminating redundant nesting and reducing engine overhead”. Although the optimizer can sometimes flatten derived tables, excessive nesting still creates additional burden:

  1. Accelerated Parsing and Optimization: Reducing the query hierarchy from three layers to two. The simplified SQL structure significantly reduces the engine’s burden during lexical parsing and logical optimization stages. Actual tests show that even with similar physical execution paths, the simplification of front-end processing reduces total CPU time by 37.57%.
  2. More Efficient Memory Management: Eliminates the temporary materialization requirement for the outer derived table t. This reduces the allocation and transfer of intermediate result sets in memory and shortens the field reference access path (shifting from indirect alias access back to direct base table reference), thereby lowering memory pressure.
  3. Streamlined Execution Path: Avoids unnecessary alias mapping and nested conversions. With the same execution plan (eq_ref access), the simpler SQL syntax directly improves the underlying processing efficiency of the SQL engine, reducing execution time from 43.28ms to 27.02ms.

5. Conclusion

  1. Streamlined Architecture: Resolutely eliminate ineffective nesting used only for renaming or forwarding. Reduce engine parsing overhead and memory materialization costs by flattening the SQL structure.
  2. Index Priority: Ensure direct reference to base tables and maintain field type matching. Combine with composite indexes to avoid function conversions, achieving the leap from “full table scans” to “precise index hits”.
  3. Efficiency Loop: Establish an inspection mechanism centered around EXPLAIN, and utilize automated tools like SQLFlash for assisted rewriting to quickly achieve performance doubling in complex scenarios.

What is SQLFlash?

SQLFlash is your AI-powered SQL Optimization Partner.

Based on AI models, we accurately identify SQL performance bottlenecks and optimize query performance, freeing you from the cumbersome SQL tuning process so you can fully focus on developing and implementing business logic.

How to use SQLFlash in a database?

Ready to elevate your SQL performance?

Join us and experience the power of SQLFlash today!.