Lesson 30 of the SQL Optimization Course: Optimizing MySQL Performance with Table Denormalization Strategies

Introduction

For relational databases, the design of tables and the way SQL is written are particularly crucial. It wouldn’t be an exaggeration to say that they account for 90% of performance. This series therefore targets these two knowledge areas specifically and analyzes them in detail, layer by layer.

The series uses plain language and a large number of examples to explain the subtleties.

🧑‍💻 Target audience:

  • DBAs
  • Database developers
  • Students

We will use MySQL as the demonstration database.


In the previous article, we explored normalized database design in MySQL. While normalization eliminates redundancy and enhances data integrity, it introduces challenges like increased table joins and reduced query efficiency. This article introduces denormalization strategies to balance these trade-offs.

Understanding Denormalization in MySQL

Normalization reduces redundancy but increases table count (e.g., 3NF may require more tables than 2NF). Excessive joins degrade performance, especially for complex queries. Denormalization intentionally introduces controlled redundancy to simplify queries and improve speed. Examples include:

  • Adding redundant fields to frequently accessed tables
  • Using JSON columns for infrequently modified data
  • Precomputing aggregated values
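
The first technique is demonstrated in the scenario later in this lesson. As a rough sketch of the other two, the statements below add a JSON column and a precomputed aggregate to the employee tables used in that scenario. The column names `profile` and `employee_count` and the JSON key `emergency_contact` are hypothetical, chosen only for illustration, and the sketch assumes `employee_dept_links` holds one row per employee-department assignment.

```sql
-- JSON column (hypothetical name `profile`) for attributes that rarely change.
-- The JSON type requires MySQL 5.7.8 or later.
ALTER TABLE employees ADD COLUMN profile JSON;

-- Read one attribute without joining a separate attribute table.
-- `emergency_contact` is a made-up key for this sketch.
SELECT employee_name,
       profile->>'$.emergency_contact' AS emergency_contact
FROM employees;

-- Precomputed aggregate (hypothetical name `employee_count`) stored on departments.
ALTER TABLE departments ADD COLUMN employee_count INT NOT NULL DEFAULT 0;

-- Fill the aggregate once; it has to be refreshed whenever assignments change.
UPDATE departments d
SET d.employee_count = (
    SELECT COUNT(*)
    FROM employee_dept_links ed
    WHERE ed.dept_id = d.dept_id
);
```

The trade-off in both cases is the same: reads become cheaper, while every write path that affects the copied or aggregated data has to keep it current.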

Practical Denormalization Examples

Scenario: Employee Management System

Consider five normalized tables: employees, departments, salaries, employee_salary_links, and employee_dept_links. Four common query patterns were identified:

  1. Retrieve department managers by name
  2. Find the first employee in each department
  3. Identify the highest-paid employee in a department
  4. Calculate gender distribution per department

Each query required multiple joins, leading to suboptimal performance.
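
For comparison, the first query pattern might look like this against the normalized tables. This is a sketch that reuses the column names appearing in this lesson's own statements (`employee_number`, `dept_id`, `dept_name`, `is_manager`, `employee_name`); the flag value 'Y' is an assumption about how the manager flag is stored.

```sql
-- Manager lookup before denormalization: two joins to reach the
-- department name and the manager flag.
SELECT e.employee_name, d.dept_name
FROM employees e
JOIN employee_dept_links ed ON e.employee_number = ed.employee_number
JOIN departments d ON ed.dept_id = d.dept_id
WHERE ed.is_manager = 'Y';  -- assumed flag value
```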

Implementing Denormalization

To optimize these queries, we added redundant fields to the employees table:

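-- Add department name, manager flag, and salary level to employees.
-- The salary_level type is an assumption; match it to salaries.salary_level.
ALTER TABLE employees
  ADD COLUMN dept_name VARCHAR(64),
  ADD COLUMN is_manager CHAR(1),
  ADD COLUMN salary_level INT;

-- Populate the redundant department data from the link and department tables
UPDATE employees e
JOIN employee_dept_links ed ON e.employee_number = ed.employee_number
JOIN departments d ON ed.dept_id = d.dept_id
SET e.dept_name = d.dept_name,
    e.is_manager = ed.is_manager;

-- salary_level must be populated in the same way, from salaries via
-- employee_salary_links, before the salary index below is useful.

-- Index the redundant fields used for filtering
CREATE INDEX idx_is_manager ON employees(is_manager);
CREATE INDEX idx_salary_level ON employees(salary_level);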

Query Optimization Results

After denormalization:

  1. Manager lookup: Simplified to a single-table query
    
    SELECT employee_name, dept_name
    FROM employees
    WHERE is_manager = 'Y';  -- is_manager is CHAR(1), so a one-character flag is needed ('Y' assumed)
    
  2. First employee per department: Achieved via window functions (available in MySQL 8.0 and later)
    
    SELECT employee_name, dept_name
    FROM (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY dept_name ORDER BY employee_number) AS rn
        FROM employees
    ) t
    WHERE rn = 1;
    
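  3. Highest-paid employee: Reduced to a correlated scalar subquery on the same table

    SELECT e.employee_name, e.dept_name
    FROM employees e
    WHERE e.salary_level = (
        SELECT MAX(e2.salary_level)
        FROM employees e2
        WHERE e2.dept_name = e.dept_name  -- highest level within each department
    );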
    
  4. Gender distribution: Eliminated joins with precomputed counts
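
The lesson does not show the statements for the fourth pattern, so the following is only a sketch of how precomputed counts could work. It assumes the employees table has a `gender` column stored as a single character, and `dept_gender_counts` is a hypothetical summary table introduced for illustration.

```sql
-- Hypothetical summary table: one row per department and gender.
CREATE TABLE dept_gender_counts (
    dept_name VARCHAR(64) NOT NULL,
    gender    CHAR(1)     NOT NULL,  -- assumed to match employees.gender
    headcount INT         NOT NULL,
    PRIMARY KEY (dept_name, gender)
);

-- Rebuild the counts from the denormalized employees table (no joins needed).
DELETE FROM dept_gender_counts;

INSERT INTO dept_gender_counts (dept_name, gender, headcount)
SELECT dept_name, gender, COUNT(*)
FROM employees
WHERE dept_name IS NOT NULL
  AND gender IS NOT NULL
GROUP BY dept_name, gender;

-- Reporting then reads the small summary table directly.
SELECT dept_name, gender, headcount
FROM dept_gender_counts
ORDER BY dept_name, gender;
```

A periodic rebuild like this keeps the logic simple but lets the counts go stale between runs; if fresher numbers are needed, the summary would have to be maintained on every insert, update, and delete of employees.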

Best Practices for Denormalization

  • Use sparingly: Target high-impact, frequently executed queries
  • Maintain data integrity: Use triggers or application logic for updates
  • Index strategically: Add indexes on redundant fields used in WHERE clauses
  • Monitor performance: Regularly analyze query execution plans
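
As one possible way to follow the integrity and monitoring advice above, the sketch below keeps the copied `dept_name` in step with the departments table and then inspects the plan of the manager lookup. The trigger name is hypothetical, the column names are those used earlier in this lesson, and the flag value 'Y' is an assumption.

```sql
-- DELIMITER is a mysql client command, needed so the trigger body
-- can contain semicolons.
DELIMITER //

-- Hypothetical trigger: when a department is renamed, propagate the new
-- name to the redundant employees.dept_name column.
CREATE TRIGGER trg_departments_name_sync
AFTER UPDATE ON departments
FOR EACH ROW
BEGIN
    -- <=> is the NULL-safe equality operator, so NULL changes are handled too.
    IF NOT (NEW.dept_name <=> OLD.dept_name) THEN
        UPDATE employees e
        JOIN employee_dept_links ed ON e.employee_number = ed.employee_number
        SET e.dept_name = NEW.dept_name
        WHERE ed.dept_id = NEW.dept_id;
    END IF;
END//

DELIMITER ;

-- Check whether the denormalized manager lookup uses idx_is_manager.
EXPLAIN
SELECT employee_name, dept_name
FROM employees
WHERE is_manager = 'Y';  -- assumed flag value
```

This covers only department renames. Changes in `employee_dept_links` (reassignments or manager flag updates) would need their own triggers or application-level handling, which is part of the maintenance cost that denormalization introduces.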

Conclusion

While normalization remains essential for data integrity, strategic denormalization can significantly improve query efficiency. By balancing normalization principles with denormalization techniques, developers can achieve optimal performance for real-world applications.

👋 See you in the next lesson.

What is SQLFlash?

SQLFlash is your AI-powered SQL Optimization Partner.

Based on AI models, we accurately identify SQL performance bottlenecks and optimize query performance, freeing you from the cumbersome SQL tuning process so you can fully focus on developing and implementing business logic.

Ready to elevate your SQL performance?

Join us and experience the power of SQLFlash today!