Lesson 33 of the SQL Optimization Course: Time-Based Partitioning Strategies | SQLFlash

Introduction

For relational databases, table design and the way SQL is written are particularly crucial. It is no exaggeration to say that they account for 90% of performance. This series therefore focuses on these two areas and analyzes them in detail, layer by layer.

This series uses plain, accessible language and a large number of examples to explain the subtleties.

🧑‍💻 Target audience:

  • DBAs
  • Database developers
  • Students

We will use MySQL as the demonstration database.


Partitioning remains one of the most effective strategies for optimizing MySQL performance, especially when applied to time-based fields. This article explores detailed implementations for partitioning by year, month, and day, along with best practices and performance insights.

1. Partitioning by Year​

Partitioning by year is ideal when queries frequently filter data based on specific years. For example:

SELECT * FROM ytt_pt1 WHERE log_date >= '2018-01-01' AND log_date < '2019-01-01';

Example Schema:​

CREATE TABLE ytt_pt1 (
    id BIGINT,
    log_date DATE
) PARTITION BY RANGE (YEAR(log_date)) (
    PARTITION p0001 VALUES LESS THAN (2012),
    PARTITION p0002 VALUES LESS THAN (2013),
    ...
    PARTITION p_max VALUES LESS THAN (MAXVALUE)
);
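
To see how rows end up distributed across the yearly partitions, one option is to query information_schema.PARTITIONS. This is a minimal sketch that assumes ytt_pt1 lives in the currently selected schema; note that TABLE_ROWS is an estimate for InnoDB tables rather than an exact count.

```sql
-- List each partition of ytt_pt1 with its upper bound and approximate row count.
-- Assumption: the table is in the schema returned by DATABASE().
SELECT PARTITION_NAME,
       PARTITION_DESCRIPTION AS upper_bound,   -- the VALUES LESS THAN value
       TABLE_ROWS            AS approx_rows    -- estimated for InnoDB
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'ytt_pt1'
ORDER BY PARTITION_ORDINAL_POSITION;
```

A heavily skewed result (for example, most rows sitting in p_max) is usually a sign that new yearly partitions need to be added.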

Performance Note: Queries that compare log_date directly against date literals (e.g., log_date = '2018-01-01') can use partition pruning and execute quickly. However, avoid wrapping the column in a function in the WHERE clause, such as YEAR(log_date) = 2018, because the optimizer then cannot prune partitions. Use explicit partition selection or virtual columns instead:

SELECT COUNT(*) FROM ytt_pt1 PARTITION (p0008) WHERE YEAR(log_date) = 2018;
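
When explicit partition selection is not practical, another option is to rewrite the predicate as a range on the column itself, as in the query at the start of this section, and check the effect with EXPLAIN. The sketch below assumes the ytt_pt1 table defined above and MySQL 5.7 or later, where EXPLAIN output includes a partitions column. The exact partitions listed depend on your data and server version.

```sql
-- Function wrapped around the column: expect the partitions column
-- to list every partition, because pruning cannot be applied.
EXPLAIN
SELECT COUNT(*) FROM ytt_pt1
WHERE YEAR(log_date) = 2018;

-- Same rows, expressed as a range on the bare column: expect the
-- partitions column to list only a small subset of partitions.
EXPLAIN
SELECT COUNT(*) FROM ytt_pt1
WHERE log_date >= '2018-01-01'
  AND log_date <  '2019-01-01';
```

Comparing the partitions column of the two plans is a quick way to confirm whether a given predicate is being pruned.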

2. Partitioning by Month​

Partitioning by month improves granularity for monthly data retrieval. Two approaches exist:

Method 1: List Partitioning by Month

CREATE TABLE ytt_pt1_month1 (
    id BIGINT,
    log_date DATE
) PARTITION BY LIST (MONTH(log_date)) (
    PARTITION p0001 VALUES IN (1),
    PARTITION p0002 VALUES IN (2),
    ...
    PARTITION p0012 VALUES IN (12)
);

While simple, this method places the same month of every year in one partition, so it may scan multiple partitions, or rows from unrelated years, for range queries (e.g., log_date IN ('2020-01-01', …, '2020-01-15')).
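
One way to see why list partitioning by MONTH() is coarse is to look at which years share a single partition. The query below is a small sketch that assumes ytt_pt1_month1 has been loaded with data spanning several years; p0001 is the January partition from the schema above.

```sql
-- Count the rows per calendar year stored in the January partition.
-- With LIST (MONTH(log_date)), every year's January lands in p0001,
-- so a query for January 2020 still has to read past other years' rows.
SELECT YEAR(log_date) AS log_year,
       COUNT(*)       AS row_count
FROM ytt_pt1_month1 PARTITION (p0001)
GROUP BY YEAR(log_date)
ORDER BY log_year;
```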

Method 2: Range Partitioning by Year-Month​

Combine year and month into a single dimension for finer control:

-- Example stored procedure to generate partitions dynamically
DELIMITER $$
CREATE PROCEDURE sp_add_partition_ytt_pt1_month2()
BEGIN
    DECLARE i INT DEFAULT 2010;
    DECLARE v_date DATE;
    WHILE i <= 2020 DO
        SET v_date = CONCAT(i, '-01-01');
        -- Generate partitions for each month (e.g., p2010_01)
        -- ...
        SET i = i + 1;
    END WHILE;
END
$$
DELIMITER ;
CALL sp_add_partition_ytt_pt1_month2();
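
The procedure body above is elided, so the following is only an illustrative sketch of what a year-month range layout can look like, not the author's actual DDL. It uses RANGE COLUMNS on the DATE column, which is one of several ways to get one partition per month. The table name ytt_pt1_month2_sketch and the three example boundaries are hypothetical.

```sql
-- Hypothetical table: one partition per calendar month, named pYYYY_MM.
CREATE TABLE ytt_pt1_month2_sketch (
    id BIGINT,
    log_date DATE
) PARTITION BY RANGE COLUMNS (log_date) (
    PARTITION p2020_01 VALUES LESS THAN ('2020-02-01'),
    PARTITION p2020_02 VALUES LESS THAN ('2020-03-01'),
    PARTITION p2020_03 VALUES LESS THAN ('2020-04-01'),
    PARTITION p_max    VALUES LESS THAN (MAXVALUE)
);

-- Adding the next month later: split the catch-all partition
-- so existing rows in p_max are redistributed correctly.
ALTER TABLE ytt_pt1_month2_sketch REORGANIZE PARTITION p_max INTO (
    PARTITION p2020_04 VALUES LESS THAN ('2020-05-01'),
    PARTITION p_max    VALUES LESS THAN (MAXVALUE)
);
```

Unlike the list layout, each partition here holds exactly one month of one year, so a lookup on a single date only needs the partition for that month.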

Benefit: Queries filtering on a date within a single month (e.g., log_date = '2020-01-01') run up to 10x faster than with list partitioning.

3. Partitioning by Day

Ideal for daily data access patterns, such as querying a specific date:

SELECT * FROM ytt_pt1_day WHERE log_date = '2020-01-01';

Implementation Tip: Use a stored procedure to automate daily partitions while respecting MySQL's 8,192-partition limit. Example:

DELIMITER $$
CREATE PROCEDURE sp_add_partition_ytt_pt1_day(IN start_year YEAR, IN end_year YEAR)
BEGIN
    DECLARE i YEAR DEFAULT start_year;
    DECLARE v_date DATE;
    WHILE i <= end_year DO
        SET v_date = CONCAT(i, '-01-01');
        -- Generate daily partitions for each year
        -- ...
        SET i = i + 1;
    END WHILE;
END
$$
DELIMITER ;
CALL sp_add_partition_ytt_pt1_day(2010, 2020);
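
Before generating daily partitions, it can help to check how many partitions a date span requires. The sketch below assumes the call above is meant to cover every day from 2010-01-01 through 2020-12-31, one partition per day; that coverage is an assumption, since the procedure body is elided.

```sql
-- Days from 2010-01-01 up to (not including) 2021-01-01:
-- 11 years * 365 days + 3 leap days (2012, 2016, 2020) = 4018.
SELECT DATEDIFF('2021-01-01', '2010-01-01')          AS daily_partitions_needed,
       DATEDIFF('2021-01-01', '2010-01-01') <= 8192  AS fits_within_limit;

-- After running the procedure, compare against what actually exists.
-- Assumption: ytt_pt1_day is in the schema returned by DATABASE().
SELECT COUNT(*) AS partition_count
FROM information_schema.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'ytt_pt1_day';
```

At roughly 365 partitions per year, the 8,192 limit is reached after about 22 years of daily partitions, so long-lived tables need a plan for dropping or merging old partitions.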

Performance: Daily partitions reduce query times to 0.01 seconds for targeted date lookups.

Key Takeaways​

  • Yearly Partitioning: Best for historical data retention and broad annual queries.
  • Monthly/Year-Month Partitioning: Optimizes monthly aggregations and range scans.
  • Daily Partitioning: Prioritize for real-time or near-real-time daily analytics.
  • Avoid filtering with expressions like YEAR(log_date) in WHERE clauses; compare the column against literals or use virtual columns instead.

By aligning partition strategies with query patterns, you can significantly enhance MySQL scalability and query efficiency.

👋 See you in the next lesson.
