Lesson 13 of the SQL Optimization Course: Data Page Merging | SQLFlash

Introduction

For relational databases, the design of tables and SQL is written are particularly crucial. It wouldn’t be an exaggeration to say that they account for 90% of performance. So this time, specifically targeting these two major knowledge areas, we’ll conduct a detailed analysis for you, peeling back the layers.

This Series uses plain and understandable language and selects a large number of examples to elaborate on the subtleties for you.

🧑‍💻 Target audience:

  • DBA
  • Database developers
  • Students

We will use MySQL as the demonstration database.

The merging and splitting of data pages or secondary index pages (referred to as data pages or index pages) in MySQL InnoDB tables significantly impact overall table performance; the more these operations occur, the greater the impact on data writes.

MySQL provides a data page merge threshold (MERGE_THRESHOLD), which can be manually adjusted in certain scenarios to reduce page merging and splitting.

In InnoDB tables, each data page is 16K by default. The default MERGE_THRESHOLD is 50, with a range of 1 to 50, where the default is also the maximum value. When the record occupancy of a page falls below 50%, MySQL merges the page with adjacent pages to maintain compactness and avoid waste.

1. Trigger Scenarios for the Threshold

Scenario 1:
Page A originally has 100% data occupancy. After some records are deleted, occupancy drops below 50%, triggering the threshold.

Scenario 2:
Page B contains records updated to a shorter form (e.g., a value changes from rpad('I love everyone', 10000, 'add redundant characters') to 'I only love you'). This reduction in record size causes page occupancy to drop below 50%, triggering the threshold.

2. Data Page Merging Explained

After deleting some records, if the remaining records in Page A occupy less than the MERGE_THRESHOLD, and an adjacent page (Page C) also has less than 50% occupancy, MySQL merges Page C’s records into Page A. Page C’s space is then freed up for future inserts, avoiding waste.

3. Data Page Splitting Explained

Consider Pages D and E, each with 49% record occupancy. After merging, Page D’s occupancy becomes 98%, leaving only 2% free space. Similarly, Pages F and H, each at 49% occupancy, merge to 98% on Page F.

When a new insert request comes in with a primary key between Pages D and F, and neither has enough space (only 2% left), Page D must split. A new page (Page I) is created. The original records from Page D and the new insert are sorted and redistributed: Page D is refilled, and the remaining data goes to Page I. Frequent page splitting involves significant overhead due to data migration and new page creation.

4. Practical Use and Monitoring of MERGE_THRESHOLD

Note: MERGE_THRESHOLD must be in uppercase! MySQL treats lowercase as a simple comment.

4.1 Setting MERGE_THRESHOLD for a Table

To set MERGE_THRESHOLD for an entire table, include the value in the table comment:

1
2
mysql> CREATE TABLE sample1(id INT PRIMARY KEY, r1 INT, r2 VARCHAR(1000)) COMMENT 'MERGE_THRESHOLD=40';
Query OK, 0 rows affected (0.08 sec)

Or modify an existing table:

1
2
3
mysql> ALTER TABLE t1 COMMENT 'MERGE_THRESHOLD=40';
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

4.2 Setting MERGE_THRESHOLD for an Index

You can also set MERGE_THRESHOLD for individual index columns. The index-level setting takes precedence over the table-level setting:

1
2
mysql> CREATE TABLE t1(id INT, KEY idx_id(id) COMMENT 'MERGE_THRESHOLD=40');
Query OK, 0 rows affected (0.08 sec)

Or modify an existing index:

1
2
3
mysql> ALTER TABLE t1 DROP KEY idx_id, ADD KEY idx_id(id) COMMENT 'MERGE_THRESHOLD=40';
Query OK, 0 rows affected (0.02 sec)
Records: 0  Duplicates: 0  Warnings: 0

Or create a new index with the threshold:

1
2
3
mysql> CREATE INDEX idx_id ON t1(id) COMMENT 'MERGE_THRESHOLD=40';
Query OK, 0 rows affected (0.05 sec)
Records: 0  Duplicates: 0  Warnings: 0

4.3 Viewing MERGE_THRESHOLD Settings

Check the table definition:

1
2
3
4
mysql> SHOW CREATE TABLE sample1\G
...
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci COMMENT='MERGE_THRESHOLD=40'
1 row in set (0.00 sec)

Check the index definition:

1
2
3
4
5
mysql> SHOW INDEX FROM t1\G
...
     Comment:
Index_comment: MERGE_THRESHOLD=40
1 row in set (0.00 sec)

Or query the data dictionary table:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
mysql> SELECT a.name AS tablename, b.name AS index_name, b.MERGE_THRESHOLD
    FROM innodb_tables AS a, innodb_indexes AS b
    WHERE a.table_id = b.table_id AND a.name LIKE 'ytt%';
+-------------+-----------------+-----------------+
| tablename   | index_name      | MERGE_THRESHOLD |
+-------------+-----------------+-----------------+
| ytt/sample1 | PRIMARY         |              40 |
| ytt/t1      | GEN_CLUST_INDEX |              50 |
| ytt/t1      | idx_id          |              40 |
+-------------+-----------------+-----------------+
3 rows in set (0.00 sec)

4.4 Evaluating the Effect of MERGE_THRESHOLD Settings

The innodb_metrics table provides two counters to track page merging. These counters are disabled by default and need to be explicitly enabled.

1
2
3
4
5
6
7
8
9
mysql> SELECT NAME, COMMENT FROM INFORMATION_SCHEMA.INNODB_METRICS
    -> WHERE NAME LIKE '%index_page_merge%';
+-----------------------------+----------------------------------------+
| NAME                        | COMMENT                                |
+-----------------------------+----------------------------------------+
| index_page_merge_attempts   | Number of index page merge attempts    |
| index_page_merge_successful | Number of successful index page merges |
+-----------------------------+----------------------------------------+
2 rows in set (0.00 sec)

Enable these counters:

1
2
3
4
mysql> SET GLOBAL innodb_monitor_enable='index_page_merge_attempts';
Query OK, 0 rows affected (0.00 sec)
mysql> SET GLOBAL innodb_monitor_enable='index_page_merge_successful';
Query OK, 0 rows affected (0.00 sec)

Create two tables with MERGE_THRESHOLD set to default and 20, respectively, and import 10,000 records to compare page merging.

1
2
3
4
mysql> CREATE TABLE t1_max(id INT PRIMARY KEY, r1 INT, KEY idx_r1 (r1));
Query OK, 0 rows affected (0.08 sec)
mysql> CREATE TABLE t1_min(id INT, PRIMARY KEY (id) COMMENT 'MERGE_THRESHOLD=20');
Query OK, 0 rows affected (0.08 sec)

Before comparison, reset the counters:

1
2
3
4
5
6
7
8
mysql> SET GLOBAL innodb_monitor_disable='index_page_merge_attempts';
Query OK, 0 rows affected (0.00 sec)
mysql> SET GLOBAL innodb_monitor_disable='index_page_merge_successful';
Query OK, 0 rows affected (0.00 sec)
mysql> SET GLOBAL innodb_monitor_reset_all;
Query OK, 0 rows affected (0.00 sec)
mysql> SET GLOBAL innodb_monitor_enable='index_page_merge_attempts', 'index_page_merge_successful';
Query OK, 0 rows affected (0.00 sec)

Insert 1,000 random records into t1_max:

1
2
3
4
5
6
7
8
9
mysql> REPLACE INTO t1_max SELECT CEIL(RAND()*1000), CEIL(RAND()*100);
...
mysql> SELECT COUNT(*) FROM t1_max;
+----------+
| COUNT(*) |
+----------+
|     1000 |
+----------+
1 row in set (0.03 sec)

Delete 500 records:

1
2
mysql> DELETE FROM t1_max LIMIT 500;
Query OK, 500 rows affected (0.05 sec)

Check the counter results: 707 merge attempts, 20 successful merges.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
mysql> SELECT NAME, COUNT, MAX_COUNT, AVG_COUNT
    FROM INFORMATION_SCHEMA.INNODB_METRICS
    WHERE NAME LIKE '%index_page_merge%';
+-----------------------------+-------+-----------+--------------------+
| NAME                        | COUNT | MAX_COUNT | AVG_COUNT          |
+-----------------------------+-------+-----------+--------------------+
| index_page_merge_attempts   |   707 |       707 |  0.250000000000000 |
| index_page_merge_successful |    20 |        20 |  0.072992700729927 |
+-----------------------------+-------+-----------+--------------------+
2 rows in set (0.00 sec)

Reset counters and perform the same operations on t1_min:

1
2
3
4
5
6
7
8
9
mysql> SELECT COUNT(*) FROM t1_min;
+----------+
| COUNT(*) |
+----------+
|      500 |
+----------+
1 row in set (0.02 sec)
mysql> DELETE FROM t1_min LIMIT 500;
Query OK, 500 rows affected (0.02 sec)

Check the counter results: 30 merge attempts, only 2 successful merges.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
mysql> SELECT NAME, COUNT, MAX_COUNT, AVG_COUNT
    FROM INFORMATION_SCHEMA.INNODB_METRICS
    WHERE NAME LIKE '%index_page_merge%';
+-----------------------------+-------+-----------+---------------------+
| NAME                        | COUNT | MAX_COUNT | AVG_COUNT           |
+-----------------------------+-------+-----------+---------------------+
| index_page_merge_attempts   |    30 |        30 |  0.100000000000000  |
| index_page_merge_successful |     2 |         2 | 0.03333333333333333 |
+-----------------------------+-------+-----------+---------------------+
2 rows in set (0.00 sec)

When setting MERGE_THRESHOLD to the minimum value of 1, almost no merging occurs:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
mysql> SELECT NAME, COUNT, MAX_COUNT, AVG_COUNT
    FROM INFORMATION_SCHEMA.INNODB_METRICS
    WHERE NAME LIKE '%index_page_merge%';
+-----------------------------+-------+-----------+-----------+
| NAME                        | COUNT | MAX_COUNT | AVG_COUNT |
+-----------------------------+-------+-----------+-----------+
| index_page_merge_attempts   |     0 |      NULL |         0 |
| index_page_merge_successful |     0 |      NULL |         0 |
+-----------------------------+-------+-----------+-----------+
2 rows in set (0.00 sec)

Summary

This article introduces the concept of the MySQL index page merge threshold and evaluates its impact on index page merging and splitting in practical environments. Feedback and suggestions are welcome.

👋 See you in the next lesson.

What is SQLFlash?

SQLFlash is your AI-powered SQL Optimization Partner.

Based on AI models, we accurately identify SQL performance bottlenecks and optimize query performance, freeing you from the cumbersome SQL tuning process so you can fully focus on developing and implementing business logic.

How to use SQLFlash in a database?

Ready to elevate your SQL performance?

Join us and experience the power of SQLFlash today!.