Lesson 10 of the SQL Optimization Course: How to Use Compressed Tables in MySQL? | SQLFlash

Introduction

For relational databases, the design of tables and SQL is written are particularly crucial. It wouldn’t be an exaggeration to say that they account for 90% of performance. So this time, specifically targeting these two major knowledge areas, we’ll conduct a detailed analysis for you, peeling back the layers.

This Series uses plain and understandable language and selects a large number of examples to elaborate on the subtleties for you.

πŸ§‘β€πŸ’» Target audience:

  • DBA
  • Database developers
  • Students

We will use MySQL as the demonstration database.

I. Concept

A compressed table, as the name suggests, is simply a table that has been compressed. It is generated by compressing the original table using a specific compression algorithm and a certain compression ratio.

1.1 Products with Strong Compression Capabilities

When it comes to compressed tables, which are essentially tables reduced in size through compression algorithms, certain database products stand out. These include Greenplum (based on PostgreSQL), the early columnar database Inforbright (based on MySQL), and Percona’s TokuDB, all known for their robust compression abilities.

1.2 Why Use Compressed Tables?

Scenario 1: Imagine a 1TB disk that can only hold 10 tables of 100GB each. If these tables are compressed to 10GB each, the same disk can store 100 tables, increasing capacity tenfold.

Scenario 2: MySQL’s default page size is 16K, while most OS file systems use 4K blocks. During dirty page writes, incomplete writes can corrupt data. For example, if a 16K page only writes 12K, the remaining 4K failure leads to data corruption. Even Redo Log recovery would be flawed as it records data page offsets. MySQL uses a DOUBLE WRITE BUFFER to prevent this, but with compressed tables matching the OS block size (e.g., 4K), this buffer becomes unnecessary, eliminating associated overhead.

To check your file system’s block size:

1
2
root@ytt-pc:/home/ytt# tune2fs -l /dev/mapper/ytt--pc--vg-root | grep -i 'block size'
Block size:               4096

1.3 Advantages of Compressed Tables

The primary benefit is reduced disk space usage. Smaller tables mean less resource consumption when data is moved from disk to memory and during network transmission. In short: reduced disk I/O and network I/O.

1.4 Drawbacks of Compressed Tables

Write operations (INSERT, UPDATE, DELETE) on compressed tables consume more CPU resources. This is because writing involves decompressing data, updating it, and recompressing itβ€”two extra steps compared to regular tables. Choosing an optimized compression algorithm is crucial.

1.5 MySQL Supported Compression Algorithms

These algorithms are fundamental not just for compressed tables but also for other compression needs in MySQL, such as client-server data compression, replication compression, and database operation cloning. MySQL supports zlib and zstd, with zlib as the default. You can opt for zstd as well.

1
2
3
4
5
6
7
mysql> select @@protocol_compression_algorithms;
+-----------------------------------+
| @@protocol_compression_algorithms |
+-----------------------------------+
| zlib,zstd,uncompressed            |
+-----------------------------------+
1 row in set (0.00 sec)

To check the zlib version in MySQL:

1
2
3
4
5
6
7
mysql> select @@version_compile_zlib;
+------------------------+
| @@version_compile_zlib |
+------------------------+
| 1.2.11                 |
+------------------------+
1 row in set (0.00 sec)

2. How to Use Compressed Tables in MySQL Storage Engines

MySQL’s single-instance supports compressed tables via two engines: MyISAM and InnoDB.

2.1 MyISAM Engine: Compression at the Table Field Level

Creating a MyISAM table is straightforward. For example:

1
2
mysql> create table n1(id int,r1 text,r2 text,key idx_id(id),key idx_r1(r1(10))) engine myisam;
Query OK, 0 rows affected (0.01 sec)
After inserting 100,000 records, the data size is 116M and the index size is 1.4M.

1
2
3
root@ytt-pc:/var/lib/mysql/3304/ytt# ls -sihl n1.{MYD,MYI}
3539537 116M -rw-r----- 1 mysql mysql 116M 3月  31 11:46 n1.MYD
3539536 1.4M -rw-r----- 1 mysql mysql 1.4M 3月  31 11:48 n1.MYI

MySQL’s myisampack utility compresses MyISAM table data (not indexes):

1
2
3
4
root@ytt-pc:/var/lib/mysql/3304/ytt# myisampack n1 -v
Compressing n1.MYD: (100000 records)
- Calculating statistics
...

After compression, rebuild the indexes:

1
2
3
4
5
6
root@ytt-pc:/var/lib/mysql/3304/ytt# myisamchk -rq n1
- check record delete-chain
- recovering (with sort) MyISAM-table 'n1'
Data records: 100000
- Fixing index 1
- Fixing index 2

The compressed data size is 31M, a 4x reduction from the original 116M, while the index remains 1.4M.

1
2
3
root@ytt-pc:/var/lib/mysql/3304/ytt# ls -sihl n1.{MYD,MYI}
3539542 31M -rw-r----- 1 mysql mysql 31M 3月 31 11:46 n1.MYD
3539536 1.4M -rw-r----- 1 mysql mysql 1.4M 3月 31 11:48 n1.MYI

MyISAM compressed tables are ideal for read-only scenarios!

2.2 InnoDB Engine: Compression at the Page Level

InnoDB compresses both data and indexes at the page level. Page sizes are 1K, 2K, 4K, 8K, 16K, 32K, 64K (default 16K; 32K and 64K don’t support compression). Compression is set via key_block_size, with row_format=compressed implying key_block_size=8.

Creating a table with default 8K page size:

1
2
mysql> create table t1(id int primary key, r1 varchar(200),r2 text);
Query OK, 0 rows affected (0.07 sec)

After inserting 10,000 records, the data file is 22M.

1
2
3
root@ytt-pc:/var/lib/mysql/3304/ytt# ls -sihl
Total 22M
3539514 22M -rw-r----- 1 mysql mysql 21M 3月 30 22:26 t1.ibd
1
2
3
mysql> alter table t1 row_format=compressed;
Query OK, 0 rows affected (3.99 sec)
Records: 0 Duplicates: 0 Warnings: 0

The data file size reduces to 10M, a 50% compression rate.

1
2
3
root@ytt-pc:/var/lib/mysql/3304/ytt# ls -sihl
Total 11M
3539513 11M -rw-r----- 1 mysql mysql 10M 3月 30 22:27 t1.ibd

Creating a table with 4K page size:

1
2
mysql> create table t2(id int primary key, r1 varchar(200),r2 text) key_block_size=4;
Query OK, 0 rows affected (0.07 sec)

General Tablespaces only support compression tables matching the tablespace file block size. For example, a tablespace with 4K block size requires key_block_size=4.

1
2
3
4
5
6
7
mysql> create tablespace ytt_ts1 add datafile 'ytt_ts1.ibd' file_block_size=4K;
Query OK, 0 rows affected (0.05 sec)
mysql> create table t3 like t2;
Query OK, 0 rows affected (0.06 sec)
mysql> alter table t3 tablespace ytt_ts1;
Query OK, 0 rows affected (0.09 sec)
Records: 0 Duplicates: 0 Warnings: 0

Attempting to use a mismatched page size (8K) with a 4K tablespace results in an error:

1
2
3
4
5
6
7
8
9
mysql> create table t4 like t2;
Query OK, 0 rows affected (0.05 sec)
mysql> alter table t4 key_block_size=8;
Query OK, 0 rows affected (0.15 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> alter table t4 tablespace ytt_ts1;
ERROR 1478 (HY000): InnoDB: Tablespace `ytt_ts1` uses block size 4096 and cannot contain a table with physical page size 8192
mysql> drop table t4;
Query OK, 0 rows affected (0.05 sec)

3. Impact of Compressed Tables on B-tree Pages and InnoDB Buffer Pool

3.1 Impact on B-tree Pages

  1. Each compressed B-tree page must contain at least one record, more flexible than regular pages which require at least two.
  2. Modification Log: InnoDB uses a 16K modification log within each compressed page to handle writes efficiently. This log reserves space for updates, avoiding frequent decompression, updates, and recompression. When the log fills (controlled by innodb_compression_pad_pct_max, default 50%), the page is recompressed. If recompression fails, page splitting and merging occur.

3.2 Impact on InnoDB Buffer Pool

Compressed pages in the Buffer Pool coexist with uncompressed pages. When a record is read from a compressed table, both the compressed page (1K, 2K, 4K, 8K) and the uncompressed page (16K) are loaded into the Buffer Pool to minimize unnecessary decompressions. If the Buffer Pool is full, the original page is evicted, retaining the compressed page. In extreme cases, both may be evicted.

Important Notes:

  1. System tablespace doesn’t support compressed tables.
  2. General tablespace can’t mix compressed and uncompressed tables.
  3. row_format=compressed applies to table and index compression, not row-level.
  4. Temporary tables don’t support compression.
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
mysql> create temporary table tmp_t1(id int,r1 text,r2 text) row_format=compressed;
ERROR 3500 (HY000): CREATE TEMPORARY TABLE is not allowed with ROW_FORMAT=COMPRESSED or KEY_BLOCK_SIZE.
mysql> show errors\G
*************************** 1. row ***************************
Level: Note
ErrorCode: 3500
Message: CREATE TEMPORARY TABLE is not allowed with ROW_FORMAT=COMPRESSED or KEY_BLOCK_SIZE.
*************************** 2. row ***************************
Level: Note
ErrorCode: 1031
Message: Table storage engine for 'tmp_t1' doesn't have this option
2 rows in set (0.00 sec)

This article covers MySQL compressed tables’ concepts, pros and cons, and their impact on index trees, with examples for MyISAM and InnoDB engines. The next chapter will delve deeper into compressed table monitoring.

πŸ‘‹ See you in the next lesson.

What is SQLFlash?

SQLFlash is your AI-powered SQL Optimization Partner.

Based on AI models, we accurately identify SQL performance bottlenecks and optimize query performance, freeing you from the cumbersome SQL tuning process so you can fully focus on developing and implementing business logic.

How to use SQLFlash in a database?

Ready to elevate your SQL performance?

Join us and experience the power of SQLFlash today!.