Lesson 10 of the SQL Optimization Course: How to Use Compressed Tables in MySQL?

For relational databases, table design and the way SQL is written are particularly crucial; it is no exaggeration to say they account for 90% of performance. This lesson targets these two major knowledge areas and analyzes them in detail, peeling back the layers one by one.
This series uses plain, understandable language and a large number of examples to walk you through the subtleties.
🧑‍💻 Target audience:
We will use MySQL as the demonstration database.
A compressed table, as the name suggests, is simply a table that has been compressed. It is generated by compressing the original table using a specific compression algorithm and a certain compression ratio.
Several database products are known for strong table compression: Greenplum (based on PostgreSQL), the early columnar database Infobright (based on MySQL), and Percona’s TokuDB.
Scenario 1: Imagine a 1TB disk that can only hold 10 tables of 100GB each. If these tables are compressed to 10GB each, the same disk can store 100 tables, increasing capacity tenfold.
Scenario 2: MySQL’s default page size is 16K, while most OS file systems use 4K blocks. During dirty page writes, incomplete writes can corrupt data. For example, if a 16K page only writes 12K, the remaining 4K failure leads to data corruption. Even Redo Log recovery would be flawed as it records data page offsets. MySQL uses a DOUBLE WRITE BUFFER to prevent this, but with compressed tables matching the OS block size (e.g., 4K), this buffer becomes unnecessary, eliminating associated overhead.
To check your file system’s block size:
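The original command was not preserved; on Linux with GNU coreutils, one way to do this is:

```shell
# Print the fundamental block size of the file system
# holding the current directory (GNU coreutils `stat`).
stat -f -c %S .
# Typically 4096 on ext4/xfs
```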
The primary benefit is reduced disk space usage. Smaller tables mean less resource consumption when data is moved from disk to memory and during network transmission. In short: reduced disk I/O and network I/O.
Write operations (INSERT, UPDATE, DELETE) on compressed tables consume more CPU. Writing involves decompressing the data, updating it, and recompressing it: two extra steps compared to regular tables. Choosing an appropriate compression algorithm is therefore crucial.
These algorithms matter not just for compressed tables but also for other compression features in MySQL, such as client-server traffic compression, replication compression, and clone operations. MySQL supports zlib and zstd, with zlib as the default; you can opt for zstd instead.
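As a sketch, the algorithms permitted for client connections can be inspected and restricted via the `protocol_compression_algorithms` system variable (available in MySQL 8.0.18 and later):

```sql
-- Show which compression algorithms the server allows for client connections
SHOW GLOBAL VARIABLES LIKE 'protocol_compression_algorithms';
-- Restrict connections to zstd, keeping uncompressed as a fallback
SET GLOBAL protocol_compression_algorithms = 'zstd,uncompressed';
```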
To check the zlib version in MySQL:
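One way (available since MySQL 8.0.11) is the `version_compile_zlib` system variable:

```sql
-- Report the zlib version MySQL was compiled against
SHOW VARIABLES LIKE 'version_compile_zlib';
-- e.g. 1.2.x, depending on the build
```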
MySQL itself supports compressed tables via two storage engines: MyISAM and InnoDB.
Creating a MyISAM table is straightforward. For example:
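The original statement was not preserved; a minimal sketch might look like this (table and column names are illustrative):

```sql
-- A plain MyISAM table; myisampack will later compress its data file
CREATE TABLE t1 (
  id   INT NOT NULL PRIMARY KEY,
  name VARCHAR(64),
  note VARCHAR(255)
) ENGINE = MyISAM;
```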
MySQL’s myisampack utility compresses MyISAM table data (not indexes):
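A typical invocation points myisampack at the table’s index file in the data directory (the path below is illustrative):

```shell
# Compress the data file (.MYD) of MyISAM table t1;
# the table is addressed via its .MYI index file.
myisampack /var/lib/mysql/test/t1.MYI
```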
After compression, rebuild the indexes:
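This is done with myisamchk (path illustrative):

```shell
# Rebuild the indexes of the packed table (-r repair, -q quick)
myisamchk -rq /var/lib/mysql/test/t1.MYI
```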
The compressed data file is 31M, nearly a 4x reduction from the original 116M, while the index file remains 1.4M.
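The sizes can be confirmed from the table’s files in the data directory (path illustrative):

```shell
# .MYD holds the data, .MYI holds the indexes
ls -lh /var/lib/mysql/test/t1.*
```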
MyISAM compressed tables are ideal for read-only scenarios!
InnoDB compresses both data and indexes at the page level. The instance page size (innodb_page_size) can be 4K, 8K, 16K (the default), 32K, or 64K, but 32K and 64K pages do not support compression. The compressed page size is set per table via KEY_BLOCK_SIZE (1, 2, 4, 8, or 16); ROW_FORMAT=COMPRESSED without an explicit KEY_BLOCK_SIZE implies KEY_BLOCK_SIZE=8.
Creating a table with default 8K page size:
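A sketch, since the original statement was not preserved (names illustrative):

```sql
-- ROW_FORMAT=COMPRESSED without an explicit KEY_BLOCK_SIZE
-- implies KEY_BLOCK_SIZE=8, i.e. an 8K compressed page
CREATE TABLE t2 (
  id  INT NOT NULL PRIMARY KEY,
  pad VARCHAR(255)
) ENGINE = InnoDB ROW_FORMAT = COMPRESSED;
```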
After inserting 10,000 records, the data file is 22M.
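One way to check the on-disk size from inside MySQL 8.0, rather than with `ls`, is `information_schema.INNODB_TABLESPACES` (the schema/table name below is illustrative):

```sql
-- FILE_SIZE is the allocated size of the table's .ibd file in bytes
SELECT name, file_size
FROM information_schema.INNODB_TABLESPACES
WHERE name = 'test/t2';
```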
The data file size drops to 10M, a saving of roughly 50%.
Creating a table with 4K page size:
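A sketch (names illustrative):

```sql
-- Explicit 4K compressed page size
CREATE TABLE t3 (
  id  INT NOT NULL PRIMARY KEY,
  pad VARCHAR(255)
) ENGINE = InnoDB ROW_FORMAT = COMPRESSED KEY_BLOCK_SIZE = 4;
```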
General tablespaces only support compressed tables whose page size matches the tablespace’s FILE_BLOCK_SIZE. For example, a tablespace created with a 4K block size requires KEY_BLOCK_SIZE=4.
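A sketch of a matching pair (tablespace and table names are illustrative):

```sql
-- A general tablespace with a 4K physical block size...
CREATE TABLESPACE ts_4k
  ADD DATAFILE 'ts_4k.ibd'
  FILE_BLOCK_SIZE = 4096;

-- ...can only hold compressed tables with a matching KEY_BLOCK_SIZE
CREATE TABLE t4 (
  id INT NOT NULL PRIMARY KEY
) ENGINE = InnoDB
  TABLESPACE = ts_4k
  ROW_FORMAT = COMPRESSED
  KEY_BLOCK_SIZE = 4;
```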
Attempting to use a mismatched page size (8K) with a 4K tablespace results in an error:
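For instance, assuming a general tablespace `ts_4k` created with FILE_BLOCK_SIZE=4096 (name illustrative; error text approximate):

```sql
CREATE TABLE t5 (
  id INT NOT NULL PRIMARY KEY
) ENGINE = InnoDB
  TABLESPACE = ts_4k
  ROW_FORMAT = COMPRESSED
  KEY_BLOCK_SIZE = 8;
-- Fails with an error along the lines of:
-- "InnoDB: Tablespace `ts_4k` uses block size 4096 and cannot
--  contain a table with physical page size 8192"
```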
In the Buffer Pool, compressed pages coexist with their uncompressed copies. When a record is read from a compressed table, both the compressed page (1K, 2K, 4K, or 8K) and its uncompressed 16K counterpart are kept in the Buffer Pool to avoid repeated decompression. When the Buffer Pool fills up, the uncompressed copy is evicted first and the compressed page is retained; in extreme cases both may be evicted.
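This split can be observed via `information_schema.INNODB_BUFFER_PAGE`, where `COMPRESSED_SIZE` is non-zero for compressed pages (note that scanning this table is expensive on busy servers):

```sql
-- Count compressed vs. uncompressed pages currently in the Buffer Pool
SELECT IF(compressed_size > 0, 'compressed', 'uncompressed') AS page_kind,
       COUNT(*) AS pages
FROM information_schema.INNODB_BUFFER_PAGE
GROUP BY page_kind;
```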
This article covers MySQL compressed tables’ concepts, pros and cons, and their impact on index trees, with examples for MyISAM and InnoDB engines. The next chapter will delve deeper into compressed table monitoring.
See you in the next lesson.
SQLFlash is your AI-powered SQL Optimization Partner.
Based on AI models, we accurately identify SQL performance bottlenecks and optimize query performance, freeing you from the cumbersome SQL tuning process so you can fully focus on developing and implementing business logic.
Join us and experience the power of SQLFlash today!