For relational databases, the design of tables and SQL is written are particularly crucial. It wouldn’t be an exaggeration to say that they account for 90% of performance. So this time, specifically targeting these two major knowledge areas, we’ll conduct a detailed analysis for you, peeling back the layers.
This Series uses plain and understandable language and selects a large number of examples to elaborate on the subtleties for you.
🧑💻 Target audience:
DBA
Database developers
Students
We will use MySQL as the demonstration database.
1. Databases and Character Sets
1.1 Specifying Character Sets During Database Creation
When creating a database, explicitly specify the character set and collation. This ensures that the character_set_database and collation_database parameters are set to the specified values. For example, when creating the database ytt_new2 with the character set latin1 and collation latin1_bin, the corresponding system parameters are updated upon switching to this database:
1.2 Modifying Character Sets During Database Alteration
Modifying the character set and collation of a database is similar to specifying them during creation. However, it’s important to note that existing database objects (such as tables, views, and stored procedures) will retain their original character set and collation settings.
When you check the character set of this stored procedure, you’ll see that the collation is latin1_bin and the corresponding character set is latin1, consistent with the database ytt_new2.
mysql>showcreateproceduresp_demo\G***************************1.row***************************Procedure:sp_demo...DatabaseCollation:latin1_bin1rowinset(0.00sec)-- Then, change the database character set to UTF8
mysql>alterdatabaseytt_new2charactersetutf8collateutf8_general_ci;QueryOK,1rowaffected,2warning(0.02sec)mysql>select@@character_set_database,@@collation_database;+--------------------------+----------------------+
|@@character_set_database|@@collation_database|+--------------------------+----------------------+
|utf8|utf8_general_ci|+--------------------------+----------------------+
1rowinset(0.00sec)-- Check the character set of the stored procedure sp_demo again, and it remains the same.
mysql>showcreateproceduresp_demo\G***************************1.row***************************Procedure:sp_demo...DatabaseCollation:latin1_bin1rowinset(0.00sec)-- At this point, calling the stored procedure results in a character set mismatch error.
mysql>callsp_demo('A','I');ERROR1366(HY000):Incorrectstringvalue:'\xE6\x88\x91'forcolumn'f1'atrow1-- To change the character set of the stored procedure, you need to drop and recreate it. After re-executing the stored procedure code, the call works correctly.
mysql>callsp_demo('A','I');+--------+
|result|+--------+
|AI|+--------+
1rowinset(0.00sec)QueryOK,0rowsaffected,2warnings(0.01sec)
1.3 Default Inheritance of Parameters
If the character_set_database and collation_database parameters are not explicitly specified, they will default to the server-level settings of character_set_server and collation_server, respectively.
In this case, when creating or altering a database without specifying a specific character set and collation, these parameters will inherit the server-level settings of character_set_server and collation_server by default.
1
2
3
4
5
mysql>showcreatedatabaseytt_new3\G***************************1.row***************************Database:ytt_new3CreateDatabase:CREATEDATABASE`ytt_new3`/*!40100 DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci *//*!80016 DEFAULT ENCRYPTION='N' */1rowinset(0.00sec)
2. Tables and Character Sets
2.1 Specifying Character Sets During Table Creation (Explicit Setting)
Similar to creating a database, explicitly specifying the character set and collation when creating a table has the highest priority and will be used as the definitive settings.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
-- create new database ytt_new4
mysql>createdatabaseytt_new4;QueryOK,1rowaffected(0.02sec)mysql>useytt_new4;Databasechanged-- create new table t1
mysql>createtablet1(a1int)charsetlatin1collatelatin1_bin;QueryOK,0rowsaffected(0.05sec)mysql>showcreatetablet1\G***************************1.row***************************Table:t1CreateTable:CREATETABLE`t1`(`a1`intDEFAULTNULL)ENGINE=InnoDBDEFAULTCHARSET=latin1COLLATE=latin1_bin1rowinset(0.00sec)
2.2 Inheriting Settings (Implicit Conversion)
By default, tables inherit the character set and collation from the database they belong to, not the current database. It is important to note that this is based on the database where the table resides, not the database currently in use.
A view is essentially a virtual table, and its character set is determined by the client’s character set settings at the time of creation. This means that the character set of a view is entirely dependent on the client connection settings when the view is defined.
Triggers are similar to views in that their character set is also determined by the client settings at the time of creation. This means that the character set of a trigger will match the client’s character set when it is defined.
Like databases and tables, columns can also explicitly specify a particular character set and collation. Although this is possible, it is highly discouraged. The main reason is that having different character sets for each column can complicate data insertion and retrieval, requiring additional SQL statements.
To insert a simple record next, you need to add an introducer for each column; otherwise, an error will occur.
The most troublesome part is querying, as the result of each query can only have one character set. Columns with compatible character sets can be retrieved together, while those with incompatible character sets must be retrieved separately. For example, I wanted to simply use SELECT * to retrieve all records, but found that one of the fields displayed incorrectly.
However, there is an important point to note: if you change the character set of the table at this time, the columns of the table will still retain their original character set.
In summary, this article has detailed the considerations for handling character sets in MySQL databases, tables, and columns, with examples provided. I hope this information is helpful.
SQLFlash is your AI-powered SQL Optimization Partner.
Based on AI models, we accurately identify SQL performance bottlenecks and optimize query performance, freeing you from the cumbersome SQL tuning process so you can fully focus on developing and implementing business logic.