MySQL Character Sets: utf8, utf8mb3, and utf8mb4


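To understand MySQL character sets, start with how the official documentation describes them. If your English is reasonably good, the manual is not hard to read, and its search box will turn up the relevant explanations. I have collected the key passages here for later reference. Character sets are covered in the following chapter of the manual: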
Chapter 10 Character Sets, Collations, Unicode

https://dev.mysql.com/doc/refman/5.6/en/charset.html

String expressions have a repertoire attribute, which can have two values:

ASCII: The expression can contain only characters in the Unicode range U+0000 to U+007F.

UNICODE: The expression can contain characters in the Unicode range U+0000 to U+10FFFF. This includes characters in the Basic Multilingual Plane (BMP) range (U+0000 to U+FFFF) and supplementary characters outside the BMP range (U+10000 to U+10FFFF).
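
The two repertoire values can be illustrated with a short Python sketch. The function names `repertoire` and `is_supplementary` are hypothetical; the code only mirrors the code-point ranges quoted above and is not how MySQL computes the attribute internally.

```python
def repertoire(text: str) -> str:
    """Classify a string by the code-point ranges quoted above.

    Hypothetical helper: it mirrors the documented ranges only and is
    not MySQL's internal implementation.
    """
    # ASCII repertoire: every character lies in U+0000..U+007F.
    if all(ord(ch) <= 0x7F for ch in text):
        return "ASCII"
    # Anything else falls in the wider U+0000..U+10FFFF range.
    return "UNICODE"


def is_supplementary(ch: str) -> bool:
    """True for a character outside the BMP (U+10000..U+10FFFF)."""
    return ord(ch) > 0xFFFF


print(repertoire("abc"))               # ASCII
print(repertoire("caf\u00e9"))         # UNICODE (U+00E9 is in the BMP)
print(is_supplementary("\U0001F600"))  # True (U+1F600 is outside the BMP)
```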

In CREATE DATABASE and ALTER DATABASE statements, the keyword SCHEMA can be used instead of DATABASE.

All database options are stored in a text file named db.opt that can be found in the database directory.

The CHARACTER SET and COLLATE clauses make it possible to create databases with different character sets and collations on the same MySQL server.

The CREATE TABLE and ALTER TABLE statements have optional clauses for specifying the table character set and collation:
CREATE TABLE tbl_name (column_list) [[DEFAULT] CHARACTER SET charset_name] [COLLATE collation_name]
ALTER TABLE tbl_name [[DEFAULT] CHARACTER SET charset_name] [COLLATE collation_name]

The utf8mb3 and utf8mb4 character sets differ as follows:

utf8mb3 supports only characters in the Basic Multilingual Plane (BMP). utf8mb4 additionally supports supplementary characters that lie outside the BMP.

utf8mb3 uses a maximum of three bytes per character. utf8mb4 uses a maximum of four bytes per character.
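
The byte counts can be checked with Python's built-in UTF-8 encoder. This is a rough sketch: `utf8_len` and `fits_utf8mb3` are hypothetical helper names, and the check assumes that the BMP limit described above is the only restriction that matters for utf8mb3.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes the character occupies in standard UTF-8."""
    return len(ch.encode("utf-8"))


def fits_utf8mb3(text: str) -> bool:
    """Hypothetical check: True if every character is in the BMP,
    i.e. needs at most three UTF-8 bytes."""
    return all(ord(ch) <= 0xFFFF for ch in text)


print(utf8_len("A"))                       # 1
print(utf8_len("\u4e2d"))                  # 3 (CJK character in the BMP)
print(utf8_len("\U0001F600"))              # 4 (supplementary character)
print(fits_utf8mb3("MySQL \u4e2d\u6587"))  # True
print(fits_utf8mb3("hi \U0001F600"))       # False
```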

This discussion refers to the utf8mb3 and utf8mb4 character set names to be explicit about referring to 3-byte and 4-byte UTF-8 character set data. The exception is that in table definitions, utf8 is used because MySQL converts instances of utf8mb3 specified in such definitions to utf8, which is an alias for utf8mb3.

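Converting from utf8 (utf8mb3) to utf8mb4 is straightforward:

1. After conversion to utf8mb4, the data can include supplementary characters.
2. Converting from utf8 (utf8mb3) to utf8mb4 may increase the storage space the data requires.
3. A BMP character has the same code value, the same encoding, and the same length in utf8mb4 as in utf8 (utf8mb3), so nothing changes for it.
4. A supplementary character is stored in four bytes in utf8mb4. Because utf8mb3 cannot store supplementary characters, existing utf8mb3 data contains none, so there is no risk of a character failing to convert.
5. Table definitions may need adjusting. For the variable-length character data types (VARCHAR and the TEXT types), the maximum permitted length in characters is smaller for utf8mb4 columns than for utf8mb3 columns. For all character data types (CHAR, VARCHAR, and the TEXT types), the maximum number of characters that can be indexed is also smaller for utf8mb4 columns. Before converting, check the column types and lengths so that the converted table and index data do not exceed the maximum number of bytes the column definition or index can hold. For InnoDB, the maximum index key prefix length is 767 bytes, which allows 255 characters to be indexed with utf8mb3 but only 191 with utf8mb4. If an indexed column no longer fits after conversion, a different column has to be indexed instead. The manual also describes how the COMPRESSED and DYNAMIC row formats allow longer index prefixes.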
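
The pre-conversion check in item 5 can be sketched in Python as follows. The table, its column names, and the helper `columns_over_limit` are made up for illustration; the sketch assumes each indexed column is indexed over its full declared length and that the 767-byte limit applies.

```python
INNODB_PREFIX_LIMIT = 767  # bytes, the default limit quoted above

# Maximum bytes per character for each character set.
MAX_BYTES = {"utf8mb3": 3, "utf8mb4": 4}


def columns_over_limit(indexed_columns, charset, limit=INNODB_PREFIX_LIMIT):
    """Return names of indexed columns whose worst-case byte length
    exceeds the index prefix limit under the given character set.

    indexed_columns maps a column name to its declared length in
    characters, e.g. VARCHAR(255) -> 255. The names are made up.
    """
    width = MAX_BYTES[charset]
    return [name for name, chars in indexed_columns.items()
            if chars * width > limit]


# Hypothetical table: column name -> declared character length.
columns = {"email": 255, "username": 191, "token": 64}

print(columns_over_limit(columns, "utf8mb3"))  # []
print(columns_over_limit(columns, "utf8mb4"))  # ['email']
```

Here a fully indexed VARCHAR(255) column fits under utf8mb3 (765 bytes) but not under utf8mb4 (1020 bytes), which is the situation the list item warns about.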

For InnoDB tables that use COMPRESSED or DYNAMIC row format, you can enable the innodb_large_prefix option to permit index key prefixes longer than 767 bytes (up to 3072 bytes). (Creating such tables also requires the option values innodb_file_format=barracuda and innodb_file_per_table=true.) In this case, enabling the innodb_large_prefix option enables you to index a maximum of 1024 or 768 characters for utf8mb3 or utf8mb4 columns, respectively. For related information, see Section 14.8.1.7, “Limits on InnoDB Tables”.
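
The character counts quoted for both limits follow from integer division of the byte limit by the maximum bytes per character. A minimal Python check, using a hypothetical helper name `max_indexable_chars`:

```python
def max_indexable_chars(limit_bytes: int, bytes_per_char: int) -> int:
    """Largest whole number of characters that fits in the byte limit."""
    return limit_bytes // bytes_per_char


for limit in (767, 3072):
    mb3 = max_indexable_chars(limit, 3)
    mb4 = max_indexable_chars(limit, 4)
    print(f"{limit} bytes: utf8mb3={mb3} chars, utf8mb4={mb4} chars")
# 767 bytes: utf8mb3=255 chars, utf8mb4=191 chars
# 3072 bytes: utf8mb3=1024 chars, utf8mb4=768 chars
```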
