(mysql.info) charset-mysql
10.2 Character Sets and Collations in MySQL
===========================================
The MySQL server can support multiple character sets. To list the
available character sets, use the `SHOW CHARACTER SET' statement. A
partial listing follows. For more complete information, see the
`charset-charsets' section.
mysql> SHOW CHARACTER SET;
+----------+-----------------------------+---------------------+--------+
| Charset  | Description                 | Default collation   | Maxlen |
+----------+-----------------------------+---------------------+--------+
| big5     | Big5 Traditional Chinese    | big5_chinese_ci     |      2 |
| dec8     | DEC West European           | dec8_swedish_ci     |      1 |
| cp850    | DOS West European           | cp850_general_ci    |      1 |
| hp8      | HP West European            | hp8_english_ci      |      1 |
| koi8r    | KOI8-R Relcom Russian       | koi8r_general_ci    |      1 |
| latin1   | cp1252 West European        | latin1_swedish_ci   |      1 |
| latin2   | ISO 8859-2 Central European | latin2_general_ci   |      1 |
| swe7     | 7bit Swedish                | swe7_swedish_ci     |      1 |
| ascii    | US ASCII                    | ascii_general_ci    |      1 |
| ujis     | EUC-JP Japanese             | ujis_japanese_ci    |      3 |
| sjis     | Shift-JIS Japanese          | sjis_japanese_ci    |      2 |
| hebrew   | ISO 8859-8 Hebrew           | hebrew_general_ci   |      1 |
| tis620   | TIS620 Thai                 | tis620_thai_ci      |      1 |
| euckr    | EUC-KR Korean               | euckr_korean_ci     |      2 |
| koi8u    | KOI8-U Ukrainian            | koi8u_general_ci    |      1 |
| gb2312   | GB2312 Simplified Chinese   | gb2312_chinese_ci   |      2 |
| greek    | ISO 8859-7 Greek            | greek_general_ci    |      1 |
| cp1250   | Windows Central European    | cp1250_general_ci   |      1 |
| gbk      | GBK Simplified Chinese      | gbk_chinese_ci      |      2 |
| latin5   | ISO 8859-9 Turkish          | latin5_turkish_ci   |      1 |
...
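When only a few character sets are of interest, the listing can be narrowed with a `LIKE' clause, in the same way as for `SHOW COLLATION'. The following is a minimal sketch; the pattern `latin%' is only an illustration, and the rows returned depend on which character sets the server was built with.

```sql
-- List only the character sets whose names begin with 'latin'.
-- The result has the same four columns as the full listing:
-- Charset, Description, Default collation, Maxlen.
SHOW CHARACTER SET LIKE 'latin%';
```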
Any given character set always has at least one collation. It may have
several collations. To list the collations for a character set, use the
`SHOW COLLATION' statement. For example, to see the collations for the
`latin1' (cp1252 West European) character set, use this statement to
find those collation names that begin with `latin1':
mysql> SHOW COLLATION LIKE 'latin1%';
+---------------------+---------+----+---------+----------+---------+
| Collation           | Charset | Id | Default | Compiled | Sortlen |
+---------------------+---------+----+---------+----------+---------+
| latin1_german1_ci   | latin1  |  5 |         |          |       0 |
| latin1_swedish_ci   | latin1  |  8 | Yes     | Yes      |       1 |
| latin1_danish_ci    | latin1  | 15 |         |          |       0 |
| latin1_german2_ci   | latin1  | 31 |         | Yes      |       2 |
| latin1_bin          | latin1  | 47 |         | Yes      |       1 |
| latin1_general_ci   | latin1  | 48 |         |          |       0 |
| latin1_general_cs   | latin1  | 49 |         |          |       0 |
| latin1_spanish_ci   | latin1  | 94 |         |          |       0 |
+---------------------+---------+----+---------+----------+---------+
The `latin1' collations have the following meanings:
*Collation*          *Meaning*
`latin1_german1_ci'  German DIN-1
`latin1_swedish_ci'  Swedish/Finnish
`latin1_danish_ci'   Danish/Norwegian
`latin1_german2_ci'  German DIN-2
`latin1_bin'         Binary according to `latin1' encoding
`latin1_general_ci'  Multilingual (Western European)
`latin1_general_cs'  Multilingual (ISO Western European), case
                     sensitive
`latin1_spanish_ci'  Modern Spanish
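The collation list can also be filtered by column value instead of by name pattern. The statements below are a sketch that assumes a server version supporting the `WHERE' clause of `SHOW COLLATION' and providing the `INFORMATION_SCHEMA' database (MySQL 5.0 and later); older servers support only the `LIKE' form.

```sql
-- Filter on the Charset column rather than on the collation name.
SHOW COLLATION WHERE Charset = 'latin1';

-- The same information, queried as an ordinary table.
-- IS_DEFAULT is 'Yes' for the default collation of the character set.
SELECT COLLATION_NAME, CHARACTER_SET_NAME, ID, IS_DEFAULT
  FROM INFORMATION_SCHEMA.COLLATIONS
 WHERE CHARACTER_SET_NAME = 'latin1'
 ORDER BY ID;
```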
Collations have these general characteristics:
* Two different character sets cannot have the same collation.
* Each character set has one collation that is the _default
  collation_. For example, the default collation for `latin1' is
  `latin1_swedish_ci'. The output for `SHOW CHARACTER SET' indicates
  which collation is the default for each displayed character set.
* There is a convention for collation names: They start with the
  name of the character set with which they are associated, they
  usually include a language name, and they end with `_ci' (case
  insensitive), `_cs' (case sensitive), or `_bin' (binary).
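The effect of the name suffixes can be seen by comparing two strings under different collations. This is an illustrative sketch: it uses the `_latin1' introducer so that the literals are `latin1' strings whatever the connection character set is, and it assumes that `latin1_general_cs' is available on the server (the listing above shows it as not compiled in, so some builds load it from the character set files).

```sql
-- _ci: letter case is ignored, so the strings compare equal.
SELECT _latin1'a' = _latin1'A' COLLATE latin1_swedish_ci;  -- returns 1

-- _cs: letter case is significant, so the strings differ.
SELECT _latin1'a' = _latin1'A' COLLATE latin1_general_cs;  -- returns 0

-- _bin: comparison is by the encoded byte values, so the strings differ.
SELECT _latin1'a' = _latin1'A' COLLATE latin1_bin;         -- returns 0
```

Each of these collations belongs to `latin1', so applying one of them to a string in another character set produces an error.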