From c03e3f4ebf6f04c5492b3c45a34f0dcc85bb261d Mon Sep 17 00:00:00 2001
From: Morgan Tocker
Date: Fri, 11 Oct 2013 23:23:54 -0400
Subject: Clarified that utf-8 implimentation is standard (called the bmp) even if it is limited.

The 4 byte version is available, but the fact that it's always a variable
charset, but you have to choose which variable charset makes the argument
somewhat true. This is intentional on mysql's behalf of course, since we do
actually offer fileformat backwards compatibility.
---
 wiki/mysql/choose-something-else.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

(limited to 'wiki')

diff --git a/wiki/mysql/choose-something-else.md b/wiki/mysql/choose-something-else.md
index 795dcc7..ceb3966 100644
--- a/wiki/mysql/choose-something-else.md
+++ b/wiki/mysql/choose-something-else.md
@@ -65,9 +65,9 @@ familiar with other SQL implementations).
   states](http://dev.mysql.com/doc/refman/5.5/en/server-sql-mode.html), making
   it harder to carry expectations from manual testing over to code or from tool
   to tool.
-* MySQL uses non-standard and rather unique interpretations of several common
-  character encodings, including UTF-8 and Latin-1. Implementation details of
-  these encodings within MySQL, such as the `utf8` encoding's MySQL-specific
+* MySQL recommends UTF-8 as a character-set, but still defaults to Latin-1. The implimentation
+of `utf8` up until MySQL 5.5 was only the 3-byte [BMP](http://en.wikipedia.org/wiki/Basic_Multilingual_Plane#Basic_Multilingual_Plane). MySQL 5.5 and beyond supports a 4-byte `utf8`, but confusingly must be set with the character-set `utf8mb4`. Implementation details of
+  these encodings within MySQL, such as the `utf8` 3-byte limit,
   tend to leak out into client applications. Data that does not fit MySQL's
   understanding of the storage encoding will be transformed until it does, by
   truncation or replacement, by default.
-- 
cgit v1.2.3
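
Reviewer note: the 3-byte limit described in the added lines can be checked on the client side before data reaches MySQL. The sketch below is only an illustration and is not part of this patch. The helper names `fits_in_three_byte_utf8` and `supplementary_chars` are hypothetical, and the code never talks to a server. It relies only on the fact that code points above U+FFFF (outside the BMP) need 4 bytes in UTF-8.

```python
# Illustrative sketch only: helper names are hypothetical, not a MySQL API.

BMP_MAX = 0xFFFF  # last code point of the Basic Multilingual Plane


def fits_in_three_byte_utf8(text: str) -> bool:
    """Return True if every character is in the BMP (at most 3 UTF-8 bytes)."""
    return all(ord(ch) <= BMP_MAX for ch in text)


def supplementary_chars(text: str) -> list:
    """List the characters that would need a 4-byte encoding such as utf8mb4."""
    return [ch for ch in text if ord(ch) > BMP_MAX]


if __name__ == "__main__":
    for sample in ["plain ascii", "na\u00efve caf\u00e9", "party \U0001F389"]:
        status = "fits" if fits_in_three_byte_utf8(sample) else "needs 4 bytes"
        print(repr(sample), status, supplementary_chars(sample))
```

A check like this, run before an insert into a column declared as `utf8`, would flag the values that are at risk of the truncation or replacement the wiki text describes. Declaring the column as `utf8mb4` avoids the problem at the source.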