I'm going to store data (mostly Wikipedia page titles) in a table, that can contain characters for which full UTF8 is needed. The schema I'm using is
CREATE TABLE `en_brands` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(191) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL,
`name_encoded` varchar(255) NOT NULL,
`inserted` datetime NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `name` (`name`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT
As you can see, name
is only 191 characters long. When using larger values, then MySQL refuses the creation of UNIQUE KEY name
because such keys can only be 767 bytes long. What are my questions:
name_encoded
to be at least TEXT
to fully store an URL-encoded (in PHP rawurlencode
) UTF8 string? (I think an 255 byte long string could be represented by a 3060 character long URL in worst case - 255 chars x 4 bytes x 3 chars for encoded representation)name_encoded
(I think not, because URL-encoded they should fit into latin
)name
to store at least 255 characters with full UTF8 support and to create an UNIQUE KEY
nevertheless (I'd like to use a collation which allows native language sorting)BTW: I'm using MySQL 5.6 Percona on Debian Wheezy
To answer my question:
has name_encoded to be at least TEXT to fully store an URL-encoded (in PHP rawurlencode) UTF8 string?
for sure, an URL encoded UTF8 string can be up to 3060 characters long, so TEXT
is required
does it matter which collation I use for name_encoded (I think not, because URL-encoded they should fit into latin)
URL encoded string fits into ASCII
which data type and collation should I use for name to store at least 255 characters with full UTF8 support and to create an UNIQUE KEY nevertheless
this is not possible, because of length restriction for text columns in UNIQUE KEY
s
I'm checking for duplicates by executing a SELECT
first, so there is no need for an UNIQUE KEY
, but application has to ensure data integrity