Search code examples
javascriptmysqltypescriptmysql2node-mysql

My SQL statement is not comparing unicode emojis properly?


fullEmoji is a property stored in discord.js ParsedEmoji object which should just be a literal emoji '😄'.

The ParsedEmoji is just an interface but I get that from calling parseEmoji() which is a private it's mainly for the discord custom emojis since if the ParsedEmoji.id is not valid I know it's either a unicode or not an emoji and then I just do a regex test with emoji-regex to see if it's a unicode emoji

const fullEmoji: string = parsedEmoji.id ? 
`<:${parsedEmoji.name}:${parsedEmoji.id}>` : parsedEmoji.name;

connPool.query<ResultSetHeader>(`
    DELETE FROM emoji_role_links WHERE messages_reactable_id = ${reactableMsg.id}
    AND emoji = '${fullEmoji}';
`)

When I run this query it ends up deleting all emoji-link rows and I'm assuming it's because the AND emoji = '${fullEmoji}' is not properly working.

The charset I am using is utf8mb4 and collation utf8mb4_0900_ai_ci

Listing the Rows below:


id, emoji, role_id, messages_reactable_id

'3', '😄', '1083419092192600086', '2'

'4', '🧉', '1099145715982278696', '2'


SHOW CREATE TABLE:

Table, Create Table

CREATE TABLE emoji_role_links (
 id int unsigned NOT NULL AUTO_INCREMENT,
 emoji varchar(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
 role_id varchar(22) COLLATE utf8mb4_unicode_ci NOT NULL,
 messages_reactable_id int unsigned DEFAULT NULL,
 PRIMARY KEY (id),
 UNIQUE KEY role_id_UNIQUE (role_id),
 KEY fk_messages_reactable_id_idx (messages_reactable_id),
 CONSTRAINT fk_messages_reactable_id FOREIGN KEY (messages_reactable_id) REFERENCES messages_reactable (id)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

So this is showing the collation to be utf8mb4_unicode_ci

I don't know what more to try I have tried a little researching and couldn't find much directly related to this I don't know what exact problem I have here since this is all in the query itself

one thing I would try if there is no way to get this to work is to escape the emoji's and unescaped them again later but I would have to rewrite a lot of code so I'm saving that as a last resort


Solution

  • SHOW VARIABLES LIKE 'coll%'; to see if the connection the test will be performed with. Note that the column is using utf8mb4_unicode_ci.

    Note that 'ai_ci' treats those emogi as different, but the [vary old] 'unicode' collation treats them as equal.

    mysql> SELECT '😄' = '🧉' COLLATE utf8mb4_0900_ai_ci;
    +--------------------------------------+
    | '?' = '?' COLLATE utf8mb4_0900_ai_ci |
    +--------------------------------------+
    |                                    0 |
    +--------------------------------------+
    1 row in set (0.00 sec)
    
    mysql> SELECT '😄' = '🧉' COLLATE utf8mb4_unicode_ci;
    +--------------------------------------+
    | '?' = '?' COLLATE utf8mb4_unicode_ci |
    +--------------------------------------+
    |                                    1 |
    +--------------------------------------+
    1 row in set (0.00 sec)
    

    unicode refers to Unicode standard 4.0; 0900 refers to 9.0.