Search code examples
mysqlulid

Append a column with an ULID to an existing MySQL table


I'm trying to append a column with an ULID to an existing MySQL table. I know how to do it with a function and an update query:

delimiter //
DROP FUNCTION IF EXISTS ulid//
CREATE FUNCTION ulid () RETURNS CHAR(26) DETERMINISTIC
BEGIN
DECLARE s_hex CHAR(32);
SET s_hex = LPAD(HEX(CONCAT(UNHEX(CONV(ROUND(UNIX_TIMESTAMP(CURTIME(4))*1000), 10, 16)), RANDOM_BYTES(10))), 32, '0');
RETURN REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(CONCAT(LPAD(CONV(SUBSTRING(s_hex, 1, 2), 16, 32), 2, '0'), LPAD(CONV(SUBSTRING(s_hex, 3, 15), 16, 32), 12, '0'), LPAD(CONV(SUBSTRING(s_hex, 18, 15), 16, 32), 12, '0')), 'V', 'Z'), 'U', 'Y'), 'T', 'X'), 'S', 'W'), 'R', 'V'), 'Q', 'T'), 'P', 'S'), 'O', 'R'), 'N', 'Q'), 'M', 'P'), 'L', 'N'), 'K', 'M'), 'J', 'K'), 'I', 'J');
END//
delimiter;

UPDATE mytable SET new_id=(SELECT ulid());

However, I'd need to do it with a single query. Is it even possible?


Solution

  • Although you are using the value for s_hex 3 times in your final expression, and even if it contains a random value, you can just literally reuse the formula for s_hex 3 times in your final formula:

    UPDATE mytable SET new_id = REPLACE(REPLACE(REPLACE( ...
      LPAD(CONV(SUBSTRING(  LPAD(HEX(..., 32, '0')  , 1, 2), 16, 32), 2, '0'),
      LPAD(CONV(SUBSTRING(  LPAD(HEX(..., 32, '0')  , 3, 15), 16, 32), 12, '0'), 
      LPAD(CONV(SUBSTRING(  LPAD(HEX(..., 32, '0')  , 18, 15), 16, 32), 12, '0')
    ), 'V', 'Z')..., 'I', 'J');
    

    The random_bytes()-function will return different values for the 3 parts, but since there is no overlap, and all random values are valid, the whole result is still valid (e.g. if substring 2 uses the first 2 letters of a random string ABCDEF and substring 3 uses the last 4 letters of a random string 123456, it is the same as if the original random string had been AB3456).

    The final expression is long and not nice to look at, but works. If you want (and have the time), you can simplify it further, e.g. SUBSTRING(s_hex, 1, 2) is 01 for the next 10 years, and 02 after that, so you do not need to use the full expression for that.