Is there a function in PostgreSQL to convert a UUID (RFC 4122) into a OID (ISO 8824) ?
The value after "2.25." is the straight decimal encoding of the UUID as an integer. It MUST be a direct decimal encoding of the single integer, all 128 bits. It must not be broken up into parts.
For example the function would take UUID "f81d4fae-7dec-11d0-a765-00a0c91e6bf6"
and return the OID "2.25.329800735698586629295641978511506172918"
.
References:
Ideally, we would have an unsigned 16-byte integer (uint16
) and a registered cast uuid --> uint16
(which may or may not be binary compatible internally, making it super-cheap). None of this is implemented in stock PostgreSQL.
You might look to the (unofficial!) additional module pg_bignum
or Evan Caroll's (even more unofficial) fork to accept hex input directly. (Disclaimer: untested.)
These modules are not available on most hosted installations. Here is a poor man's implementation with built-in tools of standard PostgreSQL:
CREATE OR REPLACE FUNCTION f_uuid2oid(_uuid uuid)
RETURNS text
LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE AS
$func$
SELECT '2.25.' ||
('x0' || left(hex, 15) )::bit(64)::int8 * numeric '295147905179352825856' -- 1.
+ ('x0' || right(left(hex, 30), 15))::bit(64)::int8 * numeric '256' -- 2.
+ ('x000000' || right(hex, 2) )::bit(32)::int4 -- 3.
FROM translate(_uuid::text, '-', '') t(hex)
$func$;
COMMENT ON FUNCTION public.f_uuid2oid(uuid) IS '
Convert UUID (RFC 4122) into a OID (ISO 8824) ?
First, get text representation of UUID without hyphens:
translate(_uuid::text, '-', '')`
Then:
1.
- take the first 15 hex digits
- prefix with x0
- cast to bit(64)
- cast to int8
- multiply with numeric 295147905179352825856 (= 2^68), which is the same as left-shift the binary representation by 68 bits.
68 bits because: 1 hex digit represents 4 bit; uuid has 128; 128 - 15*4 = 68; so shift by 68
2.
- take the next 15 hex digits
- prefix with x0
- cast to bit(64)
- cast to int8
- multiply with numeric 256 (= 2^8) shift by the remaining 2 hex digits / 8 bit
3.
- take the remaining, rightmost 2 hex digits
- prefix with x000000
- cast to bit(32)
- cast to int4
Add 1. + 2. + 3., convert to text, prefix "2.25." Voila.
No leading zeros, according to https://www.rfc-editor.org/rfc/rfc3061
More explanation:
- https://stackoverflow.com/questions/8316164/convert-hex-in-text-representation-to-decimal-number/8335376#8335376
- https://dba.stackexchange.com/questions/115271/what-is-the-optimal-data-type-for-an-md5-field/115316#115316
';
Call:
SELECT f_uuid2oid('f81d4fae-7dec-11d0-a765-00a0c91e6bf6');
Produces the requested OID 2.25.329800735698586629295641978511506172918
db<>fiddle here
No leading zeros, according to https://www.rfc-editor.org/rfc/rfc3061.
I did not read through all the various standards: http://www.oid-info.com/faq.htm#1
I optimized performance to the best of my knowledge, leveraging the built-in (very fast) binary coercible cast from bit(n)
to bigint
/integer
. To understand what I am doing there, first read:
Postgres integer types are signed. So - to avoid overflowing to negative numbers - we cannot use the full 64 bit (8 byte / 16 hex digits) and we have to convert the 32 hex digits in three junks instead of just two. I arbitrarily slice it up as 15 + 15 + 2 hex digits.
Using left()
and right()
as that's typically a tiny bit faster than substring()
.
Also consider the comment to the function.