Search code examples
postgresqltypesuuiddicom

PostgreSQL: Convert UUID into OID


Is there a function in PostgreSQL to convert a UUID (RFC 4122) into a OID (ISO 8824) ?

The value after "2.25." is the straight decimal encoding of the UUID as an integer. It MUST be a direct decimal encoding of the single integer, all 128 bits. It must not be broken up into parts.

For example the function would take UUID "f81d4fae-7dec-11d0-a765-00a0c91e6bf6" and return the OID "2.25.329800735698586629295641978511506172918".

References:


Solution

  • Ideally, we would have an unsigned 16-byte integer (uint16) and a registered cast uuid --> uint16 (which may or may not be binary compatible internally, making it super-cheap). None of this is implemented in stock PostgreSQL.

    You might look to the (unofficial!) additional module pg_bignum or Evan Caroll's (even more unofficial) fork to accept hex input directly. (Disclaimer: untested.)

    These modules are not available on most hosted installations. Here is a poor man's implementation with built-in tools of standard PostgreSQL:

    CREATE OR REPLACE FUNCTION f_uuid2oid(_uuid uuid)
      RETURNS text
      LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE AS
    $func$
    SELECT '2.25.' ||
           ('x0'      ||       left(hex, 15)     )::bit(64)::int8 * numeric '295147905179352825856' -- 1.
         + ('x0'      || right(left(hex, 30), 15))::bit(64)::int8 * numeric '256'                   -- 2.
         + ('x000000' ||       right(hex,  2)    )::bit(32)::int4                                   -- 3.
    FROM   translate(_uuid::text, '-', '') t(hex)
    $func$;
    
    COMMENT ON FUNCTION public.f_uuid2oid(uuid) IS '
    Convert UUID (RFC 4122) into a OID (ISO 8824) ?
    First, get text representation of UUID without hyphens:  
      translate(_uuid::text, '-', '')`
    Then:
    1.
    - take the first 15 hex digits
    - prefix with x0
    - cast to bit(64)
    - cast to int8
    - multiply with numeric 295147905179352825856 (= 2^68), which is the same as left-shift the binary representation by 68 bits.
      68 bits because: 1 hex digit represents 4 bit; uuid has 128; 128 - 15*4 = 68; so shift by 68
    2.  
    - take the next 15 hex digits
    - prefix with x0
    - cast to bit(64)
    - cast to int8
    - multiply with numeric 256 (= 2^8) shift by the remaining 2 hex digits / 8 bit
    3. 
    - take the remaining, rightmost 2 hex digits
    - prefix with x000000
    - cast to bit(32)
    - cast to int4
    
    Add 1. + 2. + 3., convert to text, prefix "2.25." Voila.
    No leading zeros, according to https://www.rfc-editor.org/rfc/rfc3061
    
    More explanation:
    - https://stackoverflow.com/questions/8316164/convert-hex-in-text-representation-to-decimal-number/8335376#8335376
    - https://dba.stackexchange.com/questions/115271/what-is-the-optimal-data-type-for-an-md5-field/115316#115316
    ';
    

    Call:

    SELECT f_uuid2oid('f81d4fae-7dec-11d0-a765-00a0c91e6bf6');
    

    Produces the requested OID 2.25.329800735698586629295641978511506172918

    db<>fiddle here

    No leading zeros, according to https://www.rfc-editor.org/rfc/rfc3061.
    I did not read through all the various standards: http://www.oid-info.com/faq.htm#1

    I optimized performance to the best of my knowledge, leveraging the built-in (very fast) binary coercible cast from bit(n) to bigint /integer. To understand what I am doing there, first read:

    Postgres integer types are signed. So - to avoid overflowing to negative numbers - we cannot use the full 64 bit (8 byte / 16 hex digits) and we have to convert the 32 hex digits in three junks instead of just two. I arbitrarily slice it up as 15 + 15 + 2 hex digits.

    Using left() and right() as that's typically a tiny bit faster than substring().

    Also consider the comment to the function.