How can I convert a UUID to a string using a custom character set in Ruby?


I want to create a valid IFC GUID (IfcGloballyUniqueId) according to the specification here: http://www.buildingsmart-tech.org/ifc/IFC2x3/TC1/html/ifcutilityresource/lexical/ifcgloballyuniqueid.htm

It's basically a UUID or GUID (128 bit) mapped to a set of 22 characters to limit storage space in a text file.

I currently have this workaround, but it's merely an approximation:

guid = '';22.times{|i|guid<<'0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_$'[rand(64)]}

It seems best to use Ruby's SecureRandom to generate a 128-bit UUID, as in this example (https://ruby-doc.org/stdlib-2.3.0/libdoc/securerandom/rdoc/SecureRandom.html):

SecureRandom.uuid #=> "2d931510-d99f-494a-8c67-87feb05e1594"

This UUID needs to be mapped to a string with a length of 22 characters according to this format:

           1         2         3         4         5         6 
 0123456789012345678901234567890123456789012345678901234567890123
"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_$";

I don't understand this exactly. Should the 32-character hex number be converted to a 128-bit binary number and then divided into 22 groups of 6 bits (except for one group that gets the remaining 2 bits), each of which can be converted to a decimal number from 0 to 63? Would each number then be replaced by the corresponding character from the conversion table?

I hope someone can verify if I'm on the right track here.

And if I am, is there a computationally faster way in Ruby to convert the 128-bit number to the 22 values from 0 to 63 than using all these separate conversions?
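
The grouping asked about above can be sketched step by step as follows. This is only an illustration under assumptions: the helper `uuid_to_ifc_chars` and the constant `IFC_CHARS` are names made up for this example, and it assumes the 2-bit remainder forms the first group (so the first character is always 0 to 3), which should be checked against the specification.

```ruby
# Character table from the question (index 0..63).
IFC_CHARS = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_$'

# Hypothetical helper: split a UUID into one 2-bit group followed by
# twenty-one 6-bit groups, then look each group up in IFC_CHARS.
def uuid_to_ifc_chars(uuid)
  # 32 hex digits -> integer -> string of exactly 128 '0'/'1' characters
  bits = uuid.delete('-').to_i(16).to_s(2).rjust(128, '0')

  # Assumption: the short group comes first (128 = 2 + 21 * 6)
  groups = [bits[0, 2]] + bits[2..-1].scan(/.{6}/)

  # Each group is a number from 0 to 63 (the first one from 0 to 3)
  groups.map { |group| IFC_CHARS[group.to_i(2)] }.join
end

puts uuid_to_ifc_chars('2d931510-d99f-494a-8c67-87feb05e1594')
```

The first byte of the sample UUID is 0x2d (binary 00101101), so the result begins with `0` (bits 00) followed by `j` (bits 101101, which is 45).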


Edit: For anyone having the same problem, this is my solution for now:

require 'securerandom'

# possible characters in GUID
guid64 = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_$'
guid = ""

# SecureRandom.uuid: creates a 128 bit UUID hex string
# tr('-', ''): removes the dashes from the hex string
# pack('H*'): converts the hex string to a binary number (high nibble first) (?) is this correct?
#   This reverses the number so we end up with the leftover bits on the end, which helps with chopping the string into pieces.
#   It needs to be reversed again to end up with a string in the original order.
# unpack('b*'): converts the binary number to a bit string (128 0's and 1's) and places it into an array
# [0]: gets the first (and only) value from the array
# to_s.scan(/.{1,6}/m): chops the string into pieces 6 characters(bits) with the leftover on the end.

[SecureRandom.uuid.tr('-', '')].pack('H*').unpack('b*')[0].to_s.scan(/.{1,6}/m).each do |num|

  # take the number (0 - 63) and find the matching character in guid64, add the found character to the guid string
  guid << guid64[num.to_i(2)]
end
guid.reverse

Solution

  • Base64 encoding is pretty close to what you want here, but the mappings are different. No big deal, you can fix that:

    require 'securerandom'
    require 'base64'
    
    # Define the two mappings here, side-by-side
    BASE64 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'
    IFCB64 = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_$'
    
    def ifcb64(hex)
      # Convert from hex to binary, then from binary to Base64
      # Trim off the == padding, then convert mappings with `tr`
      Base64.encode64([ hex.tr('-', '') ].pack('H*')).gsub(/\=*\n/, '').tr(BASE64, IFCB64)
    end
    
    ifcb64(SecureRandom.uuid)
    # => "fa9P7E3qJEc1tPxgUuPZHm"
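
One thing to check with the Base64 route is where the two leftover bits end up: Base64 pads at the end, so the last character carries only 2 bits of data. If the IFC encoding instead expects the short group first (an assumption here, to be verified against the specification), a plain integer conversion is an alternative. The sketch below uses hypothetical helper names (`uuid_to_ifc_guid`, `ifc_guid_to_uuid`) and a hypothetical constant `IFC_ALPHABET`; the inverse function is included only to check that no bits are lost.

```ruby
require 'securerandom'

IFC_ALPHABET = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_$'

# Hypothetical helper: treat the UUID as one 128-bit integer and write it
# as 22 base-64 digits, most significant digit first. The first digit
# holds only the top 2 bits, so it is always in 0..3.
def uuid_to_ifc_guid(uuid)
  value = uuid.delete('-').to_i(16)
  digits = Array.new(22) { |i| (value >> (6 * (21 - i))) & 0x3f }
  digits.map { |digit| IFC_ALPHABET[digit] }.join
end

# Hypothetical inverse: rebuild the integer 6 bits at a time and
# format it back into the usual 8-4-4-4-12 UUID layout.
def ifc_guid_to_uuid(guid)
  value = guid.each_char.reduce(0) { |acc, ch| (acc << 6) | IFC_ALPHABET.index(ch) }
  hex = value.to_s(16).rjust(32, '0')
  [hex[0, 8], hex[8, 4], hex[12, 4], hex[16, 4], hex[20, 12]].join('-')
end

uuid = SecureRandom.uuid
guid = uuid_to_ifc_guid(uuid)
puts guid
puts ifc_guid_to_uuid(guid) == uuid
```

For a UUID consisting only of f's, this sketch returns `3` followed by twenty-one `$` characters, whereas the Base64-and-`tr` approach returns twenty-one `$` characters followed by `m`, which shows the difference in where the short group is placed.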