
Using MongoClient to read UUIDs written with Spring Data


I am having trouble reading UUID values written to a MongoDB collection with Spring Data MongoDB. For UUID ca119807-967a-46df-b659-f0e5e4163a28 the value in the database is _id : Binary('30Z6lgeYEcooOhbk5fBZtg==', 3).

This is how I am using Spring Data MongoDB to write UUIDs to the database:

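import java.util.UUID;
import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;

@Document("foos")
class Foo {

    // Stored as the document's _id
    @Id
    private UUID uuid;
    ...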

After fetching the same document with com.mongodb.client.MongoClient, I can read the _id field as an org.bson.types.Binary. This is how I read it and encode its bytes as Base64:

Binary id = doc.get("_id", Binary.class);
String base64String = Base64.encodeBase64String(id.getData());

I am not able to convert this value back to the original UUID. Passing the byte[] to java.util.UUID#nameUUIDFromBytes produces results that are nothing like the original UUID.

How can I convert the byte[] in org.bson.types.Binary to UUID?

I managed to use Studio 3T to read UUID values. For this, I had to change Legacy UUID Encoding to "Legacy Java Encoding". Before I did that, I was seeing the same UUID values as the ones I am getting from java.util.UUID#nameUUIDFromBytes.


Solution

  • BSON binary subtype 3 is the legacy UUID format. The legacy byte ordering was not defined, and so each client language was free to order the bytes in whatever way made sense to that language's implementation.

    The base64 data 30Z6lgeYEcooOhbk5fBZtg== in hex is:

    df46 7a96 0798 11ca 283a 16e4 e5f0 59b6
    

    If you compare that with the original UUID ca119807-967a-46df-b659-f0e5e4163a28, you may notice that the first half of the hex dump:

    df46 7a96 0798 11ca
    

    is the first half of the UUID with its bytes in reverse order:

    ca119807-967a-46df
    

    Likewise for the second half:

    283a 16e4 e5f0 59b6
    b659-f0e5e4163a28
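
The byte reversal described above can be reproduced with the JDK alone. This is only a minimal sketch: the class and method names are made up for illustration, and it assumes each 64-bit half of the UUID is written little-endian, which is what the hex comparison suggests.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;
import java.util.UUID;

public class ReversedHalvesEncode {

    // Hypothetical helper: writes each 64-bit half of the UUID with its bytes reversed.
    static byte[] toReversedHalves(UUID uuid) {
        return ByteBuffer.allocate(16)
                .order(ByteOrder.LITTLE_ENDIAN)           // least significant byte of each long first
                .putLong(uuid.getMostSignificantBits())   // first half: ca119807967a46df
                .putLong(uuid.getLeastSignificantBits())  // second half: b659f0e5e4163a28
                .array();
    }

    public static void main(String[] args) {
        UUID uuid = UUID.fromString("ca119807-967a-46df-b659-f0e5e4163a28");
        // Prints the Base64 form of the 16 bytes, for comparison with the stored value.
        System.out.println(Base64.getEncoder().encodeToString(toReversedHalves(uuid)));
    }
}
```

For the UUID from the question this should print 30Z6lgeYEcooOhbk5fBZtg==, matching the value stored in the database.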
    

    The MongoDB Java driver's org.bson.UuidRepresentation enum hints at the problem. Its values include:

    C_SHARP_LEGACY
    JAVA_LEGACY
    PYTHON_LEGACY
    STANDARD
    

    The peculiar encoding above is the JAVA_LEGACY format: each 64-bit half of the UUID is stored with its bytes reversed.

    For how to specify which representation to use, see the question "spring-boot 2.3.6, how to set UUID representation for mongo?".
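
To answer the literal question of turning the byte[] back into a UUID, the following sketch decodes the legacy Java layout by hand using only the JDK. The class and method names are hypothetical, and it assumes the 16 bytes come from a subtype 3 value written by a Java client. Note that java.util.UUID#nameUUIDFromBytes cannot work here: it builds a name-based UUID from an MD5 hash of its input rather than interpreting the bytes as a UUID.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;
import java.util.UUID;

public class LegacyUuidDecode {

    // Hypothetical helper: interprets 16 bytes as two little-endian longs (legacy Java layout).
    static UUID fromJavaLegacyBytes(byte[] data) {
        if (data.length != 16) {
            throw new IllegalArgumentException("expected 16 bytes, got " + data.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        long mostSigBits = buf.getLong();   // bytes 0..7, reversed
        long leastSigBits = buf.getLong();  // bytes 8..15, reversed
        return new UUID(mostSigBits, leastSigBits);
    }

    public static void main(String[] args) {
        // Stand-in for id.getData() on the org.bson.types.Binary from the question.
        byte[] data = Base64.getDecoder().decode("30Z6lgeYEcooOhbk5fBZtg==");
        System.out.println(fromJavaLegacyBytes(data));
    }
}
```

With the value from the question this should print ca119807-967a-46df-b659-f0e5e4163a28. If the driver is on the classpath, wrapping the data in an org.bson.BsonBinary and calling asUuid(UuidRepresentation.JAVA_LEGACY) is likely the more maintainable route, but check that against the driver version in use.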