v5 UUID: What is the difference between the UUID of the namespace and the name?


I am trying to generate a v5 UUID using the following function from RFC 4122 (http://www.ietf.org/rfc/rfc4122.txt):

/* uuid_create_sha1_from_name -- create a version 5 (SHA-1) UUID
   using a "name" from a "name space" */
void uuid_create_sha1_from_name(
    uuid_t *uuid,         /* resulting UUID */
    uuid_t nsid,          /* UUID of the namespace */
    void *name,           /* the name from which to generate a UUID */
    int namelen           /* the length of the name */
);

I have read the documentation, but I am still not clear on the difference between the 2nd (uuid_t nsid) and 3rd (void *name) parameters of the function above.

Could someone explain this with an example?

I would also like to understand what the definition below from RFC 4122 means, and whether it has any significance for the 2nd parameter.

/* Name string is a URL */
uuid_t NameSpace_URL = { /* 6ba7b811-9dad-11d1-80b4-00c04fd430c8 */
    0x6ba7b811,
    0x9dad,
    0x11d1,
    0x80, 0xb4, 0x00, 0xc0, 0x4f, 0xd4, 0x30, 0xc8
};

Solution

  • The name (the 3rd parameter, name) is the key that is unique to whatever thing you're generating UUIDs for.

    The namespace (the 2nd parameter, nsid) is a constant UUID that identifies the context in which you're generating UUIDs.

    If you look at the RFC, you'll see section 4.3 defines these characteristics of a name-based UUID:

    • The UUIDs generated at different times from the same name in the same namespace MUST be equal.
    • The UUIDs generated from two different names in the same namespace should be different (with very high probability).
    • The UUIDs generated from the same name in two different namespaces should be different (with very high probability).
    • If two UUIDs that were generated from names are equal, then they were generated from the same name in the same namespace (with very high probability).
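    The properties listed above can be checked with a short sketch. It uses Python's standard uuid module rather than the RFC's C sample code, purely for brevity: uuid.uuid5(namespace, name) takes the same two inputs as nsid and name. The example URLs are made up for illustration. The constant 6ba7b811-9dad-11d1-80b4-00c04fd430c8 is the NameSpace_URL value quoted in the question. It is one of the namespaces the RFC predefines, so it is a valid thing to pass as the 2nd parameter when your names are URLs.

```python
import uuid

# Predefined "name string is a URL" namespace (NameSpace_URL in the question).
NS_URL = uuid.UUID("6ba7b811-9dad-11d1-80b4-00c04fd430c8")

# Hypothetical names, chosen only for this illustration.
a = uuid.uuid5(NS_URL, "http://example.com/a")
a_again = uuid.uuid5(NS_URL, "http://example.com/a")
b = uuid.uuid5(NS_URL, "http://example.com/b")
a_other_ns = uuid.uuid5(uuid.NAMESPACE_DNS, "http://example.com/a")

print(NS_URL == uuid.NAMESPACE_URL)  # the module ships the same constant
print(a == a_again)      # same name, same namespace: always equal
print(a != b)            # different names, same namespace
print(a != a_other_ns)   # same name, different namespaces
print(a.version)         # version field of a name-based SHA-1 UUID
```

    Every comparison prints True and the last line prints 5.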

    These are all important properties to have in a name-based UUID. For example, let's say you and I are implementing HR systems for our respective companies. The systems are completely unrelated to one another, but because UUIDs are awesome, we're both using name-based UUIDs to identify employees. And because it's a rather obvious thing to do, we use employee names as the name from which the UUIDs are generated.

    Without namespaces we would both create the same UUID for anyone named "John Smith"... but that'd be Bad (tm) since our systems are unrelated and we're dealing with different John Smiths. "So what," you say! ... but what happens when our companies merge next year and we have to combine our HR databases? Well, at that point we find ourselves merging database records that have the same ID and pretty soon the paychecks for every John Smith in the company are crossing in the mail and HR is handing us our pink slips.

    To prevent this sort of thing from happening, the RFC specifies that we each independently choose a UUID to use as our namespace. Namespaces will typically be fixed and associated with a specific system in which UUIDs are being generated, so we'll probably just hardcode this as a constant in some configuration file somewhere. Thus, within my namespace (e.g. 87c9cdf7-101d-4c05-a89d-c7aaff3a3fcf) I can trust that the UUID I generate for John Smith will always be the same. But I can also count on it being different from any UUID you create, since you'll be using a different namespace. And so if/when our systems merge, our employee IDs won't collide.
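    Here is a minimal sketch of the HR scenario, again in Python for brevity. The first namespace constant is the example value from the paragraph above. The second constant and the employee_id helper are made up for illustration and are not part of any real system.

```python
import uuid

# Each system hardcodes its own namespace UUID once.
MY_COMPANY_NS = uuid.UUID("87c9cdf7-101d-4c05-a89d-c7aaff3a3fcf")
# Hypothetical namespace for the other company (an arbitrary made-up UUID).
YOUR_COMPANY_NS = uuid.UUID("3f2b6c1e-5d4a-4e8b-9c7f-0a1b2c3d4e5f")


def employee_id(namespace: uuid.UUID, full_name: str) -> uuid.UUID:
    """Hypothetical helper: derive a v5 UUID for an employee name."""
    return uuid.uuid5(namespace, full_name)


mine = employee_id(MY_COMPANY_NS, "John Smith")
yours = employee_id(YOUR_COMPANY_NS, "John Smith")

print(mine == employee_id(MY_COMPANY_NS, "John Smith"))  # stable within my system
print(mine != yours)  # the two John Smiths get different IDs
```

    Both lines print True: the ID is reproducible inside one system, and the same name in the other system maps to a different UUID.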