How can UUID be globally unique?


I know for a fact that the collision rate of UUIDs is practically zero, but why is such a low collision rate guaranteed globally?

According to RFC 4122:

The version 4 UUID is meant for generating UUIDs from truly-random or pseudo-random numbers.

If you use a PRNG, there is a seed. So there should be a case where two UUID generators (accidentally) share the same seed. In such a case, what happens?


Solution

  • So there should be a case where two UUID generators (accidentally) share the same seed. In such a case, what happens?

    Nothing special: the same UUID4 will simply be generated more than once. This does not even depend on "bad" randomness, such as a pseudo-random number generator with a deterministic seed. "Bad" randomness just makes collisions more probable; they always have a non-zero probability.
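
    To put rough numbers on "non-zero but tiny", here is a minimal sketch of the standard birthday approximation. It assumes the 122 non-fixed bits of a version 4 UUID are uniformly random; the function name and the 32-bit comparison case (standing in for a generator with very little effective entropy) are illustrative assumptions, not something taken from the RFC.

```python
import math

# A version-4 UUID has 128 bits, 6 of which are fixed (4 version bits, 2 variant bits).
RANDOM_BITS = 122


def collision_probability(n: int, bits: int = RANDOM_BITS) -> float:
    """Approximate probability that at least two of n uniformly random values collide."""
    space = 2 ** bits
    # Birthday approximation: p ~= 1 - exp(-n * (n - 1) / (2 * space)).
    # expm1 keeps precision when the exponent is extremely close to zero.
    return -math.expm1(-n * (n - 1) / (2 * space))


# One billion UUIDs drawn from a good 122-bit source.
print(collision_probability(10**9))

# The same count drawn from a source with only 32 bits of effective entropy
# (hypothetical "bad" randomness).
print(collision_probability(10**9, bits=32))
```

    Under these assumptions the first probability is on the order of 1e-19, while the second is effectively 1. The guarantee is statistical and only as strong as the entropy behind the generator.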

    If I implement a UUID generator that is seeded in a cryptographically unsafe way (e.g. just use the integer 0 instead of /dev/random), is it standard-compliant?

    Technically yes: the RFC does not specify requirements on the pseudo-random numbers used. Under section 6, "Security Considerations", it says (emphasis mine):

    Do not assume that UUIDs are hard to guess; they should not be used as security capabilities (identifiers whose mere possession grants access), for example. A predictable random number source will exacerbate the situation.

    which implies that a predictable random number source still fulfills the requirements of the RFC regarding pseudo-randomness.
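
    The seed-0 scenario can be sketched as follows. The helper make_uuid4_generator is a hypothetical name introduced for illustration; it uses Python's non-cryptographic random.Random on purpose, and relies on the uuid.UUID constructor to set the version and variant bits when version=4 is passed.

```python
import random
import uuid


def make_uuid4_generator(seed):
    """Return a function producing version-4 UUIDs from a seeded, non-cryptographic PRNG."""
    rng = random.Random(seed)

    def next_uuid() -> uuid.UUID:
        # Passing version=4 makes the constructor overwrite the version and
        # variant bits, so the result is a well-formed version-4 UUID.
        return uuid.UUID(int=rng.getrandbits(128), version=4)

    return next_uuid


# Two independent "hosts" that accidentally share the seed 0.
gen_a = make_uuid4_generator(0)
gen_b = make_uuid4_generator(0)

first_a, first_b = gen_a(), gen_b()
print(first_a, first_b)       # identical values
print(first_a == first_b)     # True: a guaranteed collision
print(first_a.version)        # 4: still a well-formed version-4 UUID
```

    Both generators emit the same sequence, yet every value has the layout of a version 4 UUID, which is why such a generator can match the format the RFC describes while being useless for uniqueness across hosts.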

    The same section also says:

    Distributed applications generating UUIDs at a variety of hosts must be willing to rely on the random number source at all hosts. If this is not feasible, the namespace variant should be used.

    which moves the responsibility for how random numbers are generated, and for how probable that makes collisions among the generated UUIDs, away from the RFC and onto implementations. See also this Software Engineering StackExchange answer.
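
    Assuming "the namespace variant" refers to the name-based UUIDs (versions 3 and 5), here is a minimal sketch with Python's uuid.uuid5. The names "example.com" and "example.org" are placeholders.

```python
import uuid

# Name-based UUIDs are derived by hashing a namespace UUID together with a name,
# so they need no random number source at all.
host_a = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")
host_b = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")
other = uuid.uuid5(uuid.NAMESPACE_DNS, "example.org")

print(host_a == host_b)  # True: same namespace and name give the same UUID on every host
print(host_a == other)   # False: a different name gives a different UUID
print(host_a.version)    # 5
```

    Identical results for identical input are intended here: uniqueness now depends on the names being unique within the namespace rather than on the quality of each host's randomness.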

    Personally, I'd expect implementations to use reasonably high-quality randomness. For example, Java's java.util.UUID.randomUUID and CPython's uuid.uuid4 use cryptographically secure randomness.
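
    As a rough sketch of what "cryptographically secure" means in practice, the following builds a version 4 UUID directly from operating-system entropy. It is meant to be approximately equivalent in effect to uuid.uuid4, not a copy of CPython's source; the function name is a hypothetical one introduced here.

```python
import os
import uuid


def uuid4_from_os_entropy() -> uuid.UUID:
    """Build a version-4 UUID from 16 bytes of OS-provided randomness."""
    # os.urandom draws from the operating system's CSPRNG, so there is no
    # application-level seed that two hosts could accidentally share.
    return uuid.UUID(bytes=os.urandom(16), version=4)


a = uuid4_from_os_entropy()
b = uuid4_from_os_entropy()
print(a, b)
print(a.version, uuid.uuid4().version)
```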