Search code examples
c#performanceunit-testingdatetimeguid

How to write an extremely efficient AND testable GUID generator


I want to test a class with unit tests to ensure that it will work correctly in all kinds of circumstances. The class should generate unique results regardless of where (physical machine, vm, container,...) the class is running. It should even generate unique results if run multiple times in the same environment in parallel.

The problem is. The class is static and it's using a DateTime.Now call. If I would make the class non-static and use the TimeProvider struct it would all be nice and testable but... before you start, here is the catch

  1. The class will be called many thousands of times per second. The production machine holding the record at the moment with 151964235 calls per second over a time of 2 Minutes. Static was chosen to not stress the GC.
  2. Every solution provided needs to be at least as fast as the current implementation. This includes the actual runtime of the solution time for object creation and the stress on the GC.
  3. The Solutions needs to be portable and able to run on Windows, Linux and Mac alike.
  4. Other Ideas to make it even faster are of course highly welcome.
  5. The main concern is about DateTime.Now. I will think about UniqueId and Environment.CurrentManagedThreadId later.
namespace HighPerformance.Utils;

using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;
using System.Security.Cryptography;

/// <summary>
///     <see cref="SequentialGuid" /> generates instances of <see cref="Guid" /> that are in ascending order.
/// </summary>
/// <remarks>
///     It is both unique and ordered.
/// </remarks>
public static class SequentialGuid
{
    private const long Rfc4122TicksBase = 499163040000000000; //1582-10-15 00:00:00
    private const long SequentialGuidVersion = 0x6000;
    private const int Variant = 0xC000;
    private const int VariantMask = 0xC000;
    private const long VersionMask = ~0x0FFFL;
    private static readonly MapperAb EmptyMapperAb = new();
    private static readonly MapperC EmptyMapperC;
    private static readonly int Randomizer;
    private static long _lastTicks;
    private static int _sequence;
    private static SpinLock _spinLock;

    static SequentialGuid()
    {
        _spinLock = new SpinLock(false);
        Randomizer = RandomNumberGenerator.GetInt32(int.MaxValue);
        int rnd = RandomNumberGenerator.GetInt32(int.MaxValue);
        rnd ^= Environment.ProcessId;
        EmptyMapperC.UniqueId = (ushort)((rnd & 0xFFFF) ^ (rnd >> 16));
    }

    public static Guid NewGuid()
    {
        var mapperAb = EmptyMapperAb;
        var mapperC = EmptyMapperC;
        mapperAb.Ticks = GetTicksAndSequence(out mapperC.Sequence);

        long ht = mapperAb.Ticks & ~VersionMask;
        mapperAb.Ticks &= VersionMask;
        mapperAb.Ticks <<= 4;
        mapperAb.Ticks |= SequentialGuidVersion | ht;

        var d = (uint)(Randomizer ^ Environment.CurrentManagedThreadId);
        Vector128<byte> vec = Vector128.Create(mapperAb.A, mapperAb.B, mapperC.C, d).AsByte();
        if (BitConverter.IsLittleEndian)
        {
            Vector128<byte> result = Vector128.Shuffle(vec, Vector128.Create((byte)0, 1, 2, 3, 6, 7, 4, 5, 11, 10, 9, 8, 15, 14, 13, 12));
            return Unsafe.As<Vector128<byte>, Guid>(ref result);
        }

        return Unsafe.As<Vector128<byte>, Guid>(ref vec);
    }

    private static long GetTicksAndSequence(out ushort sequence)
    {
        // In the year 5236-03-31 21:21:00 I have to think about a different solution.
        long ticks = DateTime.UtcNow.Ticks - Rfc4122TicksBase;
        var lockTaken = false;
        _spinLock.Enter(ref lockTaken);
        if (ticks > _lastTicks)
        {
            _sequence = 0;
            _lastTicks = ticks;
        }
        else
        {
            if (_sequence == (VariantMask ^ 0xFFFF)) // rollover will happen, so we increase ticks
            {
                _sequence = 0;
                ++_lastTicks;
            }

            ticks = _lastTicks;
        }

        sequence = (ushort)(_sequence++ | Variant);
        if (lockTaken) { _spinLock.Exit(); }

        return ticks;
    }

    [StructLayout(LayoutKind.Explicit)]
    private struct MapperAb(long ticks)
    {
        [FieldOffset(0)] public long Ticks = ticks;
        [FieldOffset(0)] public uint B;
        [FieldOffset(sizeof(uint))] public uint A;
    }

    [StructLayout(LayoutKind.Explicit)]
    private struct MapperC
    {
        [FieldOffset(0)] public uint C;

        // ReSharper disable once MemberHidesStaticFromOuterClass
        [FieldOffset(0)] public ushort UniqueId;
        [FieldOffset(sizeof(ushort))] public ushort Sequence;
    }
}

Update this are my current benchmark results.

<!DOCTYPE html>
<html lang='en'>

<head>
  <meta charset='utf-8' />
  <title>Cobra.Exoda.Matching.Benchmarks.BenchmarkSequentialGuidCreation-20240925-120545</title>

  <style type="text/css">
    table {
      border-collapse: collapse;
      display: block;
      width: 100%;
      overflow: auto;
    }
    
    td,
    th {
      padding: 6px 13px;
      border: 1px solid #ddd;
      text-align: right;
    }
    
    tr {
      background-color: #fff;
      border-top: 1px solid #ccc;
    }
    
    tr:nth-child(even) {
      background: #f8f8f8;
    }
  </style>
</head>

<body>
  <pre><code>
BenchmarkDotNet v0.14.0, Windows 11 (10.0.22631.4169/23H2/2023Update/SunValley3)
AMD Ryzen 7 7800X3D, 1 CPU, 16 logical and 8 physical cores
.NET SDK 9.0.100-rc.1.24452.12
  [Host]            : .NET 9.0.0 (9.0.24.43107), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
  ShortRun-.NET 9.0 : .NET 9.0.0 (9.0.24.43107), X64 RyuJIT AVX-512F+CD+BW+DQ+VL+VBMI
</code></pre>
  <pre><code>Job=ShortRun-.NET 9.0  Runtime=.NET 9.0  IterationCount=3  
LaunchCount=1  WarmupCount=3  
</code></pre>

  <table>
    <thead>
      <tr>
        <th>Method </th>
        <th>Mean</th>
        <th>Error</th>
        <th>StdDev</th>
        <th>Max</th>
        <th>Min</th>
        <th>Ratio</th>
        <th>Allocated</th>
        <th>Alloc Ratio</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td>BenchmarUuidV4</td>
        <td>34.39 ns</td>
        <td>4.925 ns</td>
        <td>0.270 ns</td>
        <td>34.69 ns</td>
        <td>34.18 ns</td>
        <td>0.55</td>
        <td>-</td>
        <td>NA</td>
      </tr>
      <tr>
        <td>BenchmarkUuidV7</td>
        <td>63.09 ns</td>
        <td>13.467 ns</td>
        <td>0.738 ns</td>
        <td>63.94 ns</td>
        <td>62.60 ns</td>
        <td>1.00</td>
        <td>-</td>
        <td>NA</td>
      </tr>
      <tr>
        <td>BenchmarkUuidV7Custom</td>
        <td>30.75 ns</td>
        <td>7.927 ns</td>
        <td>0.434 ns</td>
        <td>31.24 ns</td>
        <td>30.41 ns</td>
        <td>0.49</td>
        <td>-</td>
        <td>NA</td>
      </tr>
    </tbody>
  </table>
</body>

</html>

This is the source code of the Benchmark

namespace Cobra.Exoda.Matching.Benchmarks;

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNetVisualizer;
using JetBrains.Annotations;
using Matching.Utils;

// ReSharper disable once ClassCanBeSealed.Global
[MaxColumn]
[MemoryDiagnoser(false)]
[MinColumn]
[RichHtmlExporter("Benchmark of UUID Creation",
                  ["Job"],
                  ["Mean", "Allocated"],
                  ["Mean", "Min", "Max", "Allocated"],
                  dividerMode: RenderTableDividerMode.SeparateTables,
                  htmlWrapMode: HtmlDocumentWrapMode.RichDataTables)]
[ShortRunJob(RuntimeMoniker.Net90)]
[UsedImplicitly]
public class BenchmarkSequentialGuidCreation
{
    [Benchmark]
    public Guid BenchmarUuidV4() => Guid.NewGuid(); // UUIDv4

    [Benchmark(Baseline = true)]
    public Guid BenchmarkUuidV7() => Guid.CreateVersion7(); // UUIDv7

    [Benchmark]
    public Guid BenchmarkUuidV7Custom() => SequentialGuid.CreateVersion7(); // UUIDv7
}

Update: I did have a look into the UUID v7 in .net9 that was mentioned by @Fildor. However, UUID v7 does pretty much the same I do but is significant slower than my implementation.

Update: I did have a look at the EF Core 8 implementation mentioned by @PanagiotisKanavos. This implementation uses not a real time component, instead it uses the time when the object is initialized as a base for all other Ids created. This can be problematic in a scenario where processes that use this class will be stopped and started again. There is also nothing else in place to ensure uniqueness.


Solution

  • Pass the ticks or even a DateTimeOffset as a parameter to NextGuid instead of using Datetime.UtcNow.Ticks internally. Create a parameterless overload for general use that uses Now as the default timestamp

    public static Guid NewGuid()=>NewGuid(DateTimeOffset.UtcNow);
    
    public static Guid NewGuid(DateTimeOffset timestamp)
    {
    ...
    }
    

    That's what .NET 9's Guid.CreateVersion7 does. You could even copy that code for use in .NET 8 or .NET 6 (for the 2 months of support it has), but note that adding v7 support required changes to Guid.Unix.cs.

    public static Guid CreateVersion7() => CreateVersion7(DateTimeOffset.UtcNow);
    

    The CreateVersion7(DateTimeOffset) implementation uses a v4 GUID whose 48 MSBs are replaced by the Unix timestamp in milliseconds, thus producing a v7 sequential UUID. The comments even discuss the use of native GUID functions vs RandomNumberGenerator (it's slower).

    /// <summary>Creates a new <see cref="Guid" /> according to RFC 9562, following the Version 7 format.</summary>
    /// <param name="timestamp">The date time offset used to determine the Unix Epoch timestamp.</param>
    /// <returns>A new <see cref="Guid" /> according to RFC 9562, following the Version 7 format.</returns>
    /// <exception cref="ArgumentOutOfRangeException"><paramref name="timestamp" /> represents an offset prior to <see cref="DateTimeOffset.UnixEpoch" />.</exception>
    /// <remarks>
    ///     <para>This seeds the rand_a and rand_b sub-fields with random data.</para>
    /// </remarks>
    public static Guid CreateVersion7(DateTimeOffset timestamp)
    {
        // NewGuid uses CoCreateGuid on Windows and Interop.GetCryptographicallySecureRandomBytes on Unix to get
        // cryptographically-secure random bytes. We could use Interop.BCrypt.BCryptGenRandom to generate the random
        // bytes on Windows, as is done in RandomNumberGenerator, but that's measurably slower than using CoCreateGuid.
        // And while CoCreateGuid only generates 122 bits of randomness, the other 6 bits being for the version / variant
        // fields, this method also needs those bits to be non-random, so we can just use NewGuid for efficiency.
        Guid result = NewGuid();
    
        // 2^48 is roughly 8925.5 years, which from the Unix Epoch means we won't
        // overflow until around July of 10,895. So there isn't any need to handle
        // it given that DateTimeOffset.MaxValue is December 31, 9999. However, we
        // can't represent timestamps prior to the Unix Epoch since UUIDv7 explicitly
        // stores a 48-bit unsigned value, so we do need to throw if one is passed in.
    
        long unix_ts_ms = timestamp.ToUnixTimeMilliseconds();
        ArgumentOutOfRangeException.ThrowIfNegative(unix_ts_ms, nameof(timestamp));
    
        Unsafe.AsRef(in result._a) = (int)(unix_ts_ms >> 16);
        Unsafe.AsRef(in result._b) = (short)(unix_ts_ms);
    
        Unsafe.AsRef(in result._c) = (short)((result._c & ~VersionMask) | Version7Value);
        Unsafe.AsRef(in result._d) = (byte)((result._d & ~Variant10xxMask) | Variant10xxValue);
    
        return result;
    }
    

    The GuidTests.CreateVersion7 method uses this to ensure the values are the expected ones. It doesn't try to test for uniqueness, memory consumption.

        [Fact]
        public static void CreateVersion7()
        {
            DateTimeOffset timestamp = DateTimeOffset.UtcNow;
            ReadOnlySpan<char> unix_ts_ms = timestamp.ToUnixTimeMilliseconds().ToString("x12", CultureInfo.InvariantCulture);
    
            Guid guid = Guid.CreateVersion7(timestamp);
            ReadOnlySpan<char> guidString = guid.ToString("", CultureInfo.InvariantCulture);
    
            Assert.Equal(7, guid.Version);
            Assert.True((guid.Variant == 0x8) || (guid.Variant == 0x9) || (guid.Variant == 0xA) || (guid.Variant == 0xB));
    
            Assert.Equal(unix_ts_ms.Slice(0, 8), guidString.Slice(0, 8));
            ...
    

    The unit test just checks that the bytes are correct. To ensure your algorithm works though you need to ensure that two threads won't produce the same value on the same machine. Uniqueness just doesn't exist - UUIDs aren't unique, they just have such a small chance of collision they're treated as unique. Sequential UUIDs have a higher but still very small chance of collision.

    One way you can test your algorithm is to create eg 10, 20 or more tasks/threads that generate UUIDs and add them to a ConcurrentDictionary. If the algorithm is prone to race conditions, you'll get a duplicate key exception after a while.

    var dict = new ConcurrentDictionary<Guid, Guid>();
    
    Parallel.For(0, 1000000, new () { MaxDegreeOfParallelism = 32 }, _ =>
    {
        var guid = Guid.CreateVersion7();
        if (!dict.TryAdd(guid, guid))
        {
            throw new InvalidOperationException("Duplicate GUID!");
        }
    });
    
    

    Benchmarking

    To test performance and allocation behavior, the defacto standard is Anderi Akinshin's BenckmarkDotNet. Instead of just running code multiple times and calculating an average, it runs tests long enough to gather statistically significant results. It will even warn if the results seem to vary too wildly. To account for warmup and cooldown effects, the tests will be run multiple times before it starts collecting measures.

    This example benchmarks Guid.NewGuid() vs Guid.CreateVersion7() and EF8's SequentialGuidValueGenerator, collecting memory statistics as well :

    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Running;
    
    var summary = BenchmarkRunner.Run(typeof(Program).Assembly);
    
    [MemoryDiagnoser]
    public class GuidVsSequential
    {
    
        SequentialGuidValueGenerator _gen;
        public GuidVsSequential()
        {
            _gen = new SequentialGuidValueGenerator();
        }
    
        [Benchmark(Baseline = true)]
        public Guid V4() => Guid.NewGuid();
    
        [Benchmark]
        public Guid Sequential() => Guid.CreateVersion7();
    
        [Benchmark]
        public Guid EF8() => _gen.Next();
    }
    

    The result summary is this :

    // * Summary *
    
    BenchmarkDotNet v0.14.0, Windows 11 (10.0.22635.3936)
    Intel Core i7-10850H CPU 2.70GHz, 1 CPU, 12 logical and 6 physical cores
    .NET SDK 9.0.100-rc.1.24452.12
      [Host]     : .NET 9.0.0 (9.0.24.43107), X64 RyuJIT AVX2
      DefaultJob : .NET 9.0.0 (9.0.24.43107), X64 RyuJIT AVX2
    
    Method Mean Error StdDev Ratio RatioSD Allocated Alloc Ratio
    V4 67.73 ns 1.374 ns 2.987 ns 1.00 0.06 - NA
    Sequential 101.57 ns 2.030 ns 2.711 ns 1.50 0.08 - NA
    EF8 79.48 ns 1.619 ns 3.847 ns 1.18 0.08 - NA

    Interestingly, the EF8 code is a big faster than Guid.CreateVersion7 but doesn't produce valid UUID v7 values, as it doesn't convert to a Unix timestamp and doesn't set the version bytes.