I want to test a class with unit tests to ensure that it will work correctly in all kinds of circumstances. The class should generate unique results regardless of where (physical machine, vm, container,...) the class is running. It should even generate unique results if run multiple times in the same environment in parallel.
The problem is. The class is static and it's using a DateTime.Now call. If I would make the class non-static and use the TimeProvider struct it would all be nice and testable but... before you start, here is the catch
namespace My.Utils;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;
using System.Security.Cryptography;
using System.Threading;
/// <summary>
/// <see cref="SequentialGuid" /> generates instances of <see cref="Guid" /> that are in ascending order.
/// </summary>
/// <remarks>
/// It is both unique and ordered.
/// </remarks>
public static class SequentialGuid
{
private const uint GuidVersion8 = 0x8000;
private const uint GuidVersion7 = 0x7000;
private const uint Variant = 0x80000000;
private const uint VariantMask = 0xC0000000;
private const uint SequenceRollover = 0x1000;
private static readonly MapperAb EmptyMapperAb = new();
private static readonly MapperCd BaseMapperCd;
private static readonly uint C;
private static long _lastMilliseconds;
private static uint _sequence;
private static SpinLock _spinLock;
static SequentialGuid()
{
_spinLock = new SpinLock(false);
int rnd = RandomNumberGenerator.GetInt32(int.MaxValue);
rnd ^= Environment.MachineName.GetHashCode();
BaseMapperCd.C = (uint)rnd;
BaseMapperCd.C &= ~VariantMask;
BaseMapperCd.C |= Variant;
BaseMapperCd.D = (uint)(Environment.ProcessId ^ rnd);
C = (uint)rnd;
C &= ~VariantMask;
C |= Variant;
}
public static Guid CreateVersion7()
{
return CreateVersion7(DateTimeOffset.UtcNow);
}
public static Guid CreateVersion7(in DateTimeOffset timestamp)
{
var mapperAb = EmptyMapperAb;
GetTicksAndSequence(timestamp, ref mapperAb);
mapperAb.B |= GuidVersion7;
var d = (uint)RandomNumberGenerator.GetInt32(int.MaxValue);
Vector128<byte> vec = Vector128.Create(mapperAb.A, mapperAb.B, C, d).AsByte();
if (BitConverter.IsLittleEndian)
{
Vector128<byte> result = Vector128.Shuffle(vec, Vector128.Create((byte)0, 1, 2, 3, 6, 7, 4, 5, 11, 10, 9, 8, 15, 14, 13, 12));
return Unsafe.As<Vector128<byte>, Guid>(ref result);
}
return Unsafe.As<Vector128<byte>, Guid>(ref vec);
}
public static Guid CreateVersion8()
{
return CreateVersion8(DateTimeOffset.UtcNow);
}
public static Guid CreateVersion8(in DateTimeOffset timestamp)
{
var mapperAb = EmptyMapperAb;
GetTicksAndSequence(timestamp, ref mapperAb);
mapperAb.B |= GuidVersion8;
Vector128<byte> vec = Vector128.Create(mapperAb.A, mapperAb.B, BaseMapperCd.C, BaseMapperCd.D).AsByte();
if (BitConverter.IsLittleEndian)
{
Vector128<byte> result = Vector128.Shuffle(vec, Vector128.Create((byte)0, 1, 2, 3, 6, 7, 4, 5, 11, 10, 9, 8, 15, 14, 13, 12));
return Unsafe.As<Vector128<byte>, Guid>(ref result);
}
return Unsafe.As<Vector128<byte>, Guid>(ref vec);
}
private static void GetTicksAndSequence(in DateTimeOffset timestamp, ref MapperAb mapperAb)
{
mapperAb.Ticks = timestamp.ToUnixTimeMilliseconds();
ArgumentOutOfRangeException.ThrowIfNegative(mapperAb.Ticks, nameof(timestamp));
var lockTaken = false;
_spinLock.Enter(ref lockTaken);
if (mapperAb.Ticks > _lastMilliseconds)
{
_sequence = 0;
_lastMilliseconds = mapperAb.Ticks;
}
else
{
if (_sequence == SequenceRollover) // rollover will happen, so we increase ticks
{
_sequence = 0;
++_lastMilliseconds;
}
mapperAb.Ticks = _lastMilliseconds;
}
uint b = _sequence++;
if (lockTaken) { _spinLock.Exit(); }
mapperAb.Ticks <<= 16;
mapperAb.B |= b;
}
[StructLayout(LayoutKind.Explicit)]
private struct MapperAb
{
[FieldOffset(0)] public long Ticks;
[FieldOffset(0)] public uint B;
[FieldOffset(sizeof(uint))] public uint A;
}
[StructLayout(LayoutKind.Explicit)]
private struct MapperCd
{
[FieldOffset(0)] public uint D;
[FieldOffset(0)] public ushort ProcessId;
[FieldOffset(sizeof(ushort))] public byte ThreadId;
[FieldOffset(sizeof(ushort) + sizeof(byte))] public byte TaskId;
[FieldOffset(sizeof(uint))] public uint C;
}
}
Update this are my current benchmark results.
This is the source code of the Benchmark
namespace Guid.Benchmarks;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNetVisualizer;
using JetBrains.Annotations;
using Matching.Utils;
// ReSharper disable once ClassCanBeSealed.Global
[MaxColumn]
[MemoryDiagnoser(false)]
[MinColumn]
[RichHtmlExporter("Benchmark of UUID Creation",
["Job"],
["Mean", "Allocated"],
["Mean", "Min", "Max", "Allocated"],
dividerMode: RenderTableDividerMode.SeparateTables,
htmlWrapMode: HtmlDocumentWrapMode.RichDataTables)]
[ShortRunJob(RuntimeMoniker.Net90)]
[UsedImplicitly]
public class BenchmarkSequentialGuidCreation
{
[Benchmark]
public Guid BenchmarUuidV4() => Guid.NewGuid(); // UUIDv4
[Benchmark(Baseline = true)]
public Guid BenchmarkUuidV7() => Guid.CreateVersion7(); // UUIDv7
[Benchmark]
public Guid BenchmarkUuidV7Custom() => SequentialGuid.CreateVersion7(); // UUIDv7
[Benchmark]
public Guid BenchmarkUuidV8Custom() => SequentialGuid.CreateVersion8(); // UUIDv7
}
Update: I did have a look into the UUID v7 in .net9 that was mentioned by @Fildor. However, UUID v7 does pretty much the same I do but is significant slower than my implementation.
Update: I did have a look at the EF Core 8 implementation mentioned by @PanagiotisKanavos. This implementation uses not a real time component, instead it uses the time when the object is initialized as a base for all other Ids created. This can be problematic in a scenario where processes that use this class will be stopped and started again. There is also nothing else in place to ensure uniqueness.
Update: I did update the code with my final implementation. This version is Testable in unit tests and follows the implementation of Guid.CreateVersion7. I also create a CreateVersion8 that may be predictable but also is twice as fast.
Pass the ticks or even a DateTimeOffset as a parameter to NextGuid
instead of using Datetime.UtcNow.Ticks
internally. Create a parameterless overload for general use that uses Now
as the default timestamp
public static Guid NewGuid()=>NewGuid(DateTimeOffset.UtcNow);
public static Guid NewGuid(DateTimeOffset timestamp)
{
...
}
That's what .NET 9's Guid.CreateVersion7 does. You could even copy that code for use in .NET 8 or .NET 6 (for the 2 months of support it has), but note that adding v7 support required changes to Guid.Unix.cs.
public static Guid CreateVersion7() => CreateVersion7(DateTimeOffset.UtcNow);
The CreateVersion7(DateTimeOffset)
implementation uses a v4 GUID whose 48 MSBs are replaced by the Unix timestamp in milliseconds, thus producing a v7 sequential UUID. The comments even discuss the use of native GUID functions vs RandomNumberGenerator (it's slower).
/// <summary>Creates a new <see cref="Guid" /> according to RFC 9562, following the Version 7 format.</summary>
/// <param name="timestamp">The date time offset used to determine the Unix Epoch timestamp.</param>
/// <returns>A new <see cref="Guid" /> according to RFC 9562, following the Version 7 format.</returns>
/// <exception cref="ArgumentOutOfRangeException"><paramref name="timestamp" /> represents an offset prior to <see cref="DateTimeOffset.UnixEpoch" />.</exception>
/// <remarks>
/// <para>This seeds the rand_a and rand_b sub-fields with random data.</para>
/// </remarks>
public static Guid CreateVersion7(DateTimeOffset timestamp)
{
// NewGuid uses CoCreateGuid on Windows and Interop.GetCryptographicallySecureRandomBytes on Unix to get
// cryptographically-secure random bytes. We could use Interop.BCrypt.BCryptGenRandom to generate the random
// bytes on Windows, as is done in RandomNumberGenerator, but that's measurably slower than using CoCreateGuid.
// And while CoCreateGuid only generates 122 bits of randomness, the other 6 bits being for the version / variant
// fields, this method also needs those bits to be non-random, so we can just use NewGuid for efficiency.
Guid result = NewGuid();
// 2^48 is roughly 8925.5 years, which from the Unix Epoch means we won't
// overflow until around July of 10,895. So there isn't any need to handle
// it given that DateTimeOffset.MaxValue is December 31, 9999. However, we
// can't represent timestamps prior to the Unix Epoch since UUIDv7 explicitly
// stores a 48-bit unsigned value, so we do need to throw if one is passed in.
long unix_ts_ms = timestamp.ToUnixTimeMilliseconds();
ArgumentOutOfRangeException.ThrowIfNegative(unix_ts_ms, nameof(timestamp));
Unsafe.AsRef(in result._a) = (int)(unix_ts_ms >> 16);
Unsafe.AsRef(in result._b) = (short)(unix_ts_ms);
Unsafe.AsRef(in result._c) = (short)((result._c & ~VersionMask) | Version7Value);
Unsafe.AsRef(in result._d) = (byte)((result._d & ~Variant10xxMask) | Variant10xxValue);
return result;
}
The GuidTests.CreateVersion7 method uses this to ensure the values are the expected ones. It doesn't try to test for uniqueness, memory consumption.
[Fact]
public static void CreateVersion7()
{
DateTimeOffset timestamp = DateTimeOffset.UtcNow;
ReadOnlySpan<char> unix_ts_ms = timestamp.ToUnixTimeMilliseconds().ToString("x12", CultureInfo.InvariantCulture);
Guid guid = Guid.CreateVersion7(timestamp);
ReadOnlySpan<char> guidString = guid.ToString("", CultureInfo.InvariantCulture);
Assert.Equal(7, guid.Version);
Assert.True((guid.Variant == 0x8) || (guid.Variant == 0x9) || (guid.Variant == 0xA) || (guid.Variant == 0xB));
Assert.Equal(unix_ts_ms.Slice(0, 8), guidString.Slice(0, 8));
...
The unit test just checks that the bytes are correct. To ensure your algorithm works though you need to ensure that two threads won't produce the same value on the same machine. Uniqueness just doesn't exist - UUIDs aren't unique, they just have such a small chance of collision they're treated as unique. Sequential UUIDs have a higher but still very small chance of collision.
One way you can test your algorithm is to create eg 10, 20 or more tasks/threads that generate UUIDs and add them to a ConcurrentDictionary. If the algorithm is prone to race conditions, you'll get a duplicate key exception after a while.
var dict = new ConcurrentDictionary<Guid, Guid>();
Parallel.For(0, 1000000, new () { MaxDegreeOfParallelism = 32 }, _ =>
{
var guid = Guid.CreateVersion7();
if (!dict.TryAdd(guid, guid))
{
throw new InvalidOperationException("Duplicate GUID!");
}
});
Benchmarking
To test performance and allocation behavior, the defacto standard is Anderi Akinshin's BenckmarkDotNet. Instead of just running code multiple times and calculating an average, it runs tests long enough to gather statistically significant results. It will even warn if the results seem to vary too wildly. To account for warmup and cooldown effects, the tests will be run multiple times before it starts collecting measures.
This example benchmarks Guid.NewGuid()
vs Guid.CreateVersion7()
and EF8's SequentialGuidValueGenerator
, collecting memory statistics as well :
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
var summary = BenchmarkRunner.Run(typeof(Program).Assembly);
[MemoryDiagnoser]
public class GuidVsSequential
{
SequentialGuidValueGenerator _gen;
public GuidVsSequential()
{
_gen = new SequentialGuidValueGenerator();
}
[Benchmark(Baseline = true)]
public Guid V4() => Guid.NewGuid();
[Benchmark]
public Guid Sequential() => Guid.CreateVersion7();
[Benchmark]
public Guid EF8() => _gen.Next();
}
The result summary is this :
// * Summary *
BenchmarkDotNet v0.14.0, Windows 11 (10.0.22635.3936)
Intel Core i7-10850H CPU 2.70GHz, 1 CPU, 12 logical and 6 physical cores
.NET SDK 9.0.100-rc.1.24452.12
[Host] : .NET 9.0.0 (9.0.24.43107), X64 RyuJIT AVX2
DefaultJob : .NET 9.0.0 (9.0.24.43107), X64 RyuJIT AVX2
Method | Mean | Error | StdDev | Ratio | RatioSD | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|
V4 | 67.73 ns | 1.374 ns | 2.987 ns | 1.00 | 0.06 | - | NA |
Sequential | 101.57 ns | 2.030 ns | 2.711 ns | 1.50 | 0.08 | - | NA |
EF8 | 79.48 ns | 1.619 ns | 3.847 ns | 1.18 | 0.08 | - | NA |
Interestingly, the EF8 code is a big faster than Guid.CreateVersion7 but doesn't produce valid UUID v7 values, as it doesn't convert to a Unix timestamp and doesn't set the version bytes.