I wrote my own BigIntegerConverter for JSON serialization/deserialization (.NET System.Text.Json). In the Read method I check whether ValueSequence is used:
...
string stringValue;
if (reader.HasValueSequence)
{
    stringValue = Encoding.UTF8.GetString(reader.ValueSequence);
}
else
{
    stringValue = Encoding.UTF8.GetString(reader.ValueSpan);
}
if (BigInteger.TryParse(stringValue, CultureInfo.InvariantCulture, out var result))
{
    return result;
}
...
Now I want to test that code, but so far I am only able to reach the else branch. Based on the documentation I assumed that ValueSequence would be used if the data got big enough.
However, I am already testing with BigIntegers as big as BigInteger.Pow(new BigInteger(long.MaxValue), 1234);
and still cannot get ValueSequence to be used.
Did I misunderstand something? Is there a way to force the use of ValueSequence for test purposes?
My test case looks like this:
[Theory]
[MemberData(nameof(GetNotNullTestData))]
public void Read_EntityWithNotNullableBigInteger(string name, BigInteger expected, string value)
{
    // Arrange
    var json = $$"""{"Name":"{{name}}","NotNullableValue":{{value}}}""";
    // Act
    var result = JsonSerializer.Deserialize<NotNullableBigIntegerEntity>(json, _options);
    // Assert
    Assert.NotNull(result);
    Assert.Equal(name, result.Name);
    Assert.Equal(expected, result.NotNullableValue);
}
Regards Michael
Utf8JsonReader has a constructor that takes a ReadOnlySequence<byte>, so you could take your JSON string, encode it to a UTF-8 byte array, break that into small chunks, then convert that sequence of chunks into a ReadOnlySequence<byte> using ReadOnlySequenceFactory from this answer to "Deserialize very large json from a chunked array of strings using system.text.json".
First, introduce the following factory class:
using System;
using System.Buffers;
using System.Collections.Generic;
using System.Linq;

// From this answer https://stackoverflow.com/a/61087772 to https://stackoverflow.com/questions/61079767/deserialize-very-large-json-from-a-chunked-array-of-strings-using-system-text-js
public static class ReadOnlySequenceFactory
{
    public static ReadOnlySequence<T> AsSequence<T>(this IEnumerable<T []> buffers) => ReadOnlyMemorySegment<T>.Create(buffers.Select(a => new ReadOnlyMemory<T>(a)));
    public static ReadOnlySequence<T> AsSequence<T>(this IEnumerable<ReadOnlyMemory<T>> buffers) => ReadOnlyMemorySegment<T>.Create(buffers);

    // There is no public concrete implementation of ReadOnlySequenceSegment<T> so we must create one ourselves.
    // This is modeled on https://github.com/dotnet/runtime/blob/v5.0.18/src/libraries/System.Text.Json/tests/BufferFactory.cs
    // by https://github.com/ahsonkhan
    class ReadOnlyMemorySegment<T> : ReadOnlySequenceSegment<T>
    {
        public static ReadOnlySequence<T> Create(IEnumerable<ReadOnlyMemory<T>> buffers)
        {
            ReadOnlyMemorySegment<T>? first = null;
            ReadOnlyMemorySegment<T>? current = null;
            foreach (var buffer in buffers)
            {
                var next = new ReadOnlyMemorySegment<T> { Memory = buffer };
                if (first == null)
                    first = next;
                else
                {
                    // Link the new segment and record its offset from the start of the sequence.
                    current!.Next = next;
                    next.RunningIndex = current.RunningIndex + current.Memory.Length;
                }
                current = next;
            }
            // No buffers at all: fall back to a single empty segment.
            if (first == null)
                first = current = new ();
            return new ReadOnlySequence<T>(first, 0, current!, current!.Memory.Length);
        }
    }
}
Then write your converter e.g. as follows:
using System;
using System.Buffers;
using System.Globalization;
using System.Numerics;
using System.Text.Json;
using System.Text.Json.Serialization;

public class BigIntegerConverter : JsonConverter<BigInteger>
{
    // The actual implementation seems to be in INumberBase<TSelf>.TryParse() so I had to do this to call the method:
    static bool TryParse<TSelf>(ReadOnlySpan<byte> utf8Text, out TSelf? value) where TSelf : IUtf8SpanParsable<TSelf>, new() =>
        TSelf.TryParse(utf8Text, NumberFormatInfo.InvariantInfo, out value);

    public override BigInteger Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        if (reader.TokenType != JsonTokenType.Number)
            throw new JsonException(string.Format("Found token {0} but expected token {1}", reader.TokenType, JsonTokenType.Number));
        // A token that straddles segments is only available through ValueSequence; otherwise use ValueSpan.
        var utf8Text = reader.HasValueSequence ? reader.ValueSequence.ToArray() : reader.ValueSpan;
        if (TryParse<BigInteger>(utf8Text, out var value))
            return value;
        throw new JsonException();
    }

    public override void Write(Utf8JsonWriter writer, BigInteger value, JsonSerializerOptions options) =>
        writer.WriteRawValue(value.ToString(NumberFormatInfo.InvariantInfo), false);
}
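As an aside, the ToArray() call in Read allocates a new array for every value that spans segments. If that ever matters, one option is to copy the sequence into a pooled buffer instead. The following is only a minimal sketch, assuming .NET 8 or later; Utf8SequenceParser and its TryParse method are hypothetical names introduced here for illustration, not part of System.Text.Json.

```csharp
using System;
using System.Buffers;
using System.Diagnostics.CodeAnalysis;
using System.Globalization;

// Hypothetical helper (not a framework type): parses a UTF-8 encoded value
// held in a ReadOnlySequence<byte> using a rented buffer rather than ToArray().
public static class Utf8SequenceParser
{
    public static bool TryParse<T>(ReadOnlySequence<byte> utf8, [MaybeNullWhen(false)] out T value)
        where T : IUtf8SpanParsable<T>
    {
        // A single-segment sequence can be parsed in place, with no copy.
        if (utf8.IsSingleSegment)
            return T.TryParse(utf8.FirstSpan, NumberFormatInfo.InvariantInfo, out value);

        // Otherwise copy the segments into one contiguous rented buffer.
        var length = checked((int)utf8.Length);
        var rented = ArrayPool<byte>.Shared.Rent(length);
        try
        {
            // Rent may return a larger array, so slice to the exact length.
            var span = rented.AsSpan(0, length);
            utf8.CopyTo(span);
            return T.TryParse(span, NumberFormatInfo.InvariantInfo, out value);
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

Inside Read, the multi-segment branch could then call Utf8SequenceParser.TryParse<BigInteger>(reader.ValueSequence, out var value) and keep the ValueSpan branch as it is.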
Now you will be able to write your test method as follows, ensuring your input JSON is broken into small chunks:
public record NotNullableBigIntegerEntity(string Name, BigInteger NotNullableValue);

JsonSerializerOptions _options = new()
{
    Converters = { new BigIntegerConverter() },
};

[Theory]
[MemberData(nameof(GetNotNullTestData))]
public void Read_EntityWithNotNullableBigInteger(string name, BigInteger expected, string value)
{
    // Small enough that the number token is split across several segments.
    int byteChunkSize = 3;
    // Arrange
    var json = $$"""{"Name":"{{name}}","NotNullableValue":{{value}}}""";
    // Break into chunks
    var utf8json = Encoding.UTF8.GetBytes(json);
    var sequence = utf8json.Chunk(byteChunkSize).AsSequence();
    // Act
    var reader = new Utf8JsonReader(sequence);
    var result = JsonSerializer.Deserialize<NotNullableBigIntegerEntity>(ref reader, _options);
    // Assert
    Assert.NotNull(result);
    Assert.Equal(name, result?.Name);
    Assert.Equal(expected, result?.NotNullableValue);
}
Notes:
BigInteger implements IUtf8SpanParsable<BigInteger>, which allows direct parsing from UTF-8 encoded byte spans. As such there's no need to call Encoding.UTF8.GetString() to construct a UTF-16 string corresponding to the current value.
"Based on the documentation I assumed that ValueSequence will be used if the data got big enough." Maybe, maybe not. Microsoft seems to have intended ReadOnlySequence<byte> to be used when deserializing asynchronously from some request or response stream, but you are deserializing synchronously from an in-memory string. Whether the serializer converts that string into a single UTF-8 byte span or into a multi-segment byte sequence is an implementation detail that is not documented. (When I check the reference source, the current code seems to convert the incoming string to a single byte span.)
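To see the single-span versus multi-segment distinction in isolation, without a converter or the serializer involved, here is a minimal sketch. Segment and ValueSequenceProbe are hypothetical names made up for this illustration. It relies on the documented behavior that ValueSequence is only populated when a token is contained within multiple segments.

```csharp
using System;
using System.Buffers;
using System.Text.Json;

// Hypothetical minimal segment type, needed because
// ReadOnlySequenceSegment<T> has no public concrete implementation.
public sealed class Segment : ReadOnlySequenceSegment<byte>
{
    public Segment(ReadOnlyMemory<byte> memory) => Memory = memory;

    // Links a new segment after this one and returns it.
    public Segment Append(ReadOnlyMemory<byte> memory)
    {
        var next = new Segment(memory) { RunningIndex = RunningIndex + Memory.Length };
        Next = next;
        return next;
    }
}

public static class ValueSequenceProbe
{
    // Splits the payload into two segments at the given byte index.
    public static ReadOnlySequence<byte> Split(byte[] utf8, int at)
    {
        var first = new Segment(utf8.AsMemory(0, at));
        var last = first.Append(utf8.AsMemory(at));
        return new ReadOnlySequence<byte>(first, 0, last, last.Memory.Length);
    }

    // Returns true if any token in the payload reported HasValueSequence.
    public static bool AnyTokenUsesSequence(ReadOnlySequence<byte> json)
    {
        var reader = new Utf8JsonReader(json);
        var seen = false;
        while (reader.Read())
            seen |= reader.HasValueSequence;
        return seen;
    }
}
```

For the payload {"n":12345678901234567890}, the digits start at byte index 5, so Split(bytes, 10) cuts through the number and the probe reports a sequence, while wrapping the same bytes in a single-segment new ReadOnlySequence<byte>(bytes) does not. The size of the number is irrelevant; only where the segment boundaries fall matters.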