Tags: c#, json.net, base64, out-of-memory

Json.Net deserialize out of memory issue


I have a JSON payload which contains, among other fields, a data field that stores a Base64-encoded string. This JSON is serialized and sent to a client.

On the client side, the Newtonsoft Json.NET deserializer is used to read the JSON back. However, if the data field becomes large (~400 MB), the deserializer throws an out-of-memory exception: Array dimensions exceeded supported range. I can also see in Task Manager that memory consumption grows very quickly.

Any ideas why this is? Is there a maximum size for json fields or something?

Code example (simplified):

HttpResponseMessage responseTemp = client.PostAsJsonAsync(client.BaseAddress, message).Result;

string jsonContent = responseTemp.Content.ReadAsStringAsync().Result;
Result result = JsonConvert.DeserializeObject<Result>(jsonContent);

Result class:

public class Result
{
    public string Message { get; set; }
    public byte[] Data { get; set; }
}

UPDATE:

I think my problem is not the serializer, but simply trying to handle such a huge string in memory. At the point where I read the string into memory, the application's memory consumption explodes, and every operation on that string does the same. I now think I have to find a way to work with streams and stop reading everything into memory at once.


Solution

  • You have two problems here:

    1. You have a single Base64 data field inside your JSON response that is larger than ~400 MB.

    2. You are loading the entire response into an intermediate string jsonContent that is even larger since it embeds the single data field.

    First, I assume you are running as a 64-bit process. If not, switch to 64-bit.

    Unfortunately, the first problem can only be mitigated, not fixed, because Json.NET's JsonTextReader cannot read a single string value in "chunks" the way XmlReader.ReadValueChunk() can. It always fully materializes each atomic string value. But .NET 4.5 adds the following settings that may help:

    1. <gcAllowVeryLargeObjects enabled="true" />.

      This setting allows arrays with up to int.MaxValue entries even if that causes the underlying memory buffer to be larger than 2 GB. You will still be unable to read a single JSON token of more than 2^31 characters, however, since JsonTextReader buffers the full contents of each token in a private char[] _chars array, and, in .NET, an array can hold at most int.MaxValue items.

    2. GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce.

      This setting allows the large object heap to be compacted and may reduce out-of-memory errors due to address space fragmentation.
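    For reference, the first setting above is a configuration element, not code; a minimal sketch of where it goes in your App.config:

    ```xml
    <!-- App.config: allow arrays whose backing buffer exceeds 2 GB (64-bit processes only) -->
    <configuration>
      <runtime>
        <gcAllowVeryLargeObjects enabled="true" />
      </runtime>
    </configuration>
    ```

    The second setting is a single line of C#, set shortly before the next blocking garbage collection, e.g. `GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce; GC.Collect();` (both types live in `System.Runtime`). It applies to one collection only and then resets.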

    The second problem, however, can be addressed by streaming deserialization, as shown in this answer to this question by Dilip0165; Efficient api calls with HttpClient and JSON.NET by John Thiriet; Performance Tips: Optimize Memory Usage by Newtonsoft; and Streaming with New .NET HttpClient and HttpCompletionOption.ResponseHeadersRead by Tugberk Ugurlu. Pulling together the information from these sources, your code should look something like:

    Result result;
    var requestJson = JsonConvert.SerializeObject(message); // Here we assume the request JSON is not too large
    using (var requestContent = new StringContent(requestJson, Encoding.UTF8, "application/json"))
    using (var request = new HttpRequestMessage(HttpMethod.Post, client.BaseAddress) { Content = requestContent })
    using (var response = client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead).Result)
    using (var responseStream = response.Content.ReadAsStreamAsync().Result)
    {
        if (response.IsSuccessStatusCode)
        {
            using (var textReader = new StreamReader(responseStream))
            using (var jsonReader = new JsonTextReader(textReader))
            {
                result = JsonSerializer.CreateDefault().Deserialize<Result>(jsonReader);
            }
        }
        else
        {
            // TODO: handle an unsuccessful response somehow, e.g. by throwing an exception
        }
    }
    

    Or, using async/await:

    Result result;
    var requestJson = JsonConvert.SerializeObject(message); // Here we assume the request JSON is not too large
    using (var requestContent = new StringContent(requestJson, Encoding.UTF8, "application/json"))
    using (var request = new HttpRequestMessage(HttpMethod.Post, client.BaseAddress) { Content = requestContent })
    using (var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead))
    using (var responseStream = await response.Content.ReadAsStreamAsync())
    {
        if (response.IsSuccessStatusCode)
        {
            using (var textReader = new StreamReader(responseStream))
            using (var jsonReader = new JsonTextReader(textReader))
            {
                result = JsonSerializer.CreateDefault().Deserialize<Result>(jsonReader);
            }
        }
        else
        {
            // TODO: handle an unsuccessful response somehow, e.g. by throwing an exception
        }
    }           
    

    My code above isn't fully tested, and error and cancellation handling need to be implemented. You may also need to set the timeout as shown here and here. Json.NET's JsonSerializer does not support async deserialization, making it a slightly awkward fit with the asynchronous programming model of HttpClient.
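    Regarding the timeout: HttpClient aborts requests after 100 seconds by default, which a multi-hundred-megabyte download can easily exceed. A minimal sketch (the endpoint URL here is a placeholder, not from the original question):

    ```csharp
    using System;
    using System.Net.Http;
    using System.Threading;

    class TimeoutSetup
    {
        static void Main()
        {
            // Disable the default 100-second timeout so a long streaming download
            // is not aborted mid-transfer. With an infinite timeout, cancellation
            // should instead be handled explicitly via a CancellationToken.
            var client = new HttpClient
            {
                BaseAddress = new Uri("https://example.com/api/"), // placeholder address
                Timeout = Timeout.InfiniteTimeSpan
            };
            Console.WriteLine(client.Timeout == Timeout.InfiniteTimeSpan); // prints "True"
        }
    }
    ```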

    Finally, as an alternative to using Json.NET to read a huge Base64 chunk from a JSON file, you could use the reader returned by JsonReaderWriterFactory, which does support reading Base64 data in manageable chunks. For details, see this answer to Parse huge OData JSON by streaming certain sections of the json to avoid LOH for an explanation of how to stream through a huge JSON file using this reader, and this answer to Read stream from XmlReader, base64 decode it and write result to file for how to decode Base64 data in chunks using XmlReader.ReadElementContentAsBase64().
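    To illustrate the approach, here is a self-contained sketch: it builds a tiny in-memory JSON payload standing in for the huge response (the "hello world" bytes and the 8 KB buffer size are arbitrary choices, not from the question), then streams the Data field out in Base64-decoded chunks without ever materializing the full string:

    ```csharp
    using System;
    using System.IO;
    using System.Runtime.Serialization.Json;
    using System.Text;
    using System.Xml;

    class ChunkedBase64Demo
    {
        static void Main()
        {
            // Small stand-in for the real multi-hundred-MB response body.
            byte[] payload = Encoding.UTF8.GetBytes("hello world");
            string json = "{\"Message\":\"ok\",\"Data\":\"" + Convert.ToBase64String(payload) + "\"}";

            using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(json)))
            // JsonReaderWriterFactory exposes the JSON through the XmlReader API,
            // so XmlReader's chunked Base64 decoding becomes available.
            using (XmlDictionaryReader reader = JsonReaderWriterFactory.CreateJsonReader(stream, XmlDictionaryReaderQuotas.Max))
            using (var output = new MemoryStream()) // use a FileStream for real data
            {
                while (reader.Read())
                {
                    if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "Data")
                    {
                        var buffer = new byte[8192]; // decode in 8 KB chunks
                        int read;
                        while ((read = reader.ReadElementContentAsBase64(buffer, 0, buffer.Length)) > 0)
                            output.Write(buffer, 0, read);
                    }
                }
                Console.WriteLine(Encoding.UTF8.GetString(output.ToArray())); // prints "hello world"
            }
        }
    }
    ```

    Because the decoded bytes are written out as they arrive, peak memory stays around the chunk size rather than the size of the whole Data field.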