Tags: c#, .net, wpf, wcf, reflection

System.OutOfMemoryException .NET 4.8 WPF App


I'm working on a WPF app (WCF architecture). The app has two main solutions: the client (front end) and the controller (back end). The client communicates with the back end through reflective calls; in essence, it builds up an XML doc containing info like the method to execute, the namespace that the method is in, the method args, etc.

The specific method in question uploads documents. It works perfectly until the file is over 1 GB in size; the error occurs when the client builds up the XML doc for the reflective call.

We add to the XML doc like this:

result.Add(new XElement(itemProperty.Name, new XCData(data)));

In this case itemProperty.Name is the method name and the XCData holds the method args. The doc to upload is obviously one of the arguments; we receive it as a byte[], and we need to pass it to the XCData constructor as a string, so the following is used:

string data = Convert.ToBase64String((byte[])value);

Note that this works with smaller files, but with the 1 GB file a System.OutOfMemoryException is thrown when trying to convert the byte[] to a Base64 string.

I have tried reading the array in chunks, but when calling b64StringBuilder.ToString(), the same exception is thrown:

public static string HandleBigByteArray(byte[] arr)
{
    int chunkSize = 104857600; // 100 MB
    StringBuilder b64StringBuilder = new StringBuilder();
    int offset = 0;

    while (offset < arr.Length)
    {
        int remainingBytes = arr.Length - offset;
        int bytesToEncode = Math.Min(chunkSize, remainingBytes);
        string base64Chunk = Convert.ToBase64String(arr, offset, bytesToEncode);
        b64StringBuilder.Append(base64Chunk);
        offset += bytesToEncode;
    }

    // The same OutOfMemoryException is thrown here:
    return b64StringBuilder.ToString();
}

I have no idea what to do or how to debug/approach this further...


Solution

  • Your basic problem here is that when you call b64StringBuilder.ToString() to construct your Base64 string, you are trying to exceed the maximum possible .NET string length on your system, which, as explained in HitScan's answer to 'What is the maximum possible length of a .NET string?', is at most int.MaxValue / 2 characters on a 64-bit system.

    To break that down: the largest contiguous memory block you can allocate on .NET Framework is int.MaxValue bytes, or 2 GB. A char takes 2 bytes, so a string holds at most int.MaxValue / 2 characters, and Base64 encoding inflates the output by a factor of 4/3 (4 characters for every 3 input bytes). The largest byte array you could encode is therefore around 3/4 GB, which is exactly the limit you are hitting.

    (Note that if you set gcAllowVeryLargeObjects you will be able to allocate arrays of up to 4 GB in memory, but that setting applies only to arrays; the maximum string length is unchanged, so the calculation above still holds.)
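    To sanity-check that arithmetic, here is a small back-of-envelope sketch (not part of the original code; the numbers match the limits described above):

    using System;

    class Base64LimitCheck
    {
        static void Main()
        {
            long maxObjectBytes = int.MaxValue;              // largest contiguous allocation: ~2 GB
            long maxStringChars = maxObjectBytes / 2;        // 2 bytes per char => ~1.07 billion chars
            long maxEncodableBytes = maxStringChars / 4 * 3; // Base64 maps 3 input bytes to 4 chars

            // Prints roughly 768 MB, i.e. about 3/4 GB; hence the failures above that size.
            Console.WriteLine("{0:N0} MB", maxEncodableBytes / (1024 * 1024));
        }
    }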

    To solve this particular problem, rather than creating a single XCData containing the entire Base64 contents of the byte [] array, you could create a sequence of XText containing partial chunks of bounded size and add them all to the itemProperty.Name element. They will be formatted as a single contiguous text value when written to XML.
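    As a quick illustration (the element name here is arbitrary), adjacent XText nodes merge into a single contiguous text value when the element is serialized:

    var e = new XElement("data", new XText("AAAA"), new XText("BBBB"));
    Console.WriteLine(e); // prints <data>AAAABBBB</data>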

    To do this, first introduce the following extension methods:

    public static partial class XmlExtensions
    {
        const int DefaultChunkLength = 8000;
        const int Base64BytesPerChunk = 3; // Base64 encodes 3 bytes to 4 characters.
        
        public static IEnumerable<XText> ToBase64XTextChunks(this Stream stream, int chunkLength = DefaultChunkLength)
        {
            return stream.ToBase64StringChunks(chunkLength).Select(s => new XText(s));
        }
    
        public static IEnumerable<XText> ToBase64XTextChunks(this IEnumerable<byte []> chunks, int chunkLength = DefaultChunkLength)
        {
            return chunks.Select(b => new ArraySegment<byte>(b)).ToBase64XTextChunks(chunkLength);
        }
    
        // In .NET Core I would use Memory<T> and/or ReadOnlyMemory<T> instead of ArraySegment<T>.
    
        public static IEnumerable<XText> ToBase64XTextChunks(this IEnumerable<ArraySegment<byte>> chunks, int chunkLength = DefaultChunkLength)
        {
            return chunks.ToBase64StringChunks(chunkLength).Select(s => new XText(s));
        }
    
        internal static IEnumerable<string> ToBase64StringChunks(this Stream stream, int chunkLength = DefaultChunkLength)
        {
            if (stream == null)
                throw new ArgumentNullException("stream");
            if (chunkLength < 1 || chunkLength > int.MaxValue / Base64BytesPerChunk)
                throw new ArgumentOutOfRangeException("chunkLength < 1 || chunkLength > int.MaxValue / Base64BytesPerChunk");
            var buffer = new byte[Math.Max(300, Base64BytesPerChunk * DefaultChunkLength)];
            return ToBase64StringChunksEnumerator(stream.ReadAllByteChunks(buffer), chunkLength);
        }
        
        internal static IEnumerable<string> ToBase64StringChunks(this IEnumerable<ArraySegment<byte>> chunks, int chunkLength = DefaultChunkLength)
        {
            if (chunks == null)
                throw new ArgumentNullException("chunks");
            if (chunkLength < 1 || chunkLength > int.MaxValue / Base64BytesPerChunk)
                throw new ArgumentOutOfRangeException("chunkLength < 1 || chunkLength > int.MaxValue / Base64BytesPerChunk");
            return ToBase64StringChunksEnumerator(chunks, chunkLength);
        }
        
        static IEnumerable<string> ToBase64StringChunksEnumerator(this IEnumerable<ArraySegment<byte>> chunks, int chunkLength)
        {
            var buffer = new byte[Base64BytesPerChunk*chunkLength];
            foreach (var chunk in chunks.ToFixedSizedChunks(buffer))
            {
                yield return Convert.ToBase64String(chunk.Array, chunk.Offset, chunk.Count);
            }
        }
        
        internal static IEnumerable<ArraySegment<byte>> ReadAllByteChunks(this Stream stream, byte [] buffer)
        {
            if (stream == null)
                throw new ArgumentNullException("stream");
            if (buffer == null)
                throw new ArgumentNullException("buffer");
            if (buffer.Length < 1)
                throw new ArgumentException("buffer.Length < 1");
            return ReadAllByteChunksEnumerator(stream, buffer);
        }
    
        static IEnumerable<ArraySegment<byte>> ReadAllByteChunksEnumerator(Stream stream, byte [] buffer)
        {
            int nRead;
            while ((nRead = stream.Read(buffer, 0, buffer.Length)) > 0)
                yield return new ArraySegment<byte>(buffer, 0, nRead);
        }
    }
    
    public static partial class EnumerableExtensions
    {
        public static IEnumerable<ArraySegment<T>> ToFixedSizedChunks<T>(this IEnumerable<ArraySegment<T>> chunks, T [] buffer)
        {
            if (chunks == null)
                throw new ArgumentNullException("chunks");
            if (buffer == null)
                throw new ArgumentNullException("buffer");
            if (buffer.Length == 0)
                throw new ArgumentException("buffer.Length == 0");
            return ToFixedSizedChunksEnumerator(chunks, buffer);
        }
        
        static IEnumerable<ArraySegment<T>> ToFixedSizedChunksEnumerator<T>(IEnumerable<ArraySegment<T>> chunks, T [] buffer) 
        {
            int bufferIndex = 0;
            bool anyRead = false, anyReturned = false;
            foreach (var chunk in chunks)
            {
                anyRead = true;
                int chunkIndex = 0;
                while (chunkIndex < chunk.Count)
                {
                    int toCopy = Math.Min(buffer.Length - bufferIndex, chunk.Count - chunkIndex);
                    if (toCopy > 0)
                    {
                        chunk.CopyTo(chunkIndex, buffer, bufferIndex, toCopy);
                        bufferIndex += toCopy;
                        if (bufferIndex == buffer.Length)
                        {
                            yield return new ArraySegment<T>(buffer, 0, bufferIndex);
                            bufferIndex = 0;
                            anyReturned = true;
                        }
                    }
                    chunkIndex += toCopy;
                }
            }
            // If passed an enumerable of empty chunks we should still return one empty chunk.  But if there were no chunks at all, return nothing.
            if (bufferIndex > 0 || (anyRead && !anyReturned))
                yield return new ArraySegment<T>(buffer, 0, bufferIndex);
        }
        
        public static void CopyTo<T>(this ArraySegment<T> from, int fromIndex, T [] destination, int destinationIndex, int count)
        {
            // Array.Copy counts in elements for any T; Buffer.BlockCopy counts in bytes
            // and is only valid for arrays of primitive types.
            Array.Copy(from.Array, checked(from.Offset + fromIndex), destination, destinationIndex, count);
        }
    }
    

    And now, assuming your byte[] arr value was actually read from a file fileName, you can do:

    var result = new XElement("root");
    
    var e = new XElement(itemProperty.Name);
    using (var stream = File.OpenRead(fileName))
    {
        foreach (var text in stream.ToBase64XTextChunks())
            e.Add(text);
    }
    result.Add(e);
    

    The extension method stream.ToBase64XTextChunks() reads the stream in chunks and encodes in chunks so you never hit the maximum array size or the maximum string length.

    But if you already have your huge byte [] arr array in memory, you can do:

    foreach (var text in new [] { arr }.ToBase64XTextChunks())
        e.Add(text);
    

    Notes

    • I recommend, for performance reasons, keeping your buffer and string allocations below the 85,000-byte threshold so that they don't go on the large object heap; the defaults above do, as the arithmetic after these notes shows.

    • In .NET Core I would use Memory<T> and ReadOnlyMemory<T> instead of the older ArraySegment<T>.
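    A quick check that the default chunk sizing respects the large object heap threshold (using the constants from the code above):

    const int DefaultChunkLength = 8000, Base64BytesPerChunk = 3;
    int bufferBytes = Base64BytesPerChunk * DefaultChunkLength; // 24,000-byte read/encode buffer
    int stringChars = 4 * DefaultChunkLength;                   // 32,000 chars per Base64 string
    int stringBytes = 2 * stringChars;                          // 64,000 bytes of character data
    // Both allocations stay below the ~85,000-byte large object heap threshold.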

    Demo fiddle here.

    That being said, you are very close to hitting other memory limits as well, including:

    • The maximum byte [] array size, which is int.MaxValue, i.e. 2 GB.

      Setting gcAllowVeryLargeObjects does not increase this limit: the flag only increases the total memory an array can occupy, while the element count remains capped at int.MaxValue, so a byte [] still tops out around 2 GB.

      This limit will be a problem on both the client and server sides.

    • The maximum number of characters that can be held by a StringBuilder, which can be seen from the reference source to be int.MaxValue.

    • Your server's total available virtual memory. Your current design does not seem to limit upload sizes at all. That appears to make your server vulnerable to denial-of-service attacks where the attacker continues to upload data until your server runs out of memory.

      And if you have a large number of clients uploading huge chunks of data at the same time, your server will again run out of memory even if none of the clients is attempting a DoS attack.

    • Your client's available virtual memory. If your clients are running in resource-restricted environments (such as smartphones) huge memory allocations may not be possible.

    I would suggest that you rethink your design of allowing arbitrarily large file uploads and buffering them in memory on both the client and server sides. Even if you decide that, for business reasons, you need to support uploads of files larger than 1 or 2 GB, I recommend that you adopt a streaming solution where the contents are written and read using XmlWriter and XmlReader without ever loading the entire contents into memory. To get started, see the documentation for XmlWriter and XmlReader.
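    For illustration, here is a minimal sketch of the writing side using XmlWriter.WriteBase64, which encodes and emits Base64 incrementally so the file contents never sit in memory all at once. (The helper name and element name are placeholders, not part of the design above.)

    using System.IO;
    using System.Xml;

    public static class StreamingUploadSketch
    {
        // Hypothetical helper: streams the contents of fileName into <elementName> as Base64.
        public static void WriteFileAsBase64(XmlWriter writer, string elementName, string fileName)
        {
            writer.WriteStartElement(elementName);
            var buffer = new byte[24000]; // a multiple of 3, so each call aligns on Base64 boundaries
            using (var stream = File.OpenRead(fileName))
            {
                int nRead;
                while ((nRead = stream.Read(buffer, 0, buffer.Length)) > 0)
                    writer.WriteBase64(buffer, 0, nRead); // XmlWriter encodes and writes incrementally
            }
            writer.WriteEndElement();
        }
    }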