I'm handling a long-running process that does a lot of copying to a memory stream so it can create an archive, in excess of 2 GB in size, and upload it to Azure. The process is throwing an Out Of Memory exception at some point, and I'm trying to figure out what the upper limit is so I can prevent that: stop the process, upload a partial archive, and pick up where it left off.
But I'm not sure how to determine how much memory the stream is actually using. Right now, I'm using MemoryStream.Length, but I don't think that's the right approach.
So, how can I determine how much memory a MemoryStream is using in VB.Net?
In compliance with the Minimal, Reproducible Example requirements -
Sub Main(args As String())
    Dim stream As New MemoryStream()
    stream.WriteByte(10)
    'What would I need to put here to determine the size in memory of stream?
End Sub
This is a known issue with the garbage collector for .Net Framework (I've heard .Net Core may have addressed this, but I haven't done a deep dive myself to confirm it).
What happens is you have an item with an internal (growing) buffer like a string, StringBuilder, List, MemoryStream, etc. As you write to the object, the buffer fills and at some point needs to be replaced. So behind the scenes a new buffer is allocated, and data is copied from the old to the new. At this point the old buffer becomes eligible for garbage collection.
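To make that concrete (a minimal sketch, not part of the original question): with a MemoryStream you can watch this happen by comparing Length, the number of bytes written, with Capacity, the size of the currently allocated internal buffer. Every time Capacity jumps, a new buffer was allocated and the old one abandoned:

Imports System.IO

Module BufferGrowthDemo
    Sub Main()
        Using stream As New MemoryStream()
            Dim lastCapacity As Integer = stream.Capacity
            Dim chunk(1023) As Byte ' 1 KB per write, for illustration

            For i As Integer = 1 To 256
                stream.Write(chunk, 0, chunk.Length)
                If stream.Capacity <> lastCapacity Then
                    ' Capacity changed: a new internal buffer was allocated,
                    ' the old contents were copied over, and the old buffer
                    ' became garbage.
                    Console.WriteLine("Length={0,8}  Capacity={1,8}", stream.Length, stream.Capacity)
                    lastCapacity = stream.Capacity
                End If
            Next
        End Using
    End Sub
End Module

Capacity (the allocated buffer) rather than Length (the bytes written) is the closer answer to "how much memory is this stream holding right now", though it doesn't count the earlier buffers that were abandoned along the way.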
Already we see one large inefficiency: copying from the old buffer to the new can be a significant cost as the object grows. Therefore, if you have an idea of your item's final size up front, you probably want to use a mechanism that lets you set that initial size. With a List(Of T), for example, this means using the constructor overload that takes a capacity... if you know it. This means the object never (or rarely) has to copy the data between buffers.
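Applied to the MemoryStream from the question, that means using the constructor overload that takes an initial capacity. The sizes below are made-up illustrations, not recommendations:

Imports System.IO
Imports System.Collections.Generic

Module PresizeDemo
    Sub Main()
        ' Hypothetical estimate of the final size; the point is to allocate once.
        Dim expectedSize As Integer = 64 * 1024 * 1024 ' 64 MB

        Using stream As New MemoryStream(expectedSize)
            ' ... write the archive data ...
            ' Capacity stays at expectedSize until (unless) you exceed it,
            ' so no intermediate buffers are allocated, copied, and abandoned.
            Console.WriteLine("Capacity={0}  Length={1}", stream.Capacity, stream.Length)
        End Using

        ' The same idea with List(Of T): pass the expected count to the constructor.
        Dim items As New List(Of Integer)(capacity:=100000)
    End Sub
End Module

Keep in mind that a MemoryStream is backed by a single Byte array, so it can never hold more than roughly 2 GB no matter how you size it; for the multi-gigabyte archive in the question, presizing alone won't be enough.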
But we're not done yet. When the copy operation is finished, the garbage collector WILL appropriately reclaim the memory and return it to the operating system. However, there's more involved than just memory. There's also the virtual address space for your application's process. The virtual address space formerly occupied by that memory is NOT immediately released. It can be reclaimed through a process called "compaction", but there are situations where this just... doesn't happen. Most notably, memory on the Large Object Heap (LOH) is rarely-to-never compacted. Once your item eclipses the magic 85,000 byte threshold for the LOH, you start building up holes in the virtual address space. Create enough of these holes, and the address space runs out of usable room, resulting in (drumroll please)... an OutOfMemoryException. This is the exception thrown, even though there may be plenty of memory available.
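If you're on .NET Framework 4.5.1 or later, you can at least ask for a one-time LOH compaction. Here's a small sketch of that mechanism; it mitigates the fragmentation described above, but it's an expensive blocking collection and not a substitute for controlling buffer growth:

Imports System.Runtime

Module LohCompactionDemo
    Sub Main()
        ' By default the Large Object Heap is swept but not compacted.
        ' .NET Framework 4.5.1+ lets you request a one-time compaction
        ' on the next blocking full collection.
        GCSettings.LargeObjectHeapCompactionMode = GCLargeObjectHeapCompactionMode.CompactOnce
        GC.Collect() ' blocking gen-2 collection; performs the compaction

        ' The setting resets itself to Default after that collection.
        Console.WriteLine(GCSettings.LargeObjectHeapCompactionMode)
    End Sub
End Module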
Since most of the items that create this scenario use a doubling algorithm for the underlying buffer, it's possible to generate these exceptions without using all that much memory: every time the buffer doubles, the abandoned copy leaves another hole, so the discarded buffers can tie up roughly as much address space as the live one, and once those holes can't be reused you can hit the exception while your actual data is only a fraction of the address space the process theoretically has available.
To avoid this, be careful in how you use items with buffers that grow dynamically: set the full capacity up front when you can, or stream to a more-robust backing store (like a file) when you can't.
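For the archive-and-upload scenario in the question, one way to apply that last piece of advice is to build the archive over a temp file rather than a MemoryStream. This is a sketch assuming System.IO.Compression; the actual Azure upload call is left as a placeholder since it depends on the client library you're using:

Imports System.IO
Imports System.IO.Compression

Module FileBackedArchiveDemo
    Sub Main()
        Dim tempPath As String = Path.GetTempFileName()
        Try
            ' The growing archive lives on disk instead of an ever-reallocating
            ' managed buffer, so the LOH never gets involved.
            Using fileStream As New FileStream(tempPath, FileMode.Create, FileAccess.ReadWrite)
                Using archive As New ZipArchive(fileStream, ZipArchiveMode.Create, leaveOpen:=True)
                    Dim entry = archive.CreateEntry("data.bin")
                    Using entryStream = entry.Open()
                        ' ... copy the source data into entryStream here ...
                    End Using
                End Using ' disposing the archive finishes writing it to the file

                fileStream.Position = 0
                ' Hand fileStream (or tempPath) to your Azure upload client here,
                ' e.g. blobClient.Upload(fileStream)  ' hypothetical client call
            End Using
        Finally
            File.Delete(tempPath)
        End Try
    End Sub
End Module

You trade some disk I/O for predictable memory behavior, and the temp file also gives you a natural checkpoint if you want to resume a failed upload later.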