I'm running a tensorflow model that exhausts 60G of RAM in about 10 minutes while processing large images.
I've run heapy to try to pin down a leak, but heapy reports only 90M of memory usage, and that figure stays constant.
I noted this question: Python process consuming increasing amounts of system memory, but heapy shows roughly constant usage.
It suggested the issue might be memory fragmentation in python (2.7 here), but that doesn't sound like a reasonable explanation for this case.
My setup: a loader thread reads large images from disk and puts them into a raw queue. A preprocessor thread pulls each image from the raw queue, preprocesses it, and loads it into a ready queue. The training loop pulls batches of images from the ready queue and runs them through tensorflow training.

So heapy is failing to see at least the 600M of memory that I know must be held at any given moment.
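A minimal sketch of that pipeline, with illustrative names (loader, preprocessor, raw_q, ready_q are mine, not from the original code). In Python 2.7 the module is spelled Queue; the sketch below uses the Python 3 spelling, queue. Note that neither queue is given a maxsize, so they can grow without bound:

```python
import threading
import queue

raw_q = queue.Queue()    # no maxsize: this queue can grow without bound
ready_q = queue.Queue()

def loader():
    for i in range(100):               # stand-in for reading images from disk
        raw_q.put("image-%d" % i)
    raw_q.put(None)                    # sentinel: no more images

def preprocessor():
    while True:
        img = raw_q.get()
        if img is None:
            ready_q.put(None)          # pass the sentinel downstream
            break
        ready_q.put(img + "-resized")  # stand-in for real preprocessing

threading.Thread(target=loader).start()
threading.Thread(target=preprocessor).start()

# main thread: pull preprocessed images, as the training loop would
processed = 0
while True:
    item = ready_q.get()
    if item is None:
        break
    processed += 1
print(processed)  # → 100
```

If the loader runs faster than training consumes, images accumulate in these queues, which is exactly the kind of growth the numbers above suggest.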
Hence, if heapy can't see the memory I know is there, I can't trust it to see where the leak is. At the rate it's leaking, it's a virtual certainty that the batches of images are causing it.
I'm using the threading module in python to kick off the loader and preprocessor threads. I've tried calling print h.heap() from within the thread code and from the main code, all with the same results.
Update: I ended up having an unbounded python Queue by accident. Simple fix. It's odd that heapy didn't show the memory allocated by the Queue, but memory_profiler did, and that's how I tracked down the issue.
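The fix, sketched under the same assumptions as above (Python 3 queue module; Queue in 2.7): give the Queue a maxsize, so a producer blocks on put() instead of letting images pile up in memory faster than training consumes them.

```python
import queue

# A bounded queue: once maxsize items are waiting, put() blocks the
# producer until a consumer calls get(), capping memory held in the queue.
q = queue.Queue(maxsize=4)
for i in range(4):
    q.put(i)

print(q.full())              # → True
try:
    q.put(4, block=False)    # with block=False a full queue raises queue.Full
except queue.Full:
    print("full: a blocking put() would wait here")
```

With a bounded raw queue and ready queue, the loader's speed no longer matters; memory held in flight is capped at maxsize images per queue.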
It sure would have been a beautiful thing if heapy had said, "hey, there's this Queue object using more memory than you were expecting."