Tags: python, machine-learning, memory-leaks, amazon-sagemaker, mxnet

How to find a memory leak in Python MXNet?


I am afraid that my neural network in MXNet, written in Python, has a memory leak. I have tried the MXNet profiler and the tracemalloc module to get a picture of memory usage, but I want a report of any potential memory leaks, the way I would get one with valgrind in C.
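For context, this is roughly how I have been using tracemalloc so far (a minimal sketch: run_predict stands in for my actual MXNet inference call, and the loop count is arbitrary):

import tracemalloc

def run_predict(n):
    pass  # placeholder for the real MXNet inference call

tracemalloc.start()
baseline = tracemalloc.take_snapshot()

for _ in range(100):
    run_predict(-1)

current = tracemalloc.take_snapshot()
# Show the ten code locations whose allocations grew the most
for stat in current.compare_to(baseline, 'lineno')[:10]:
    print(stat)

This shows me where allocations grow, but it does not tell me whether anything is actually leaking.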

I found Detecting Memory Leaks and Buffer Overflows in MXNet, and after managing to build it as described in the section "Using ASAN builds with MXNet" (replacing the "ubuntu_cpu" part of docker/Dockerfile.build.ubuntu_cpu -t mxnetci/build.ubuntu_cpu with "ubuntu_cpu_python"), I tried executing this in an AWS SageMaker notebook:

root@33e38e00f825:/work/mxnet# nosetests3 --verbose /home/ec2-user/SageMaker/run_predict.py

and I get this import error:

Failure: ImportError (No module named 'run_predict') ... ERROR

My run_predict.py looks like this:

#!/usr/bin/env python
def run_predict(n):
    # calling MXNet inference method
    pass

run_predict(-1)  # also tried putting this call under 'if __name__ == "__main__":'

What am I missing in my script, and what should I change?

The example script used in the link is rnn_test.py, but even when I run that example, I get an analogous ImportError.


Solution

  • In MXNet, we automatically test for this by examining the garbage collection records. You can find how it is implemented here: https://github.com/apache/incubator-mxnet/blob/c3aff732371d6177e5d522c052fb7258978d8ce4/tests/python/conftest.py#L26-L79. A simplified sketch of that kind of check is shown below.
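For illustration, here is a minimal sketch of a gc-based leak check in the same spirit (the fixture name, the NDArray type check, and the use of pytest are assumptions for this example; the linked conftest.py is the actual implementation):

import gc

import mxnet as mx
import pytest

@pytest.fixture(autouse=True)
def check_ndarray_leak():
    # Count live NDArray objects before the test runs
    gc.collect()
    before = sum(isinstance(obj, mx.nd.NDArray) for obj in gc.get_objects())
    yield
    # Count again after the test and flag any growth as a possible leak
    gc.collect()
    after = sum(isinstance(obj, mx.nd.NDArray) for obj in gc.get_objects())
    assert after <= before, "possible NDArray leak: %d new arrays still alive" % (after - before)

Because the fixture is autouse, every test in the suite gets the before/after comparison without any extra code in the test itself.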