Tags: python, pandas, out-of-memory, kedro

Kedro - Memory management


I am working on a Kedro 0.17.2 project that is running into out-of-memory issues, and I'm trying to reduce its memory footprint.

I'm profiling with mprof from the memory-profiler library, and I noticed that there is always a child process present and that memory seems to duplicate in the main process after the first computation in the running node. Is it possible that Kedro is duplicating the dataframes in memory? And, if so, is there a way to avoid this?
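As a sanity check, the sampling behaviour can be observed outside Kedro as well by letting memory_usage from memory-profiler run a pandas-heavy function directly; the sketch below is only an illustration, and build_frame is a hypothetical stand-in for a node's computation, not code from my pipeline:

    # Minimal standalone check (outside Kedro): let memory_usage sample a
    # pandas-heavy function and inspect the recorded samples.
    # build_frame is a hypothetical stand-in for a node's computation.
    import numpy as np
    import pandas as pd
    from memory_profiler import memory_usage


    def build_frame() -> pd.DataFrame:
        # ~800 MB of float64 data, large enough to stand out in the samples.
        return pd.DataFrame(np.random.rand(10_000_000, 10))


    if __name__ == "__main__":
        # memory_usage runs the callable in this process and samples its memory
        # every `interval` seconds; include_children also counts child processes.
        samples = memory_usage(
            (build_frame, (), {}),
            interval=0.5,
            include_children=True,
        )
        print(f"peak: {max(samples):.1f} MiB, final: {samples[-1]:.1f} MiB")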

Notes:

  • I'm using the SequentialRunner
  • I'm not using the is_async CLI option
  • I'm not using either multithreading or multiprocessing in the node execution

[mprof plot: a sampling child process alongside the main process, whose memory increases after the first computation in the node]


Solution

  • It turns out that this issue is caused by a possible bug in the memory-profiler library, which is used by the kedro.extras.decorators.memory_profiler.mem_profile decorator.

    The Kedro decorator makes use of the memory_usage function from the memory_profiler module, which samples the total memory used by the running function from within the Python process.
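    In rough terms, the decorator wraps each node function and lets memory_usage execute it while a monitoring child process records memory samples, then logs the peak. The sketch below is a simplified illustration of that pattern, not the exact Kedro implementation:

        # Simplified sketch of a mem_profile-style decorator (not the exact
        # Kedro source): memory_usage executes the wrapped function, samples
        # memory while it runs, and returns the samples plus the result.
        import logging
        from functools import wraps

        from memory_profiler import memory_usage

        logger = logging.getLogger(__name__)


        def mem_profile(func):
            @wraps(func)
            def wrapper(*args, **kwargs):
                # retval=True -> (samples, result); include_children adds any
                # child processes to the sampled total.
                samples, result = memory_usage(
                    (func, args, kwargs),
                    interval=0.1,
                    retval=True,
                    include_children=True,
                )
                logger.info(
                    "Running %r consumed %.2f MiB at peak", func.__name__, max(samples)
                )
                return result

            return wrapper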

    There is an open issue about this problem, but no solution yet: https://github.com/pythonprofilers/memory_profiler/issues/332

    For the moment I have just removed the decorator.
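    Concretely, removing it just means dropping the decorator line from the node function; preprocess_companies below is a hypothetical node, not code from my pipeline:

        # Hypothetical node function: deleting the decorator line is all that
        # "removing the decorator" amounts to; the node logic is untouched.
        import pandas as pd

        # from kedro.extras.decorators.memory_profiler import mem_profile


        # @mem_profile  # removed: this was the only place the profiler was attached
        def preprocess_companies(companies: pd.DataFrame) -> pd.DataFrame:
            # Plain pandas transformation, now running without memory sampling.
            return companies.dropna()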