
How do you determine memory stats while using rapids.ai?


I'm using the Python libraries from rapids.ai, and one of the key things I'm starting to wonder is: how do I inspect memory allocation programmatically? I know I can use nvidia-smi to look at some overall high-level stats, but specifically I would like to know:

1) Is there an easy way to find the memory footprint of a cudf DataFrame (and other RAPIDS objects)?

2) Is there a way for me to determine the available device memory?

I'm sure there are plenty of ways for a C++ programmer to get these details but I'm hoping to find an answer that allows me to stay in Python.


Solution

    1) Usage

    All cudf objects should have the .memory_usage() method:

    import cudf

    x = cudf.DataFrame({'x': [1, 2, 3]})
    # Per-column usage in bytes, including the index
    x_usage = x.memory_usage(deep=True)
    print(x_usage)
    

    Out:

    x        24
    Index     0
    dtype: int64
    

    These values reflect GPU memory used.
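
    If you want the footprint of the whole DataFrame as a single number, you can sum the per-column values (a minimal sketch; summing the Series returned by memory_usage() follows the same pattern as pandas):

    import cudf

    df = cudf.DataFrame({'x': [1, 2, 3], 'y': [1.0, 2.0, 3.0]})

    # Total bytes on device: sum the per-column counts (index included)
    total_bytes = df.memory_usage(deep=True).sum()
    print(total_bytes)  # 48: three int64 values plus three float64 values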

    2) Remaining

    You can read the remaining available GPU memory with pynvml:

    import pynvml

    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # Need to specify which GPU by index
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
    print((mem.free, mem.used, mem.total))

    Out:

    (33500299264, 557973504, 34058272768)
    
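    Since pynvml reports raw byte counts, a small helper that converts them to GiB can be handy (a minimal sketch; free_gpu_mem_gib is a hypothetical name, not part of pynvml):

    import pynvml

    def free_gpu_mem_gib(device_index=0):
        # Hypothetical helper: report free memory on one GPU in GiB
        pynvml.nvmlInit()
        handle = pynvml.nvmlDeviceGetHandleByIndex(device_index)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        pynvml.nvmlShutdown()
        return mem.free / 1024**3

    print(f"{free_gpu_mem_gib():.1f} GiB free")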

    Most GPU operations require a scratch buffer that is O(N), so you may run into RMM_OUT_OF_MEMORY errors if you end up with DataFrames or Series that are larger than your remaining available memory.
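
    Putting the two pieces together, you can sanity-check a DataFrame's footprint against free device memory before a large operation (a minimal sketch; the 2x headroom factor is an assumption based on the O(N) scratch buffer above, not a documented requirement):

    import cudf
    import numpy as np
    import pynvml

    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)

    df = cudf.DataFrame({'x': np.arange(1_000_000)})

    # Bytes the DataFrame occupies versus bytes still free on the device
    needed = int(df.memory_usage(deep=True).sum())
    free = pynvml.nvmlDeviceGetMemoryInfo(handle).free

    # Assumed headroom: operations may need O(N) scratch space,
    # so require roughly 2x the DataFrame's size to be free
    if free < 2 * needed:
        print("Warning: this operation may raise RMM_OUT_OF_MEMORY")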