Tags: python, python-3.x, numpy, shared-memory, sysv

Check if Numpy Array is Stored in Shared Memory


In Python 3.8+, is it possible to check whether a numpy array is being stored in shared memory?

In the following example, a NumPy array sharedArr is created on the buffer of a multiprocessing.shared_memory.SharedMemory object. I would like to know whether we can write a function that detects whether an array is backed by SharedMemory.

import numpy as np
from multiprocessing import shared_memory

if __name__ == '__main__':
    # Create NumPy array `sharedArr` in shared memory
    arr = np.zeros(5)
    shm = shared_memory.SharedMemory(create=True, size=arr.nbytes)
    sharedArr = np.ndarray(arr.shape, dtype=arr.dtype, buffer=shm.buf)
    sharedArr[:] = arr[:]

    # How to tell if numpy array is stored in shared memory?
    print(type(sharedArr))      # <class 'numpy.ndarray'>
    print(hex(id(sharedArr)))   # 0x7fac99469f30

    shm.close()
    shm.unlink()

Solution

  • In this particular case, you can use the base attribute of the shared array. The attribute is a reference to the underlying object from which the array derives its memory. It is None for most arrays, which indicates that the array owns its data. When I run this code on my machine, the array's base is an mmap object:

    >>> sharedArr.base
    <mmap.mmap at 0x11a4aa670>
    

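    If you still have a reference to the shared memory object from which the array was allocated, you can compare the array's base to the shared memory segment's memory map. Note that _mmap is a private attribute of SharedMemory, so this relies on an implementation detail: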

    >>> sharedArr.base is shm._mmap
    True
    

    If you don't have the shm object available, as would be the case in a standalone function meant to perform this check, I doubt there's a portable and foolproof way to do it.

    Since NumPy provides its own memory-map class (np.memmap), it may suffice in your case to check only the type of the array's base. That is, make the assumption that if the array is backed by a vanilla, builtin Python memory map (mmap.mmap), it is allocated from shared memory:

    import mmap
    
    def array_is_from_shared_memory(arr):
        return isinstance(arr.base, mmap.mmap)
    

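    This works in your particular example, but it is only a heuristic. Any array built directly on an mmap.mmap, including a file-backed one, passes the check, while a slice of the shared array fails it, because the slice's base is the parent array rather than the map. Be careful with it, clearly document the assumptions it makes, and test that it gives you the information you actually need in your application.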
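
One gap in the isinstance check is views: a slice such as sharedArr[1:3] has the parent array as its base, not the mmap. Below is a minimal sketch that follows the base chain until it reaches a non-array object. It also includes a second helper for the case where you do have the segment, which compares memory ranges with np.shares_memory instead of reaching into the private shm._mmap. The names looks_like_shared_memory and array_overlaps_buffer are hypothetical, and treating np.memmap arrays as "not shared memory" is my own assumption. The first helper is still a heuristic, not a guarantee.

```python
import mmap

import numpy as np


def looks_like_shared_memory(arr):
    """Heuristic (hypothetical helper): is `arr`, or an array it views, built on an mmap.mmap?"""
    obj = arr
    # Follow `.base` through any intermediate arrays; a slice's base is
    # another ndarray, not the memory map itself.
    while isinstance(obj, np.ndarray):
        # Assumption: NumPy's own file-backed maps do not count as shared memory.
        if isinstance(obj, np.memmap):
            return False
        obj = obj.base
    # In case the chain ends in a memoryview (an assumption about other
    # construction paths), look at the object it wraps.
    if isinstance(obj, memoryview):
        obj = obj.obj
    return isinstance(obj, mmap.mmap)


def array_overlaps_buffer(arr, buf):
    """Hypothetical helper: does `arr` share memory with `buf` (e.g. `shm.buf`)?"""
    # View the whole buffer as raw bytes, then let NumPy compare memory ranges.
    nbytes = memoryview(buf).nbytes
    probe = np.ndarray((nbytes,), dtype=np.uint8, buffer=buf)
    return bool(np.shares_memory(arr, probe))
```

With the objects from the question, looks_like_shared_memory(sharedArr[1:3]) and array_overlaps_buffer(sharedArr, shm.buf) should both return True, while a plain np.zeros(5) fails both checks. The second helper needs the segment, but it works on addresses rather than object identity, so it does not care how many views sit between the array and the map.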