Sharing bytearray between processes with multiprocessing manager


I am trying to share a bytearray between processes, but I could not find a bytearray type among the types that multiprocessing.Manager provides. What other options are there for sharing a bytearray between processes? I know I can convert it to a string and pass it to Manager.Value("",""), but that is not an efficient way to do it.


Solution

  • I have not looked into bytearray specifically, so there may be another, more efficient approach. But to create a managed object of ANY datatype you want, as long as its attributes are all picklable, you can follow this sample (just substitute bytearray with the class whose objects you want to manage):

    from multiprocessing.managers import NamespaceProxy, BaseManager
    import inspect
    
    
    class ObjProxy(NamespaceProxy):
        """Returns a proxy instance for any user-defined datatype. The proxy instance will have the namespace and
        functions of the datatype (except private/protected callables/attributes). Furthermore, the proxy will be
        picklable and its state can be shared among different processes. """
    
        @classmethod
        def populate_obj_attributes(cls, real_cls):
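            # Names already defined on the proxy class must not be overridden,
            # otherwise the proxy machinery itself would break.
            DISALLOWED = set(dir(cls))
            # Exceptions: these dunder methods are forwarded to the managed object anyway.
            ALLOWED = ['__sizeof__', '__eq__', '__ne__', '__le__', '__repr__', '__dict__', '__lt__',
                       '__gt__']
            DISALLOWED.add('__class__')
            new_dict = {}
            # Wrap every remaining callable of the real class in a forwarding function.
            for (attr, value) in inspect.getmembers(real_cls, callable):
                if attr not in DISALLOWED or attr in ALLOWED:
                    new_dict[attr] = cls._proxy_wrap(attr)
            return new_dict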
    
        @staticmethod
        def _proxy_wrap(attr):
            """Creates a function that calls the proxied object's method."""
    
            def f(self, *args, **kwargs):
                return self._callmethod(attr, args, kwargs)
    
            return f
    
    attributes = ObjProxy.populate_obj_attributes(bytearray)
    bytearrayProxy = type("bytearrayProxy", (ObjProxy,), attributes)
    
    
    
    if __name__ == "__main__":
        BaseManager.register('bytearray', bytearray, bytearrayProxy, exposed=tuple(dir(bytearrayProxy)))
        manager = BaseManager()
        manager.start()
    
    
        a = [2, 3, 4, 5, 7]
        obj = manager.bytearray(a)
        print(obj)
    

    Output

    bytearray(b'\x02\x03\x04\x05\x07')
    

    This creates a near-perfect proxy for an object of the datatype you want (bytearray in this case) by wrapping all of its public methods, including the special dunder methods if the class uses any. You can now pass obj to any other process. The data itself lives in the manager's server process, and every process that holds the proxy reads and modifies that same object through it.

    If you want more details as to how this proxy works, I wrote a detailed answer here
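
To make the "pass obj to any other process" step concrete, here is a minimal sketch, assuming the same proxy approach as above in a condensed form. The names worker and share_between_processes are hypothetical helpers added for illustration only. The child process mutates the managed bytearray through the proxy, and the parent then reads the result back as a plain copy.

```python
from multiprocessing import Process
from multiprocessing.managers import BaseManager, NamespaceProxy
import inspect


class ObjProxy(NamespaceProxy):
    """Condensed version of the proxy above: forwards callables to the managed object."""

    @classmethod
    def populate_obj_attributes(cls, real_cls):
        disallowed = set(dir(cls)) | {"__class__"}
        allowed = {"__sizeof__", "__eq__", "__ne__", "__le__", "__repr__",
                   "__dict__", "__lt__", "__gt__"}
        return {
            attr: cls._proxy_wrap(attr)
            for attr, _ in inspect.getmembers(real_cls, callable)
            if attr not in disallowed or attr in allowed
        }

    @staticmethod
    def _proxy_wrap(attr):
        def f(self, *args, **kwargs):
            # Forward the call to the real object in the manager process.
            return self._callmethod(attr, args, kwargs)
        return f


bytearrayProxy = type("bytearrayProxy", (ObjProxy,),
                      ObjProxy.populate_obj_attributes(bytearray))
BaseManager.register("bytearray", bytearray, bytearrayProxy,
                     exposed=tuple(dir(bytearrayProxy)))


def worker(shared):
    # Hypothetical child task: both operations run on the single
    # bytearray owned by the manager process, not on a local copy.
    shared.append(9)
    shared[0] = 255


def share_between_processes():
    # Hypothetical helper: start a manager, let a child mutate the data,
    # then return a plain local copy of the managed bytearray.
    with BaseManager() as manager:
        shared = manager.bytearray([2, 3, 4, 5, 7])
        child = Process(target=worker, args=(shared,))
        child.start()
        child.join()
        return shared.copy()


if __name__ == "__main__":
    print(share_between_processes())
```

After the child finishes, the parent should see the byte values 255, 3, 4, 5, 7, 9. Note that every method call on the proxy is a round trip to the manager process, so this is convenient rather than fast. For large buffers that are accessed often, a different mechanism may be a better fit.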