I have a large bytes object that I'd like to put in SharedMemory so that my multiprocessing tasks can access it. I'm using ShareableList as described in the docs.
import sys
from multiprocessing import shared_memory
from sense2vec import Sense2Vec

s2v_a = Sense2Vec().from_disk(SENSE2VEC_FOLDER)
s2v_a_bytes = s2v_a.to_bytes()
print(sys.getsizeof(s2v_a_bytes))  # prints 4220733334 (4.2 GB)
print(type(s2v_a_bytes))  # prints <class 'bytes'>
memory = shared_memory.ShareableList([s2v_a_bytes])
However, when I try to create the ShareableList I get an AssertionError from a check that each item's struct format string is no longer than 8 characters. I can see that this has something to do with the struct packing format.
Traceback (most recent call last):
File "/home/user/anaconda3/envs/uniqueness/lib/python3.8/runpy.py", line 193, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/user/anaconda3/envs/uniqueness/lib/python3.8/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/user/dev/uniqueness/backend/app/coding.py", line 39, in <module>
memory = shared_memory.ShareableList([s2v_a_bytes])
File "/home/user/anaconda3/envs/uniqueness/lib/python3.8/multiprocessing/shared_memory.py", line 295, in __init__
assert sum(len(fmt) <= 8 for fmt in _formats) == self._list_len
AssertionError
Comment from the code:
Because values are packed into a memoryview as bytes, the struct
packing format for any storable value must require no more than 8
characters to describe its format.
But I've not done anything different to the docs, as far as I can see.
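The docs state that bytes and str values in a ShareableList must each be less than 10M bytes. 4.2 GB is far over this limit. The length of a bytes or str item is embedded in its struct format (for example, "104s"), so a multi-gigabyte value needs a format string longer than 8 characters, and the assertion fails.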
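One possible workaround is to skip ShareableList and copy the bytes into a plain SharedMemory block, which does not have the per-item limit. The following is a minimal sketch, not a tested solution for the Sense2Vec case: the helper names put_in_shared_memory, read_from_shared_memory and worker are made up for illustration, and a small dummy payload stands in for the 4.2 GB object.

```python
from multiprocessing import Process, shared_memory


def put_in_shared_memory(data: bytes) -> shared_memory.SharedMemory:
    """Copy `data` (must be non-empty) into a new shared memory block."""
    shm = shared_memory.SharedMemory(create=True, size=len(data))
    shm.buf[:len(data)] = data
    return shm


def read_from_shared_memory(name: str, size: int) -> bytes:
    """Attach to an existing block by name and copy out `size` bytes.

    The size is passed explicitly because the block may be rounded up
    to a multiple of the platform's page size.
    """
    shm = shared_memory.SharedMemory(name=name)
    try:
        return bytes(shm.buf[:size])
    finally:
        shm.close()


def worker(name: str, size: int) -> None:
    # Hypothetical task: each process attaches by name and reads the payload.
    payload = read_from_shared_memory(name, size)
    print(len(payload))


if __name__ == "__main__":
    data = b"\x00" * 1024  # stand-in for the large bytes object
    shm = put_in_shared_memory(data)
    try:
        p = Process(target=worker, args=(shm.name, len(data)))
        p.start()
        p.join()
    finally:
        shm.close()
        shm.unlink()  # release the block once every process is done with it
```

Note that bytes(shm.buf[:size]) makes a private copy in each worker, which defeats the purpose for a 4.2 GB payload. If the consumer can work on a memoryview, reading from shm.buf directly avoids the copy; whether Sense2Vec can load from a memoryview is not something this sketch assumes.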