
How to convert list of lists of ints into a list of bytes without running out of memory in python3?


The background is a bit complicated. For testing we have a 1.2 GB .zip file (it could be bigger in production) that we split into chunks and send through multiple Onionshare instances. Once the receiver has all the chunks, it places them into a list called content. Since each chunk is a list of ints (not sure why Onionshare transfers ints rather than bytes), content is a list of lists of ints. However, to save it down properly we need to (as far as I know) convert it to a list of bytes objects and write them out with b''.join. Here is the code we have:

#Write total content to image.zip
#content is a list of lists of int, but saved value must be a simple list of bytes
#This is what I came up with to convert it to a list of bytes
content2 = []
for i in range (0, threads):
    for j in range(0, len(content[i])):
        content2.append(bytes(content[i][j]))
#And now it can be saved with a join
open("image.zip", "wb").write(b''.join(content2))    

I'm pretty sure this should work, but when we run it the process is killed for using too much memory. We tried writing the file in increments rather than all at once, but hit the same issue, so I think the memory blow-up is happening during the conversion itself. Can anyone suggest a better way to do this?
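One detail worth noting about the snippet above: in Python 3, calling bytes() on a single int n does not produce a one-byte value, it allocates a zero-filled buffer of length n, which likely explains the memory blow-up. A quick demonstration:

```python
# bytes() behaves very differently for an int vs. a list of ints:
# an int argument allocates that many zero bytes, while a list of
# small ints (0-255) is converted value by value.
single = bytes(5)       # zero-filled buffer of length 5
from_list = bytes([5])  # one byte with value 5

print(single)     # b'\x00\x00\x00\x00\x00'
print(from_list)  # b'\x05'
```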

Thanks


Solution

  • You don't need to convert the entire collection of data at once. Each element of content is already one complete chunk (a list of ints in the range 0-255), so it can be converted with a single bytes() call and written out immediately:

    with open("image.zip", "wb") as f:
        for chunk in content:
            f.write(bytes(chunk))


    The only memory overhead here is the bytes copy of a single chunk, and the with statement guarantees the file is flushed and closed. (Your original code's bytes(content[i][j]) call converts one int at a time; bytes() on a single int n allocates n zero bytes, which is both incorrect and the main source of the memory blow-up.)
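As a minimal, self-contained sketch of this approach (the content values and the temporary path are invented purely for illustration):

```python
import os
import tempfile

# Stand-in for the received data: each inner list is one chunk of
# ints in the range 0-255, as described in the question (80 and 75
# are the byte values of "P" and "K").
content = [[80, 75, 3, 4], [1, 2, 3]]

# Write to a temporary location instead of "image.zip".
path = os.path.join(tempfile.mkdtemp(), "image.zip")

with open(path, "wb") as f:
    for chunk in content:
        # bytes() on a list of small ints converts the whole chunk
        # in one call; only one chunk's copy is alive at a time.
        f.write(bytes(chunk))

# Read it back to confirm the chunks were concatenated in order.
with open(path, "rb") as f:
    data = f.read()
print(data)  # b'PK\x03\x04\x01\x02\x03'
```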