I have 2 GB of data in memory (for example data = b'ab' * 1000000000) that I would like to write to an encrypted ZIP or 7Z file.
How can I do this without writing the data to a temporary on-disk file?
Is it possible with only Python built-in tools (+ optionally 7z)?
I've already looked at this:
ZipFile.writestr writes from an in-memory string/bytes, which is good, but ZipFile.setpassword only applies to reading, not writing.
How to create an encrypted ZIP file?: most answers use a file as input (and cannot work with in-memory data), especially the solutions with pyminizip and those with:
subprocess.call(['7z', 'a', '-mem=AES256', '-pP4$$W0rd', '-y', 'myarchive.zip']...
Other solutions require trusting a third-party tool's implementation of cryptography (see comments), so I would like to avoid them.
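To illustrate the setpassword limitation, here is a minimal sketch using only the standard library. The small payload and the member name data.dat are stand-ins chosen for the example, not part of the real 2 GB case. It sets a password before writing, then checks the entry's encryption flag and reads the entry back without supplying any password.

```python
import io
import zipfile

data = b"ab" * 1000  # small stand-in for the 2 GB payload

# Write an archive entirely in memory, calling setpassword() before writing.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.setpassword(b"P4$$W0rd")  # only sets the default password for extraction
    zf.writestr("data.dat", data)

# Reopen the archive and inspect the entry without giving a password.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    info = zf.getinfo("data.dat")
    encrypted = bool(info.flag_bits & 0x1)  # bit 0 of the flags marks an encrypted entry
    roundtrip = zf.read("data.dat")         # succeeds with no password

print("encrypted:", encrypted)
print("readable without password:", roundtrip == data)
```

So writestr solves the in-memory part, but the resulting entry is stored unencrypted.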
7z.exe has the -si flag, which lets it read the data for a file from stdin. This way you can still drive 7z's command line from a subprocess, even without an extra file:
from subprocess import Popen, PIPE

# inputs
szip_exe = r"C:\Program Files\7-Zip\7z.exe"  # ... get from registry maybe
memfiles = {"data.dat": b'ab' * 1000000000}
arch_filename = "myarchive.zip"
arch_password = "Swordfish"

for filename, data in memfiles.items():
    # -si<name> makes 7z read the content of archive member <name> from stdin
    args = [szip_exe, "a", "-mem=AES256", "-y", "-p{}".format(arch_password),
            "-si{}".format(filename), arch_filename]
    proc = Popen(args, stdin=PIPE, stdout=PIPE, stderr=PIPE)
    proc.stdin.write(data)
    proc.stdin.close()  # causes 7z to terminate
    # proc.stdin.flush()  # instead of close() when on Mac, see comments
    proc.communicate()  # wait till it actually has terminated
The write() call takes a little over 40 seconds on my machine, which is quite a lot. I can't say whether that is due to inefficiency in piping all the data through stdin or whether it is simply how long compressing and encrypting a 2 GB file takes. EDIT: Packing the file from the HDD took 47 seconds on my machine, which points to the latter.
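If you want to look at the piping cost yourself, one option is to feed stdin in fixed-size chunks and time the writes. The following is only a sketch: build_7z_args and write_in_chunks are hypothetical helper names, the 1 MiB chunk size is an arbitrary assumption, and the demo writes into an io.BytesIO so that it runs without 7z installed. The 7z flags are the same ones used above.

```python
import io
import time

def build_7z_args(exe, archive, member_name, password):
    """Assemble the 7z command line used above for one in-memory member."""
    return [exe, "a", "-mem=AES256", "-y",
            "-p{}".format(password), "-si{}".format(member_name), archive]

def write_in_chunks(stream, data, chunk_size=1 << 20):
    """Write data to stream chunk by chunk; return (bytes_written, seconds)."""
    view = memoryview(data)  # slicing a memoryview does not copy the payload
    written = 0
    start = time.perf_counter()
    for offset in range(0, len(view), chunk_size):
        chunk = view[offset:offset + chunk_size]
        stream.write(chunk)
        written += len(chunk)
    return written, time.perf_counter() - start

# Demo against an in-memory sink; with 7z you would pass proc.stdin instead.
sink = io.BytesIO()
payload = b"ab" * 1000
count, seconds = write_in_chunks(sink, payload, chunk_size=64)
print(count, "bytes written in", round(seconds, 6), "s")
print(build_7z_args("7z", "myarchive.zip", "data.dat", "Swordfish"))
```

With a real process you would create it with Popen(build_7z_args(...), stdin=PIPE), call write_in_chunks(proc.stdin, data), then close stdin and wait as before. Note that the measured time still includes 7z's own work, since the pipe blocks whenever 7z has not yet consumed the previous chunks.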