python excel bash unix xlsx

Add a password (unattended) to an existing xlsx without Windows-exclusive tools


I'm generating an xlsx file using openpyxl, and I'd like to protect the workbook itself with a password that I have as a variable in the same script. In Excel this can be set manually via File > Passwords... by filling in "Password to open".

Openpyxl only seems to offer sheet-based edit protection through ws.protection.set_password("mypassword") (where ws is an open worksheet).

I can't find the exact source, but somewhere I read that xlsx files are basically zip archives. That seemed true when I ran commands like unzip -t and 7z x, but adding a password with utilities like 7z or zipcloak completely breaks the file once it's put back together.

 % 7z x ../sample.xlsx .

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=utf8,Utf16=on,HugeFiles=on,64 bits,4 CPUs x64)

Scanning the drive for archives:
1 file, 98370 bytes (97 KiB)

Extracting archive: ../sample.xlsx
--
Path = ../sample.xlsx
Type = zip
Physical Size = 98370


No files to process
Everything is Ok

Files: 0
Size:       0
Compressed: 98370
 % 7z a -pmypassword sample.xlsx

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=utf8,Utf16=on,HugeFiles=on,64 bits,4 CPUs x64)

Scanning the drive:
1 file, 6148 bytes (7 KiB)

Creating archive: sample.xlsx

Items to compress: 1


Files read from disk: 1
Archive size: 367 bytes (1 KiB)
Everything is Ok
 % open sample.xlsx

When opened with Excel:

Excel cannot open the file 'sample.xlsx' because the file format or file extension is not valid. Verify that the file has not been corrupted and that the file extension matches the format of the file.

Note the result is the same no matter which type I use with 7z, and the same with zipcloak too.
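
A likely reason zip-level passwords never work here: a file saved with Excel's "Password to open" is no longer a zip archive at all. The encrypted package is wrapped in an OLE compound file, so Excel does not treat an encrypted zip as a protected workbook. The sketch below uses only the standard library to tell the two containers apart by their leading bytes; the function names are made up for illustration.

```python
import zipfile

# Leading bytes of each container type.
ZIP_MAGIC = b"PK\x03\x04"                       # ordinary zip local file header
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # OLE compound file header


def container_type(path):
    """Return 'zip', 'ole', or 'unknown' based on the file's first bytes."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(OLE_MAGIC):
        return "ole"
    if head.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def is_ooxml_zip(path):
    """True if the file is a zip that contains the [Content_Types].xml part.

    This only checks the archive layout; it does not detect whether
    individual members were encrypted by a zip tool.
    """
    if container_type(path) != "zip":
        return False
    with zipfile.ZipFile(path) as zf:
        return "[Content_Types].xml" in zf.namelist()
```

An unprotected workbook written by openpyxl should report 'zip', while a workbook saved from Excel with a password to open should report 'ole'.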

So far I've looked at my options using Bash and Python, and they seem pretty dire, but I'm open to anything. The machines I'm doing this on run OS X and Debian.


Solution

  • What you're asking for isn't currently available in any Python package. The best you can probably do for now is to install a package implemented in some other language, and call that package from Python (using os.system() or the subprocess module or something along those lines).

    The two that I know of are secure-spreadsheet and xlsx-populate; secure-spreadsheet is basically a command-line wrapper for xlsx-populate.

    It seems like you want to be able to do this without having Excel installed, but for completeness I'll mention that if you do have Excel installed, then another way to do this is to automate Excel itself, which can be done in Python using xlwings, or the underlying packages that it depends upon: pywin32 on Windows or appscript on Mac.
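
For the first suggestion (calling a non-Python tool through the subprocess module), a minimal sketch might look like the following. It assumes secure-spreadsheet is installed and on the PATH, that it reads the workbook on stdin and writes the encrypted file to stdout, and that it accepts --password and --input-format flags. Those flag names are an assumption based on my reading of its README, so check them against the tool's own help output. The helper names build_command and protect_xlsx are hypothetical.

```python
import subprocess


def build_command(password):
    """Build the argument list for the external tool.

    ASSUMPTION: the flag names below are not verified here; confirm them
    against the secure-spreadsheet documentation before relying on this.
    """
    return ["secure-spreadsheet", "--password", password, "--input-format", "xlsx"]


def protect_xlsx(src, dst, password):
    """Pipe src through the external tool and write the result to dst."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        # check=True raises CalledProcessError if the tool exits non-zero.
        subprocess.run(build_command(password), stdin=fin, stdout=fout, check=True)
```

Usage would be along the lines of protect_xlsx("sample.xlsx", "sample-protected.xlsx", my_password). Passing an argument list instead of a shell string avoids quoting problems with the password, but the password is still visible in the process list while the command runs.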