Search code examples
python-3.xpandascsvglob

Error Looping Through Password Protected XLSB and XLSM Files in Python


The below script works for .xlsm and .xlsb files until I try accounting for Excel password protected files. This is an existing process I'm trying to improve and analyze - the sheets I'll be receiving are all password protected as the first 5 characters of the file name, and are all .xlsb or .xlsm.

#Take xlsb Binary files, convert to csv, concatenate, keep file name
import pandas as pd
import os, glob, win32com.client


path = r'C:\Users\user\Desktop\Test Binary'
all_files_test = glob.glob(os.path.join(path, "*.xlsb"))

for file in all_files_test:
    name1 = os.path.splitext(os.path.split(file)[1])[0]
    name2 = name1[0:5]
    excel = win32com.client.Dispatch("Excel.Application")
    excel.Visible = False
    doc = excel.Workbooks.Open(file,False, False, None, name2)
    doc.Sheets(3).Select #3 for xlsm files - verify
    csv_name = os.path.basename(file).replace('xlsb','csv') #xlsb xlsm
    doc.SaveAs(Filename=os.path.join(path, 'CSV Out', csv_name),FileFormat = 6)
    doc.Close(True)
    excel.Quit()
excel.Quit()

files = glob.glob('c:/Users/user/Desktop/Test Binary/CSV Out/*.csv')

dfs = [pd.read_csv(fp).assign(orig_file_name=os.path.basename(fp)) for fp in files]
df = pd.concat(dfs, ignore_index=True)

df.head(10)
df.to_csv('c:/Users/user/Desktop/Test Binary/CSV Out/concat.csv',sep=',')
print("Done, check concatenated file.")

The script works through the first file but when it gets to the second file it errs with:

Traceback (most recent call last):
  File "C:\Users\user\Desktop\xlsb_csv_concat.py", line 28, in <module>
    doc.Sheets(3).Select #3 for xlsm files - verify
AttributeError: 'NoneType' object has no attribute 'Sheets'

I'm sure there are more efficient ways to do what I've written but it works well enough for this specific task, until passwords show up. Thanks in advance for your help.


Solution

  • Just in case anyone stumbles across a similar situation:

    Changing:

    doc = excel.Workbooks.Open(file, False, False, None, name2)
    

    to

    doc = excel.Workbooks.Open(file, False, True, None, name2)
    

    worked for me.