Search code examples
pythonbatch-fileunzip

Batch file to unzip files and run Python script


I have a project where I need to unzip files containing a .xls file and then run a Python script to convert the file to .xlsx. I need to do this by executing a Windows Batch file through the Windows Task Scheduler. I found some code on Superuser to unzip the file, but it does not loop and you have to explicitly type the name of the file you want to unzip.

Here's the unzip code:

@echo off
setlocal
cd /d %~dp0
Call :UnZipFile "C:\Test Folder\" "C:\Test Folder\File1.zip"
exit /b

:UnZipFile <ExtractTo> <newzipfile>
IF "%~dp0"=="" GOTO Continue
set vbs="%temp%\_.vbs"
if exist %vbs% del /f /q %vbs%
>%vbs%  echo Set fso = CreateObject("Scripting.FileSystemObject")
>>%vbs% echo If NOT fso.FolderExists(%1) Then
>>%vbs% echo fso.CreateFolder(%1)
>>%vbs% echo End If
>>%vbs% echo set objShell = CreateObject("Shell.Application")
>>%vbs% echo set FilesInZip=objShell.NameSpace(%2).items
>>%vbs% echo objShell.NameSpace(%1).CopyHere(FilesInZip)
>>%vbs% echo Set fso = Nothing
>>%vbs% echo Set objShell = Nothing
cscript //nologo %vbs%
if exist %vbs% del /f /q %vbs%

Here's my Python code:

import win32com.client as win32
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--sourcefilepath')
parser.add_argument('--sourcefilename')
args = parser.parse_args()

fname = r"{sourcefilepath}{sourcefilename}".format(sourcefilepath=args.sourcefilepath,sourcefilename=args.sourcefilename)

excel = win32.gencache.EnsureDispatch('Excel.Application')
wb = excel.Workbooks.Open(fname)
wb.SaveAs(fname + "x", FileFormat = 56)

wb.Close()                                 
excel.Application.Quit()
quit()

Any help will be great.


Solution

  • @echo off
    setlocal
    cd /d "%~dp0" || exit /b 1
    
    for %%A in (*.zip) do (
        call :UnZipFile "%cd%\" "%%~fA"
    
        for %%B in (*.xls) do (
            echo python mypython.py --sourcefilepath "%cd%" --sourcefilename "%%~nxB"
            del "%%~B"
        )
    )
    exit /b
    
    :UnZipFile <ExtractTo> <newzipfile>
    if "%~dp0" == "" exit /b 1
    set "vbs=%temp%\_.vbs"
    if exist "%vbs%" del /f /q "%vbs%"
    
    (
        echo Set fso = CreateObject("Scripting.FileSystemObject"^)
        echo If NOT fso.FolderExists("%~1"^) Then
        echo fso.CreateFolder("%~1"^)
        echo End If
        echo set objShell = CreateObject("Shell.Application"^)
        echo set FilesInZip=objShell.NameSpace("%~2"^).items
        echo objShell.NameSpace("%~1"^).CopyHere FilesInZip, 16
        echo Set fso = Nothing
        echo Set objShell = Nothing
    ) > "%vbs%"
    
    cscript //nologo "%vbs%"
    if exist "%vbs%" del /f /q "%vbs%"
    

    Just some minor fixes to special characters needing escaping in the echo commands. Replaced goto continue with exit /b 1. Added simple for loop to get all zip filenames.

    It will get xls files in the current directory and delete the xls files once the python script has done the SaveAs task.

    Remove echo in front of python command if checked as OK.


    Or try this python code which does the looping, unzipping, and xls to xlsx.

    import win32com.client as win32
    import argparse
    import glob
    import os
    import zipfile
    
    def xls_to_xlsx(filename, filepath=None):
        if not filepath:
            filepath = os.getcwd()
    
        fname = os.path.join(filepath, filename)
    
        excel = win32.gencache.EnsureDispatch('Excel.Application')
        wb = excel.Workbooks.Open(fname)
        wb.SaveAs(fname + "x", FileFormat = 56)
    
        wb.Close()
        excel.Application.Quit()
    
    if __name__ == '__main__':
        for zip_file in glob.iglob('*.zip'):
            with zipfile.ZipFile(zip_file) as r:
                r.extractall()
    
            for xls_file in glob.iglob('*.xls'):
                xls_to_xlsx(xls_file)
                os.remove(xls_file)
    

    It does not set current working directory which Task Scheduler can set it, or you can add it into the python script to set it.