Search code examples
shellcsvawkxlsx

pasting a value to xlsx using shell


I have a template file in xlsx format and I want to paste a dynamic value in one particular cell i.e based on the flow of program the value in that cell will change which in turn changes conditions in xlsx file for a different process.

I have tried codes like

awk -v value=$value -v row=$row -v col=$col 'BEGIN{FS=OFS="@"} NR==row {$col=value}1' file.csv 

but the issue is I cant use this code for xlsx file format. is there any way to do this for xlsx file format, since it's a template file I need to retain xlsx file format.


Solution

  • When I have to extract values from an Excel workbook on my Windows PC I install cygwin and then write a small shell script that does:

    cygstart "/path/to/xls2csv.vbs" 'C:/cygwin64/path/to/bookName.xlsx'
    awk 'whatever' '/path/to/bookName/sheetName.csv'
    

    and the work of extracting every sheet from the workbook as a separate CSV named based on the sheet name suffixed with ".csv" under a common directory named after the workbook is done by this visual basic script:

    $ cat xls2csv.vbs
    csv_format = 6
    
    Dim strFilename
    Dim objFSO
    Set objFSO = CreateObject("scripting.filesystemobject")
    strFilename = objFSO.GetAbsolutePathName(WScript.Arguments(0))
    If objFSO.fileexists(strFilename) Then
      Call Writefile(strFilename)
    Else
      wscript.echo "no such file!"
      wscript.echo strFilename
    End If
    Set objFSO = Nothing
    
    Sub Writefile(ByVal strFilename)
    Dim objExcel
    Dim objWB
    Dim objws
    
    Set objExcel = CreateObject("Excel.Application")
    Set objWB = objExcel.Workbooks.Open(strFilename)
    
    For Each objws In objWB.Sheets
      objws.Copy
      objExcel.ActiveWorkbook.SaveAs objWB.Path & "\" & objws.Name & ".csv", csv_format
      objExcel.ActiveWorkbook.Close False
    Next
    
    objWB.Close False
    objExcel.Quit
    Set objExcel = Nothing
    End Sub
    

    That command would fail given blanks in file or directory names so we need to replace those with, say, underscores. In reality I usually copy the Xls file to a temp directory and give it a temp name before running the above on it so I can run the above on it without affecting the original file and without having to care about the path to the original file. It requires an absolute path to the input Excel workbook.

    You might need to throw a wait and/or sleep in before the awk command to ensure the VB script is done before the awk command runs. My not shown shell code is kinda convoluted testing for the VB script creating then removing tmp files to ensure the VB script is done and looping trying and then killing Excel if it doesn't start or hangs before calling awk - I wrote it a long time ago, it's a mess, and I doubt if it's really necessary or a good approach which is why I'm not including it here.

    To get those values back INTO a multi-sheet workbook you'd have to open any updated/generated CSV with Excel (or copy/paste). There's probably some other VB script could be written to import the CSVs for you just like I export them above but I've never needed that functionality so idk what that'd look like.

    I don't know if you need that though - if your awk script writes CSV then you can just double click on the output .csv and Excel will happily open and display it just like it would any .xls or .xlsx Excel file.

    So, to do what you want, assuming your original content is in "Sheet1" of single-sheet Excel workbook "MyStuff.xlsx" you'd do this from cygwin:

    cygstart "/path/to/xls2csv.vbs" 'C:/cygwin64/path/to/MyStuff.xlsx'
    wait; sleep 10     # or similar
    awk -v value="$value" -v row="$row" -v col="$col" 'BEGIN{FS=OFS=","} NR==row {$col=value}1' '/path/to/MyStuff/Sheet1.csv' > "/tmp/tmp$$" &&
    mv "/tmp/tmp$$" '/path/to/MyStuff/Sheet1.csv'
    

    and then in Windows just double-click on /path/to/MyStuff/Sheet1.csv to open it in Excel (you may need to associate the .csv file suffix with Excel the first time you do that).

    Note that the above will only handle simple CSVs, see What's the most robust way to efficiently parse CSV using awk? for how to robustly handle CSVs with awk in general.