Search code examples
pythonexcelpandasdataframexlwings

How to write a pandas df into Excel using Python xlwings with skip_blanks enabled?


My python script generates a pandas dataframe containing some NaN values:

       A         B         C         D
0.351741  NaN       0.238705  NaN     
0.950817  0.665594  0.671151  NaN     
NaN       0.442725  0.658816  NaN     
0.155604  0.567044  NaN       0.666576
NaN       0.751562  NaN       0.597252
0.577770  NaN       NaN       0.123392

I want to write this df into an excel spreadsheet using xlwings. The important point is that I need to skip the cells set to NaN and preserve the corresponding excel cells. In my case, the cells that I want to preserve can contain live excel formula, values or even empty cells.

If my excel values are:

      A         B         C         D
Excel A1  Excel B1  Excel C1  Excel D1 
Excel A2  Excel B2  Excel C2  Excel D2 
Excel A3  Excel B3  Excel C3  Excel D3 
Excel A4  Excel B4  Excel C4  Excel D4
Excel A5  Excel B5  Excel C5  Excel D5
Excel A6  Excel B6  Excel C6  Excel D6

The final output should be:

       A         B         C         D
0.351741  Excel B1  0.238705  Excel D1
0.950817  0.665594  0.671151  Excel D2
Excel A3  0.442725  0.658816  Excel D3
0.155604  0.567044  Excel C4  0.666576
Excel A5  0.751562  Excel C5  0.597252
0.577770  Excel B6  Excel C6  0.123392

Usually, I write to Excel using a simple xlwings call:

r = 'A1:D6'
sht.range(r).options(index=False, header=False).value = df

With this call, the NaN would overwrite the excel cell by an empty cell. Loosing the values that I want to preserve. In our previous example, it means that I would obtain this in Excel:

       A         B         C         D
0.351741  BLANK     0.238705  BLANK   
0.950817  0.665594  0.671151  BLANK   
BLANK     0.442725  0.658816  BLANK   
0.155604  0.567044  BLANK     0.666576
BLANK     0.751562  BLANK     0.597252
0.577770  BLANK     BLANK     0.123392

I checked xlwings documentation and its code. It seems that the parameter skip_blanks is only implemented for the function paste(). It is set to False by default. The only workaround that I currently found is to use a temporary sheet where I save my df. Then, I copy to the clipboard the range where it was saved. Finally, I can paste it where I want using skip_blanks=True obtaining the expected output:

tmp_sht.range(r).options(index=False, header=False).value = df 
tmp_sht.range(r).copy(destination=None)
sht.range(r).paste(paste='values', skip_blanks=True) 

I don't like to use this temporary sheet. I suppose xlwings should be able to handle a direct call using the skip_blanks, something that would look like this:

sht.range(r).options(index=False, header=False, skip_blanks=True).value = df

Is it possible to do it?

Many thanks in advance!


Solution

  • A better way to solve this issue is based on the idea of @mozway to move the whole process to python by retrieving the values we want to preserve and simply combining the two dataframes:

    out = df1.combine_first(df2)
    

    You can retrieve the formulas from the excel file using the formula parameter of a Range. Depending on the types of your values, you could simply replace the NaN like this:

    formulas = sht.range(r).options(pd.DataFrame).formula
    df = df.fillna(formulas)  # replace the `NaN` with the formulas 
    

    This actually didn't work for me due to some dtypes that the function fillna cannot handle. But I adapted it by changing the index and columns of the retrieved df to be able to use combine_first:

    formulas = pd.DataFrame(list(sht.range(r).formula))
    formulas.index = df.index
    formulas.columns = df.columns
    
    df = df.combine_first(formulas)
    sht.range(r).options(index=False, header=False).value = df
    

    By doing this the live excel formula are not overwritten by their values:

           A         B         C         D
    0.351741  Excel B1  0.238705  Excel D1
    0.950817  0.665594  0.671151  Excel D2
    Excel A3  0.442725  0.658816  Excel D3
    0.155604  0.567044  Excel C4  0.666576
    Excel A5  0.751562  Excel C5  0.597252
    0.577770  Excel B6  Excel C6  0.123392