How can I save many dataframes to different existing excel sheets without overwriting other information on those sheets and also change the date format
I tried to save first dataframe
with pd.ExcelWriter('nameofmyexistingexcel.xlsx', mode='a', engine='openpyxl', if_sheet_exists='overlay') as writer:
mydataframe.to_excel(writer, sheet_name="nameofexistingsheet")
(I tried other various options too from pandas doc and from this forum)
It does some interaction with excel, but just breaks excel file , so it can't be open anymore.
When I choose simple option of saving to a new excel, it works fine.
sheet["A1"]
would define a cell object for cell 'A1' (first cell in excel).
To use it you need to define 'sheet' first as a Openpyxl worksheet object like below. Then you can set its attributes like 'number_format'.
import pandas as pd
mydataframe = pd.DataFrame({'Headers': ['Date1', 'Date2'],
'DatesA': ['20.10.2023', '21/10/2023'],
'DatesB': ['20.10.2023', '21/10/2023']
})
mydataframe["DatesA"] = mydataframe["DatesA"].apply(pd.to_datetime, dayfirst=True)
print(mydataframe.dtypes)
with pd.ExcelWriter('nameofmyexistingexcel.xlsx', mode='a', engine='openpyxl', if_sheet_exists='overlay') as writer:
mydataframe.to_excel(writer, sheet_name="nameofexistingsheet")
sheet = writer.sheets['nameofexistingsheet'] # Openpyxl worksheet
# sheet["A1"].number_format = 'dd.mm.yyyy'
for cell in sheet['B']: # Loop column B
if cell.row == 1: # Skip row 1
continue
### Apply number format to all cells in Column B to max used row
cell.number_format = 'dd.mm.yyyy'
UPDATE
I've added the 'mydataframe' dataframe creation to my code example and updated to include two date columns.
Column 'DatesA' is datetime whereas the same dates in Column "DatesB' are default (strings)
DatesA DatesB
0 20.10.2023 20.10.2023
1 21/10/2023 21/10/2023
The data type for the columns is a follows
DatesA datetime64[ns]
DatesB object
Although I didn't include it in the example code all the Dates in columns B and C are formatted as cell.number_format = 'dd.mm.yyyy'
using the column loop.
When the dataframe is written to Excel, the 'DatesA' Column is interpreted as dates by Excel so they take on the format of the date 'dd.mm.yyyy' whether they were that originally or using forward slash as the separator.
The 'DatesB' column is string so its recognised by Excel as a string (you can see the orientation is to the left of the cell) and the number formatting does not change the date '21/10/2023' even though its applied.