I'm making a tool to import a .txt file and export it as a .xslx file using pd.ExcelWriter. For import I use pd.read_csv
which works fine. Export the dataframe as a .xslx works also fine except that the datetimes are not recognized by excel:
csv_file = pd.read_csv(ex_name,sep=';',skip_blank_lines=False,index_col=False,header=None,usecols=[0,1,2,3,4,5,6,7,8,9])
writer = pd.ExcelWriter('output.xlsx',
engine='xlsxwriter',
options={'strings_to_numbers': True},
datetime_format='yyyy-mm-dd hh:mm')
csv_file.to_excel(writer,sheet_name='Sheet1',index=False,header=False)
writer.save()
So I added a few lines before the writer.save() to change the format:
workbook = writer.book
worksheet = writer.sheets['Sheet1']
formatdict = {'num_format':'yyyy-mm-dd hh:mm'}
fmt = workbook.add_format(formatdict)
worksheet.write_datetime('C6:C{0}'.format(int(len(date_export)+5)),date_export),fmt)
Where csv_file
is the imported .txt file, date_export
the date serie I tried to overwrite. But I got an typeerror: Unknown or unsupported datetime type
I also tried to append a new date range created with pd.date_range
:
daterange = pd.date_range(start=date_export[0], periods=len(date_export),
freq='D', normalize=True)
With dtype='datetime64[ns]
but I got the same error.
So exporting using pd.ExcelWriter
works (while using the first code block only) but get excel recognize the datetime doesn't work due to the error in the python console.
FULL ERROR
Traceback (most recent call last):
File "D:\txt_to_excel.py", line 399, in <module>
worksheet.write_datetime('C6:C{0}'.format(int(len(daterange)+5)),daterange)#,fmt)
File "C:\ProgramData\Anaconda3\lib\site-packages\xlsxwriter\worksheet.py", line 69, in cell_wrapper
return method(self, *args, **kwargs)
File "C:\ProgramData\Anaconda3\lib\site-packages\xlsxwriter\worksheet.py", line 770, in write_datetime
return self._write_datetime(row, col, date, cell_format)
File "C:\ProgramData\Anaconda3\lib\site-packages\xlsxwriter\worksheet.py", line 784, in _write_datetime
number = self._convert_date_time(date)
File "C:\ProgramData\Anaconda3\lib\site-packages\xlsxwriter\worksheet.py", line 3892, in _convert_date_time
self.remove_timezone)
File "C:\ProgramData\Anaconda3\lib\site-packages\xlsxwriter\utility.py", line 677, in datetime_to_excel_datetime
raise TypeError("Unknown or unsupported datetime type")
TypeError: Unknown or unsupported datetime type
FIRST 10 LINES INPUT FILE
1;2;3;4;5;6;7;8;9;
LOCATION;FILTER NR;X COORD;Y COORD;SURFACE LEV (m-ref);TOP FILTER (m-ref);BOTTOM FILTER (m-ref);MEAS POINT LEV (m-ref);SED SUMP LENGTH (m);START DATE;
pb 1-1;1;1;1;1;1;1;;;2006-08-17 00:00:00
LOCATION;FILTER NR;DATE;VALUE (m-ref)
pb 1-1;1;2006-08-17 00:00:00;40.384
pb 1-1;1;2006-08-18 00:00:00;40.337
pb 1-1;1;2006-08-19 00:00:00;40.296
pb 1-1;1;2006-08-20 00:00:00;40.321
pb 1-1;1;2006-08-21 00:00:00;40.358000000000004
DESIRED OUTPUT
Is perfect except for the fact that the datetime isn't recognized.
I tried a variety of different Pandas parse_dates
and similar options but I couldn't get them to work with the format of your input data. Perhaps someone else will figure that out and post a working example.
As an alternative here is a version that extends the default Xlsxwriter worksheet.write()
method to convert date-like strings to Excel dates on the fly.
import pandas as pd
from datetime import datetime
ex_name = 'so63718752.txt'
csv_file = pd.read_csv(ex_name, sep=';', skip_blank_lines=False,
index_col=False, header=None,
usecols=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
writer = pd.ExcelWriter('output.xlsx',
engine='xlsxwriter',
options={'strings_to_numbers': True},
datetime_format='yyyy-mm-dd hh:mm')
csv_file.to_excel(writer, sheet_name='Sheet1', index=False, header=False)
# Get the xlsxwriter workbook and worksheet object used by Pandas.
workbook = writer.book
worksheet = writer.sheets['Sheet1']
# Make the date columns wider for clarity in this example.
worksheet.set_column(2, 2, 12, None)
worksheet.set_column(9, 9, 12, None)
# Create a date format to use in Excel.
date_format = workbook.add_format({'num_format': 'yyyy-mm-dd'})
# Create a function that changes the worksheet write() method so
# that strings are evaluated into dates or just handled normally.
def string_to_date(worksheet, row, col, cell, cell_format=None):
try:
# Try convert the cell string to a date.
date_time = datetime.strptime(cell, '%Y-%m-%d %H:%M:%S')
# If it works then write it as an Excel date.
return worksheet.write_datetime(row, col, date_time, date_format)
except:
# If it doesn't work then return control to the calling write() method.
return None
# Add the write() handler/callback to the worksheet.
worksheet.add_write_handler(str, string_to_date)
# Write the file again. This is inefficient but, I think, unavoidable.
csv_file.to_excel(writer, sheet_name='Sheet1', index=False, header=False)
writer.save()
Output (note that the data type is "Date" in Excel):
See Writing user defined types in the XlsxWriter docs for an explanation of how this works.
This is somewhat on the advanced end of XlsxWriter, and any working Pandas example would probably be complex as well. A better and simpler approach would probably be to just read the data using csv.py and convert it an Excel file using XlsxWriter directly.