Search code examples
pythonpandaspandas.excelwriter

pd.ExcelWriter, append datetime but got: Unknown or unsupported datetime type


I'm making a tool to import a .txt file and export it as a .xslx file using pd.ExcelWriter. For import I use pd.read_csv which works fine. Export the dataframe as a .xslx works also fine except that the datetimes are not recognized by excel:

csv_file = pd.read_csv(ex_name,sep=';',skip_blank_lines=False,index_col=False,header=None,usecols=[0,1,2,3,4,5,6,7,8,9])

writer = pd.ExcelWriter('output.xlsx',
                        engine='xlsxwriter',
                        options={'strings_to_numbers': True},
                        datetime_format='yyyy-mm-dd hh:mm')
    
csv_file.to_excel(writer,sheet_name='Sheet1',index=False,header=False)
writer.save()

So I added a few lines before the writer.save() to change the format:

workbook  = writer.book
worksheet = writer.sheets['Sheet1']
formatdict = {'num_format':'yyyy-mm-dd hh:mm'}
fmt = workbook.add_format(formatdict)

worksheet.write_datetime('C6:C{0}'.format(int(len(date_export)+5)),date_export),fmt)

Where csv_file is the imported .txt file, date_export the date serie I tried to overwrite. But I got an typeerror: Unknown or unsupported datetime type

I also tried to append a new date range created with pd.date_range:

daterange = pd.date_range(start=date_export[0], periods=len(date_export),
                              freq='D', normalize=True)

With dtype='datetime64[ns] but I got the same error.

So exporting using pd.ExcelWriter works (while using the first code block only) but get excel recognize the datetime doesn't work due to the error in the python console.

FULL ERROR

Traceback (most recent call last):

  File "D:\txt_to_excel.py", line 399, in <module>
    worksheet.write_datetime('C6:C{0}'.format(int(len(daterange)+5)),daterange)#,fmt)

  File "C:\ProgramData\Anaconda3\lib\site-packages\xlsxwriter\worksheet.py", line 69, in cell_wrapper
    return method(self, *args, **kwargs)

  File "C:\ProgramData\Anaconda3\lib\site-packages\xlsxwriter\worksheet.py", line 770, in write_datetime
    return self._write_datetime(row, col, date, cell_format)

  File "C:\ProgramData\Anaconda3\lib\site-packages\xlsxwriter\worksheet.py", line 784, in _write_datetime
    number = self._convert_date_time(date)

  File "C:\ProgramData\Anaconda3\lib\site-packages\xlsxwriter\worksheet.py", line 3892, in _convert_date_time
    self.remove_timezone)

  File "C:\ProgramData\Anaconda3\lib\site-packages\xlsxwriter\utility.py", line 677, in datetime_to_excel_datetime
    raise TypeError("Unknown or unsupported datetime type")

TypeError: Unknown or unsupported datetime type

FIRST 10 LINES INPUT FILE

1;2;3;4;5;6;7;8;9;
LOCATION;FILTER NR;X COORD;Y COORD;SURFACE LEV (m-ref);TOP FILTER (m-ref);BOTTOM FILTER (m-ref);MEAS POINT LEV (m-ref);SED SUMP LENGTH (m);START DATE;
pb 1-1;1;1;1;1;1;1;;;2006-08-17 00:00:00

LOCATION;FILTER NR;DATE;VALUE (m-ref)
pb 1-1;1;2006-08-17 00:00:00;40.384
pb 1-1;1;2006-08-18 00:00:00;40.337
pb 1-1;1;2006-08-19 00:00:00;40.296
pb 1-1;1;2006-08-20 00:00:00;40.321
pb 1-1;1;2006-08-21 00:00:00;40.358000000000004

DESIRED OUTPUT

Is perfect except for the fact that the datetime isn't recognized. enter image description here


Solution

  • I tried a variety of different Pandas parse_dates and similar options but I couldn't get them to work with the format of your input data. Perhaps someone else will figure that out and post a working example.

    As an alternative here is a version that extends the default Xlsxwriter worksheet.write() method to convert date-like strings to Excel dates on the fly.

    import pandas as pd
    from datetime import datetime
    
    ex_name = 'so63718752.txt'
    
    csv_file = pd.read_csv(ex_name, sep=';', skip_blank_lines=False,
                           index_col=False, header=None,
                           usecols=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    
    writer = pd.ExcelWriter('output.xlsx',
                            engine='xlsxwriter',
                            options={'strings_to_numbers': True},
                            datetime_format='yyyy-mm-dd hh:mm')
    
    csv_file.to_excel(writer, sheet_name='Sheet1', index=False, header=False)
    
    # Get the xlsxwriter workbook and worksheet object used by Pandas.
    workbook = writer.book
    worksheet = writer.sheets['Sheet1']
    
    # Make the date columns wider for clarity in this example.
    worksheet.set_column(2, 2, 12, None)
    worksheet.set_column(9, 9, 12, None)
    
    # Create a date format to use in Excel.
    date_format = workbook.add_format({'num_format': 'yyyy-mm-dd'})
    
    
    # Create a function that changes the worksheet write() method so
    # that strings are evaluated into dates or just handled normally.
    def string_to_date(worksheet, row, col, cell, cell_format=None):
        try:
            # Try convert the cell string to a date.
            date_time = datetime.strptime(cell, '%Y-%m-%d %H:%M:%S')
            # If it works then write it as an Excel date.
            return worksheet.write_datetime(row, col, date_time, date_format)
        except:
            # If it doesn't work then return control to the calling write() method.
            return None
    
    # Add the write() handler/callback to the worksheet.
    worksheet.add_write_handler(str, string_to_date)
    
    # Write the file again. This is inefficient but, I think, unavoidable.
    csv_file.to_excel(writer, sheet_name='Sheet1', index=False, header=False)
    
    writer.save()
    

    Output (note that the data type is "Date" in Excel):

    enter image description here

    See Writing user defined types in the XlsxWriter docs for an explanation of how this works.

    This is somewhat on the advanced end of XlsxWriter, and any working Pandas example would probably be complex as well. A better and simpler approach would probably be to just read the data using csv.py and convert it an Excel file using XlsxWriter directly.