Search code examples
python-3.xpandasdataframecsvpython-datetime

Python giving me datetime Value Error from CSV


I have one Python file that helps me pull and organize datetime data from a server into a pandas DataFrame, and then export out into a nice and usable CSV. My second Python script needs to read the CSV and analyze the data. If I do not touch the exported CSV, the analysis can read the CSV and runs smoothly. However, as soon as I try to merge a few CSV files together with Excel/other spreadsheet software, I get a datetime error

ValueError("time data %r does not match format %r" %    
ValueError: time data '2019-12-26 23:00' does not match format '%Y-%m-%d  %H:%M:%S'

Even though, it is a direct copy/paste and still saved as a CSV. Any guru can provide some insight on this matter?


Solution

  • Pretty sure this is an Excel issue, not a Python problem.

    If you load a .csv that has timestamp strings into Excel, Excel recognizes the datetimes - and formats them. This format seems to default to MM.DD.YYYY hh:mm (the date component might be different depending on your locale): enter image description here

    If you save the file in Excel, the seconds are removed in the .csv!

    • The only procedure that seems to reliably prevent this behavior is to set a specific date/time format for the respective column, e.g. DD.MM.YYYY hh:mm:ss. AFAIK, You'll have to do this manually for each workbook

    • Or perhaps write a macro. In older Excel versions, I had a PERSONAL.XLSB for that; should still work with newer versions, you'll have to put it in C:\Users\[username]\AppData\Roaming\Microsoft\Excel\XLSTART

    • you can use .xlsx format instead of .csv, presumably you won't loose format information there