
Datetime string with whitespace, locale-dependent date


I am trying to convert a German datetime string that comes from an MS Project Excel export:

02 Februar 2022 17:00

I read it from an Excel export of MS Project into a pandas DataFrame.

When I convert it with

to_datetime(df["Anfang"], format='%d %B %Y %H:%M').dt.date

I get the error

ValueError: time data '07 Januar 2019 07:00' does not match format '%d %B %Y %H:%M' (match)

From https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior:

%B   Month as locale’s full name.   September

What am I doing wrong here? Do I have to check some locale settings? I am using German (Switzerland):

import locale
locale.getdefaultlocale()
('de_CH', 'cp1252')

df in:

0       10 April 2019 08:00
1      07 Januar 2019 07:00
2      07 Januar 2019 07:00
3      07 Januar 2019 07:00
4     09 Oktober 2019 17:00
5    04 Dezember 2020 17:00
Name: Anfang, dtype: object

df out (wanted):

0       10-04-2019
1       07-01-2019
.
.

EDIT: I changed my locale to ('de_DE', 'cp1252'), but I get the same error.

SOLVED: Using matJ's answer, I got an error saying that "Die 15.06.21" did not match the format, which led me to investigate the data. There I found two different date formats (Thanks, Microsoft!). After cleaning the data, the code above worked well. So the error message from to_datetime wasn't as precise as the one from datetime.strptime.
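For anyone hitting the same problem, here is a minimal sketch of how the stray rows could be located without depending on the system locale. It assumes the column only contains full German month names; the GERMAN_MONTHS mapping and the parse_german_dates helper are hypothetical names made up for this example, and the sample values are just the ones quoted in this post.

```python
import pandas as pd

# German month names mapped to two-digit month numbers, so that parsing
# does not depend on the locale configured on the machine.
GERMAN_MONTHS = {
    "Januar": "01", "Februar": "02", "März": "03", "April": "04",
    "Mai": "05", "Juni": "06", "Juli": "07", "August": "08",
    "September": "09", "Oktober": "10", "November": "11", "Dezember": "12",
}


def parse_german_dates(series: pd.Series) -> pd.Series:
    """Parse 'dd Monat yyyy HH:MM' strings; non-matching values become NaT."""
    pattern = r"\b(" + "|".join(GERMAN_MONTHS) + r")\b"
    # Swap the month name for its number, e.g. '07 Januar 2019' -> '07 01 2019'.
    numeric = series.str.replace(
        pattern, lambda m: GERMAN_MONTHS[m.group(1)], regex=True
    )
    # errors="coerce" turns every value that does not fit the format into NaT
    # instead of raising on the first mismatch.
    return pd.to_datetime(numeric, format="%d %m %Y %H:%M", errors="coerce")


# Illustrative data: values from the question plus the stray second format.
anfang = pd.Series(
    ["10 April 2019 08:00", "07 Januar 2019 07:00", "Die 15.06.21"],
    name="Anfang",
)

parsed = parse_german_dates(anfang)
bad_rows = anfang[parsed.isna()]   # the rows that need cleaning
print(bad_rows)
print(parsed.dt.strftime("%d-%m-%Y"))  # wanted output format for the valid rows
```

Looking at bad_rows first shows every value in the second format at once, rather than one error message at a time.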

Thanks for helping.

Johannes


Solution

  • I'd change the locale in a different way, by calling locale.setlocale in the script itself. Then your code should work.

    The following works for me:

    import locale
    from datetime import datetime

    locale.setlocale(locale.LC_ALL, 'de_DE')  # switch the locale to German
    datetime.strptime('07 Januar 2019 07:00', '%d %B %Y %H:%M')  # returns a datetime object that you can format as you like
    

    Let me know if that works for you as well.
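As far as I know, strptime takes the month names from the LC_TIME category that is active in the running process, while locale.getdefaultlocale() only reports the operating system default, so checking the latter does not tell you what %B will match. Below is a rough sketch of how the locale could be switched only temporarily and restored afterwards. The temporary_locale helper is a hypothetical name for this example, and the candidate locale names ('de_DE.UTF-8', 'de_DE', 'German') are assumptions; which of them exists depends on the platform.

```python
import locale
from contextlib import contextmanager
from datetime import datetime


@contextmanager
def temporary_locale(*candidates, category=locale.LC_TIME):
    """Activate the first available locale from candidates, then restore."""
    previous = locale.setlocale(category)  # no second argument: query only
    for name in candidates:
        try:
            locale.setlocale(category, name)
            break
        except locale.Error:
            continue  # this name is not installed here, try the next one
    else:
        raise locale.Error(f"none of the locales {candidates!r} is available")
    try:
        yield name
    finally:
        locale.setlocale(category, previous)  # undo the change in any case


try:
    # The locale names are guesses; adjust them to what your system provides.
    with temporary_locale("de_DE.UTF-8", "de_DE", "German") as used:
        value = datetime.strptime("07 Januar 2019 07:00", "%d %B %Y %H:%M")
        print(used, value)
except locale.Error as exc:
    print("No German locale available:", exc)
```

Restoring the previous setting keeps the rest of the program (number formatting, other date parsing) unaffected by the switch.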