Tags: python, datetime, text-parsing

Python: How to parse a date from text formatted like Excel's "full date"?


I have a CSV report (from a system that I have no control over) whose date column is in the format "Quarta-feira, 1 de Janeiro de 2020", which I believe corresponds to the format code '%A, %d de %B de %Y' with the pt-BR locale.

I need to create a datetime-like object from such a string, but my attempt so far has not worked:

import locale
import pandas as pd

locale.setlocale(locale.LC_TIME, 'pt_BR.utf8')

pd.to_datetime("Quarta-feira, 1 de Janeiro de 2020", format="%A, %d de %B de %Y")

Does anyone know how I could do that?

p.s.:

  • I am aware that I could remove "Quarta-feira," from the string, and it would then work with format="%d de %B de %Y".
  • I don't need to use pandas; any date/time library is fine.

Solution

  • Based on this solution, and since you don't mind using additional libraries, you can use dateparser. This worked like a charm for me:

    # pip install dateparser
    import dateparser

    t = "Quarta-feira, 1 de Janeiro de 2020"
    dateparser.parse(t)
    # returns datetime.datetime(2020, 1, 1, 0, 0)
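
If installing an extra library or relying on a system locale is not an option, a standard-library-only approach may also work. The sketch below is not from the original answer: the helper name parse_pt_full_date, the PT_MONTHS table, and the regular expression are all illustrative assumptions. It ignores the weekday entirely and assumes the month names are spelled as in standard Portuguese (including the cedilla in "março").

```python
import re
from datetime import datetime

# Assumed lookup table: Portuguese month names (lowercase) to month numbers.
PT_MONTHS = {
    "janeiro": 1, "fevereiro": 2, "março": 3, "abril": 4,
    "maio": 5, "junho": 6, "julho": 7, "agosto": 8,
    "setembro": 9, "outubro": 10, "novembro": 11, "dezembro": 12,
}

# Matches "<day> de <month> de <year>"; any text before it (the weekday) is skipped.
_FULL_DATE = re.compile(r"(\d{1,2}) de (\w+) de (\d{4})")


def parse_pt_full_date(text):
    """Hypothetical helper: parse e.g. 'Quarta-feira, 1 de Janeiro de 2020'."""
    match = _FULL_DATE.search(text)
    if match is None:
        raise ValueError(f"unrecognized date: {text!r}")
    day, month_name, year = match.groups()
    month = PT_MONTHS.get(month_name.lower())
    if month is None:
        raise ValueError(f"unknown month name: {month_name!r}")
    return datetime(int(year), month, int(day))


print(parse_pt_full_date("Quarta-feira, 1 de Janeiro de 2020"))
```

Because this does not call locale.setlocale, it should behave the same on machines where the pt_BR locale is not installed. With pandas, the same function could be mapped over the date column of the CSV.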