python, python-3.x, pandas, datetime-format, python-datetime

Inconsistent date formats in Python


I am new to Python. I have a file with a date column that contains dates in the formats below:

date = pd.Series(['10-21-2012 ', '7-18-2019 ', '02-2-2008', 
                  '2010/21/4 ', '11-8-2019 ']) 

I used the following code to get the month but I get an error:

ValueError: month must be in 1..12

Code:

pd.to_datetime(date).dt.month

The output should be

10
7
02
4
11

Please can someone help me with this?


Solution

  • Welcome!
    You could "normalize" the date list before passing it to the pandas Series object.
    Put the logic in a function so you can reuse it elsewhere in your code if you need to.
    From your series, it looks like the dates follow two main formats:
    - mm-dd-yyyy
    - yyyy/dd/mm

    def get_months(date_list):
        """Return each date rewritten as a month-first 'm-d-yyyy' string."""
        normalized = []
        for dt_string in date_list:
            if "-" in dt_string:
                # Already mm-dd-yyyy; int() drops leading zeros and stray whitespace.
                numbers = [int(x) for x in dt_string.split("-")]
                normalized.append(f'{numbers[0]}-{numbers[1]}-{numbers[2]}')
            elif "/" in dt_string:
                # yyyy/dd/mm: reverse the parts to get month-day-year.
                numbers = [int(x) for x in dt_string.split("/")]
                normalized.append(f'{numbers[2]}-{numbers[1]}-{numbers[0]}')
        return normalized

    dates = ['10-21-2012', '7-18-2019', '02-2-2008', '2010/21/4', '11-8-2019']

    months = get_months(dates)

    # Print one normalized date per line.
    for x in months:
        print(x)
    

    This would create a list that looks like:

    ['10-21-2012','7-18-2019','2-2-2008','4-21-2010','11-8-2019']  
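To get from that normalized list to the month numbers asked for in the question, something like the following should work. This is a minimal sketch, assuming the normalized strings shown above; `normalized` is just an illustrative variable name. Because every string is now month-first, an explicit `format` can be passed to `pd.to_datetime` so nothing is left to guessing.

```python
import pandas as pd

# Normalized month-day-year strings, as produced by the step above (assumed input).
normalized = ['10-21-2012', '7-18-2019', '2-2-2008', '4-21-2010', '11-8-2019']

# An explicit format avoids the ambiguous parsing that raised the ValueError.
parsed = pd.to_datetime(pd.Series(normalized), format='%m-%d-%Y')

# Extract the month of each date as a plain Python list.
months = parsed.dt.month.tolist()
print(months)
```

Note that the months come back as integers, so February is `2` rather than `02`.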
    

    Let me know if you have special requirements that this would not handle.
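If you would rather stay inside pandas, one possible alternative is to parse the column once per format and combine the results. This is only a sketch under the assumption that the two formats listed above are the only ones present; `months_from_mixed` is a hypothetical helper name, not a pandas function. Values that match neither format end up as `NaT`, and their month will be missing.

```python
import pandas as pd

def months_from_mixed(series):
    """Return the month of each date, assuming mm-dd-yyyy or yyyy/dd/mm strings."""
    # Remove the trailing spaces seen in the question's data.
    cleaned = series.str.strip()
    # Try each format separately; non-matching rows become NaT instead of raising.
    dashed = pd.to_datetime(cleaned, format='%m-%d-%Y', errors='coerce')
    slashed = pd.to_datetime(cleaned, format='%Y/%d/%m', errors='coerce')
    # Fill the gaps left by the first format with results from the second.
    parsed = dashed.fillna(slashed)
    return parsed.dt.month

date = pd.Series(['10-21-2012 ', '7-18-2019 ', '02-2-2008',
                  '2010/21/4 ', '11-8-2019 '])
print(months_from_mixed(date).tolist())
```

The advantage of this approach is that the result keeps the original Series index, so it can be assigned straight back to a DataFrame column.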