Search code examples
pythonpandaspython-3.6strftime

AttributeError: 'str' object has no attribute 'strftime' when modifying pandas dataframe


I have an issue with subtracting a part of the a date column. My goal is to set up a cohort analysis table. In order to achieve this I would like to get the ordering month of any customer in my pandas dataframe.

      OrderDate  ConsumerId  Orders
0  2017-09-01      5555555        4
1  2017-09-01  66666666666        2
2  2017-09-01   88888888888       4
3  2017-09-01   9999999999        3
4  2017-09-01     45858888        3   

I used the code below to do this.

import pandas as pd

pd.set_option('max_columns', 50)
mpl.rcParams['lines.linewidth'] = 2

df = pd.read_csv("C:/Users/***/****/salesdata.csv")

df.head()

df['OrderPeriod'] = df.OrderDate.apply(lambda x: x.strftime('%Y-%m'))

Unfurtantely I get the following error when trying to subtract the ordering month.

 df['OrderPeriod'] = df.OrderDate.apply(lambda x: x.strftime('%Y-%m'))
 AttributeError: 'str' object has no attribute 'strftime'

My pandas dataframe looks like this:

      OrderDate  ConsumerId  Orders
0  2017-09-01      5555555        4
1  2017-09-01  66666666666        2
2  2017-09-01   88888888888       4
3  2017-09-01   9999999999        3
4  2017-09-01     45858888        3   

My goal is to add a new column with the month in which a consumer ordered. Like below.

       OrderDate  ConsumerId  Orders   OrderPeriod
0  2017-09-01      5555555        4      2017-09
1  2017-09-01  66666666666        2      2017-09
2  2017-09-01   88888888888       4      2017-09
3  2017-09-01   9999999999        3      2017-09
4  2017-09-01     45858888        3      2017-09

I was wondering why I get an AttributeError and how I could solve this issue.


Solution

  • Use vectorized dt.strftime, but first convert column to datetime in read_csv by parameter parse_dates:

    df = pd.read_csv("C:/Users/***/****/salesdata.csv", parse_dates=['OrderDate'])
    
    df['OrderPeriod'] = df.OrderDate.dt.strftime('%Y-%m')
    print (df)
       OrderDate   ConsumerId  Orders OrderPeriod
    0 2017-09-01      5555555       4     2017-09
    1 2017-09-01  66666666666       2     2017-09
    2 2017-09-01  88888888888       4     2017-09
    3 2017-09-01   9999999999       3     2017-09
    4 2017-09-01     45858888       3     2017-09