Search code examples
pythonpandasdata-sciencepandas-datareader

Determining annual business year in Pandas


I have a DataFrame that contains months and years:

df:
    month   year
0   Jan     2012.0
1   Feb     2012.0
2   Mar     2012.0
3   Apr     2012.0
4   May     2012.0
5   Jun     2012.0
6   Jul     2012.0
7   Aug     2012.0
8   Sep     2012.0
9   Oct     2012.0
10  Nov     2012.0
11  Dec     2012.0

I want to add another column which determines a business-year which starts on Mar on every year Something like this:.

df:
        month   year     business_year
    0   Jan     2012.0     2011
    1   Feb     2012.0     2011
    2   Mar     2012.0     2012
    3   Apr     2012.0     2012
    4   May     2012.0     2012
    5   Jun     2012.0     2012
    6   Jul     2012.0     2012
    7   Aug     2012.0     2012
    8   Sep     2012.0     2012
    9   Oct     2012.0     2012
    10  Nov     2012.0     2012
    11  Dec     2012.0     2012
    12  Jan     2013.0     2012
    13  Feb     2013.0     2012

Solution

  • Assuming your month is a string, you can use the following snippet:

    df['business_year'] = df['year'] + df['month'].apply(lambda x: -1 if x in ('Jan', 'Feb') else 0)
    

    Or, if you want something more performant:

    df['business_year'] = df['year'] + ~df1['month'].isin(('Jan', 'Feb')) - 1