Tags: python, pandas, datetime, masking

Select data for specific days of each month in a dataframe


I have a dataframe with daily data covering more than 3 years. I would like to build another dataframe containing only the data from the last 5 days of each month. In the new dataframe, the rows of the 'date' column would be:

2013-01-27
2013-01-28
2013-01-29
2013-01-30
2013-01-31
2013-02-24
2013-02-25
2013-02-26
2013-02-27
2013-02-28

Could someone tell me how to do this?

Many thanks!


Solution

  • One way to do this is to use dt.day and dt.days_in_month with boolean indexing:

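    import numpy as np
    import pandas as pd

    # Sample data: one random value per day from 2010-01-01 to 2013-12-31
    dates = pd.date_range('2010-01-01', '2013-12-31', freq='D')
    df = pd.DataFrame({'Date': dates, 'Value': np.random.rand(len(dates))})

    # Keep rows whose day of month falls within the last 5 days of that month
    df_out = df[df['Date'].dt.day > df['Date'].dt.days_in_month - 5]

    print(df_out.head(20))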
    

    Output:

              Date     Value
    26  2010-01-27  0.097695
    27  2010-01-28  0.236572
    28  2010-01-29  0.910922
    29  2010-01-30  0.777657
    30  2010-01-31  0.943031
    54  2010-02-24  0.217144
    55  2010-02-25  0.970090
    56  2010-02-26  0.658967
    57  2010-02-27  0.189376
    58  2010-02-28  0.229299
    85  2010-03-27  0.986992
    86  2010-03-28  0.980633
    87  2010-03-29  0.258102
    88  2010-03-30  0.827310
    89  2010-03-31  0.813219
    115 2010-04-26  0.135519
    116 2010-04-27  0.263941
    117 2010-04-28  0.120624
    118 2010-04-29  0.993652
    119 2010-04-30  0.901466
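
The mask above selects by calendar position, so if a month is missing some days it returns fewer than 5 rows for that month. If what is wanted is instead the last 5 available rows of each month, a groupby on the month with tail may be closer. The sketch below contrasts the two; the helper names `last_n_calendar_days` and `last_n_rows_per_month`, the `date_col` and `n` parameters, and the sample data with a deliberately removed day are all illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd


def last_n_calendar_days(df, date_col="Date", n=5):
    """Keep rows whose date is among the last n calendar days of its month."""
    dates = df[date_col]
    return df[dates.dt.day > dates.dt.days_in_month - n]


def last_n_rows_per_month(df, date_col="Date", n=5):
    """Keep the last n available rows of each month, ordered by date."""
    ordered = df.sort_values(date_col)
    month = ordered[date_col].dt.to_period("M")
    return ordered.groupby(month).tail(n)


# Hypothetical sample: daily data for two months, with one day removed
dates = pd.date_range("2013-01-01", "2013-02-28", freq="D")
df = pd.DataFrame({"Date": dates, "Value": np.arange(len(dates))})
df_gap = df[df["Date"] != pd.Timestamp("2013-02-24")]

by_calendar = last_n_calendar_days(df_gap)  # February contributes 25..28 only
by_rows = last_n_rows_per_month(df_gap)     # February contributes 23, 25..28

print(by_calendar["Date"].dt.strftime("%Y-%m-%d").tolist())
print(by_rows["Date"].dt.strftime("%Y-%m-%d").tolist())
```

With complete daily data the two approaches should select the same rows; they differ only when days are missing, so the choice depends on whether "last 5 days" means calendar days or available observations.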