Search code examples
pythondataframefiltersplice

Drop columns in a dataframe where values between 2 dates are NaN


I can not figure this one out. I have googled so many website and followed so many videso but I simply am stuck. I am sure the solution is simple.... all and any help is appreciated!

I have a dateframe who output is as follows (date is the index):

Date        col1   col2   col3   col4   col5  
1959-01-01   NaN    NaN   1.35   4.21    NaN
1959-02-01   NaN    NaN   2.14   6.30    5.75
1959-03-01   1.97   NaN   NaN    7.35    6.23
1959-04-01   2.19   3.14  NaN    NaN     7.15
1959-05-01   3.16   2.74  NaN    NaN     8.42
1959-06-01   2.91   3.63  NaN    NaN     8.36
1959-07-01   2.72   4.98  NaN    NaN     NaN

I want to delete columns that have NaN between the dates 1959-03-01 and 1959-06-01. I want the output to look like this:

Date        col1      col5  
1959-01-01   NaN      NaN
1959-02-01   NaN      5.75
1959-03-01   1.97     6.23
1959-04-01   2.19     7.15
1959-05-01   3.16     8.42
1959-06-01   2.91     8.36
1959-07-01   2.72     NaN

Thanks for your help!


Solution

  • First, you need to find the indices for the dates:

    a = np.where(df['Date'] == '1959-03-01')[0]
    b = np.where(df['Date'] == '1959-06-01')[0]
    

    Second, you need to check the columns in-between each date:

    to_drop = []
    for column in df.columns:
        check = df[column].between(a, b, inclusive='both')
        if any(np.isnan(check)): to_drop.append(column)
    

    Third, you need to drop the columns that meet the criteria:

    df = df.drop(to_drop, axis=1)