I can not figure this one out. I have googled so many website and followed so many videso but I simply am stuck. I am sure the solution is simple.... all and any help is appreciated!
I have a dateframe who output is as follows (date is the index):
Date col1 col2 col3 col4 col5
1959-01-01 NaN NaN 1.35 4.21 NaN
1959-02-01 NaN NaN 2.14 6.30 5.75
1959-03-01 1.97 NaN NaN 7.35 6.23
1959-04-01 2.19 3.14 NaN NaN 7.15
1959-05-01 3.16 2.74 NaN NaN 8.42
1959-06-01 2.91 3.63 NaN NaN 8.36
1959-07-01 2.72 4.98 NaN NaN NaN
I want to delete columns that have NaN between the dates 1959-03-01 and 1959-06-01. I want the output to look like this:
Date col1 col5
1959-01-01 NaN NaN
1959-02-01 NaN 5.75
1959-03-01 1.97 6.23
1959-04-01 2.19 7.15
1959-05-01 3.16 8.42
1959-06-01 2.91 8.36
1959-07-01 2.72 NaN
Thanks for your help!
First, you need to find the indices for the dates:
a = np.where(df['Date'] == '1959-03-01')[0]
b = np.where(df['Date'] == '1959-06-01')[0]
Second, you need to check the columns in-between each date:
to_drop = []
for column in df.columns:
check = df[column].between(a, b, inclusive='both')
if any(np.isnan(check)): to_drop.append(column)
Third, you need to drop the columns that meet the criteria:
df = df.drop(to_drop, axis=1)