I have a dataframe that needs to be kept in the structure below.
df
One two three
date apple banana
2019-12-20 0 4
2020-01-03 10 5
Is there a way to keep only columns where the second row is contained within the list? I'm only familiar with doing this for columns
fruits= ["date", "apple"]
df = df.loc[:, df.columns.isin(fruits)
Expected Output
One two
date apple
2019-12-20 0
2020-01-03 10
use .isin
and use the level
argument to specify the level of the multiindex.
if you don't have a multiindex set, you can set it on your read.
df = pd.read_csv('file.csv',sep='\s+',header=[0,1])
df.loc[:,df.columns.isin(fruits,level=1)]
print(df)
One two
date apple
2 2019-12-20 0
3 2020-01-03 10
df.columns.isin(fruits,level=1)
array([ True, True, False])