I have a dataframe relating species and sites. Each cell contains a abundance value. I want to select only those species that present an abundance higher than 5 in any site.
Input:
#df
Species_1 Species_2 Species_3
Site_1 0 0 1
Site_2 0 0 0
Site_3 6 0 7
Desirable output:
Species_1 Species3
Site_1 0 1
Site_2 0 0
Site_3 6 7
Imagine that I have a dataframe 150 x 150, and I want to select objetively each column withou writing their names. I managed to write a mask for the dataframe (eg. mask = df > 5), but I do not know how to select only those columns that present at least one "True" value.
Try this
df = pd.DataFrame({'Species_1': {'Site_1': 0, 'Site_2': 0, 'Site_3': 6}, 'Species_2': {'Site_1': 0, 'Site_2': 0, 'Site_3': 0}, 'Species_3': {'Site_1': 1, 'Site_2': 0, 'Site_3': 7}, 'Species_4': {'Site_1': 2, 'Site_2': 2, 'Site_3': 2}})
# Species_1 Species_2 Species_3 Species_4
#Site_1 0 0 1 2
#Site_2 0 0 0 2
#Site_3 6 0 7 2
df.loc[:, (df > 5).any()]
# Species_1 Species_3
#Site_1 0 1
#Site_2 0 0
#Site_3 6 7