I have an excel file to analyze but have a lot of data that I don't want to analyze, can we delete a column if we don't find the value SpaceX string in the first row like following
SL# State District 10/01/2021 10/01/2021 10/01/2021 11/01/2021 11/01/2021 11/01/2021
SpaceX in Star in StarX out SpaceX out Star out StarX in
1 wb al 10 11 12 13 14 15
2 wb not 23 22 20 24 25 25
Now here I want to delete the columns where in the rows SpaceX not there. And then Want to delete the SpaceX as well to shift up the rows ultimate output will look like as follows
SL# State District 10/01/2021 11/01/2021
1 wb al 10 13
2 wb not 23 24
Tried with loc and iloc functions but no clue at the moment.
Also checked the answer: Drop columns if rows contain a specific value in Pandas but it's different. I'm checking the substring not the exact value match.
Firstly create a boolean mask with startswith()
method and fillna()
method:
mask=df.loc[0].str.startswith('SpaceX').fillna(True)
Finally use Transpose(T)
attribute,loc
accessor and drop()
method:
df=df.T.loc[mask].T.drop(0)
Output of df
:
SL# State District 2021-01-10 00:00:00 2021-01-11 00:00:00 2021-01-12 00:00:00
1 1.0 wb al 10 13 16
2 2.0 wb not 23 13 16