I have the df shown below:
CX CY CS
97539 0.39896 0.7787 0
97540 0.39896 0.7787 0
97541 0.39896 0.7787 0
97542 0.39896 0.7787 0
97543 0.39896 0.7787 0
97544 0.39896 0.7787 0
97545 0.39896 0.7787 0
97546 0.39896 0.7787 0
97547 0.39896 0.7787 0
97548 0.39896 0.7787 0
97549 0.39896 0.7787 0
97550 0.39896 0.7787 0
97551 0.39896 0.7787 0
97552 0.39896 0.7787 0
97553 0.39896 0.7787 0
97554 0.39896 0.7787 0
97555 0.39896 0.7787 0
97556 0.39896 0.7787 0
97557 0.39896 0.7787 0
97558 0.39896 0.7787 0
97559 0.39896 0.7787 0
97560 0.39896 0.7787 0
97561 0.39896 0.7787 1
97562 0.39896 0.7787 0
97563 0.39896 0.7787 0
97564 0.39896 0.7787 0
97565 0.39896 0.7787 0
I want to keep only the part of the df up to and including the row where the value in the 'CS' column becomes 1, and drop the remaining rows. So I want to have something like this:
CX CY CS
97539 0.39896 0.7787 0
97540 0.39896 0.7787 0
97541 0.39896 0.7787 0
97542 0.39896 0.7787 0
97543 0.39896 0.7787 0
97544 0.39896 0.7787 0
97545 0.39896 0.7787 0
97546 0.39896 0.7787 0
97547 0.39896 0.7787 0
97548 0.39896 0.7787 0
97549 0.39896 0.7787 0
97550 0.39896 0.7787 0
97551 0.39896 0.7787 0
97552 0.39896 0.7787 0
97553 0.39896 0.7787 0
97554 0.39896 0.7787 0
97555 0.39896 0.7787 0
97556 0.39896 0.7787 0
97557 0.39896 0.7787 0
97558 0.39896 0.7787 0
97559 0.39896 0.7787 0
97560 0.39896 0.7787 0
97561 0.39896 0.7787 1
Any ideas how to approach this? Note that the value 1 can be on any row, so I can't just hard-code a position with .iloc. Ideally, I would like to avoid iterrows().
If there is always at least one 1, you can compare the values with Series.eq, get the index label of the first 1 with Series.idxmax, and then filter with DataFrame.loc:
df1 = df.loc[: df['CS'].eq(1).idxmax()]
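To make the steps visible, here is a minimal sketch on a hypothetical five-row miniature of the frame from the question (the values and index labels are illustrative only). It relies on label-based slicing with DataFrame.loc including the end label, which is why the row containing the 1 is kept.

```python
import pandas as pd

# Hypothetical miniature of the question's frame; values are illustrative.
df = pd.DataFrame(
    {"CX": [0.39896] * 5, "CY": [0.7787] * 5, "CS": [0, 0, 1, 0, 0]},
    index=range(97559, 97564),
)

mask = df["CS"].eq(1)      # boolean Series, True where CS == 1
first_one = mask.idxmax()  # index label of the first True value
df1 = df.loc[:first_one]   # .loc slices by label and includes the end label

print(first_one)
print(df1)
```

Here the first 1 sits at label 97561, so df1 keeps labels 97559 through 97561.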
This solution also handles the case where there is no 1 value, returning an empty DataFrame:
m = df['CS'].eq(1)
df1 = df.loc[: m.idxmax()] if m.any() else pd.DataFrame()
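The reason the m.any() guard matters can be sketched as follows: on an all-False mask, idxmax falls back to the first index label, so the unguarded slice would silently keep one row. The frame df_no_one is hypothetical, and returning df_no_one.iloc[0:0] instead of pd.DataFrame() is just one possible variant, assuming you want the empty result to keep the original columns.

```python
import pandas as pd

# Hypothetical frame in which CS never becomes 1.
df_no_one = pd.DataFrame({"CS": [0, 0, 0]}, index=[10, 11, 12])

m = df_no_one["CS"].eq(1)  # all False

# Without the guard, idxmax returns the first label (10),
# so the slice keeps the first row instead of nothing.
unguarded = df_no_one.loc[: m.idxmax()]

# With the guard; iloc[0:0] is an assumed variant that
# returns an empty frame with the original columns preserved.
guarded = df_no_one.loc[: m.idxmax()] if m.any() else df_no_one.iloc[0:0]

print(len(unguarded))
print(guarded)
```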
Or use a trick with Series.cummax in boolean indexing; it is only necessary to reverse the order twice:
df1 = df[df['CS'].iloc[::-1].eq(1).cummax().iloc[::-1]]
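One thing worth checking with the reversed cummax mask: if the column can contain more than one 1, it keeps every row up to the last 1, whereas the idxmax version stops at the first. The sketch below illustrates the difference on a hypothetical frame df_multi, and shows one possible alternative mask (an assumption, not part of the original answer) built from cumsum and shift that stops at the first 1. Note that this alternative keeps the whole frame when there is no 1 at all.

```python
import pandas as pd

# Hypothetical frame with two 1 values in CS.
df_multi = pd.DataFrame({"CS": [0, 1, 0, 1, 0]})

# Reversed cummax: True for every row at or before the LAST 1.
up_to_last = df_multi[df_multi["CS"].iloc[::-1].eq(1).cummax().iloc[::-1]]

# Alternative sketch: count the 1s seen so far, shift by one row so the
# row holding the first 1 still counts as "none seen yet", keep zeros.
seen_before = df_multi["CS"].eq(1).cumsum().shift(fill_value=0)
up_to_first = df_multi[seen_before.eq(0)]

print(up_to_last.index.tolist())
print(up_to_first.index.tolist())
```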