Python Pandas: Counting the number of consecutive values and assigning a name if conditions are met

For example, I have created this data frame:

import pandas as pd

df = pd.DataFrame({'Cycle': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
                             2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4,
                             4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5]})


#Maybe something like this: df['Cycle Type'] = df['Cycle'].rolling(2).apply(lambda x: len(set(x)) != len(x),raw= True).replace({0 : False, 1: True})

I want to count the number of rows in each cycle and then assign a cycle type to it. If a cycle has fewer than 12 rows or more than 100 rows, mark it as bad; otherwise mark it as good. I was thinking of using something like the lambda function above to check whether the value in the previous row was the same, but I'm not sure how to add the counting so that I can apply the thresholds I want.

Solution

Start by counting the number of rows in each group using pandas.DataFrame.groupby followed by the groupby transform method with "count", which copies each group's row count onto every row of that group:

df["cycle_quality"] = df.groupby("Cycle")["Cycle"].transform("count")
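To see what this intermediate step produces, here is a minimal sketch on a small made-up frame (the name demo, the column n_rows, and the data are illustrative assumptions, not the data from the question):

```python
import pandas as pd

# Hypothetical toy data: cycle 0 has 3 rows, cycle 1 has 2 rows.
demo = pd.DataFrame({"Cycle": [0, 0, 0, 1, 1]})

# transform("count") returns one value per original row, aligned to demo's
# index, so every row carries the size of the group it belongs to.
demo["n_rows"] = demo.groupby("Cycle")["Cycle"].transform("count")

print(demo["n_rows"].tolist())  # [3, 3, 3, 2, 2]
```

Because the result has the same length and index as the original frame, it can be assigned directly as a new column without any merge.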

Then apply the quality rule to that count column, row by row, using pandas.DataFrame.apply:

• If the number of rows is less than 12 or more than 100, define cycle_quality as bad

• Otherwise, define cycle_quality as good

df["cycle_quality"] = df.apply(lambda x: "bad" if x["cycle_quality"] < 12 or x["cycle_quality"] > 100 else "good", axis=1)

[Out]:
    Cycle cycle_quality
0       0          good
1       0          good
2       0          good
3       0          good
4       0          good
..    ...           ...
71      5           bad
72      5           bad
73      5           bad
74      5           bad
75      5           bad
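
As a possible alternative to the row-wise apply, the same rule can be sketched with vectorized operations. This is only a sketch under assumptions: the helper name label_cycles, its min_rows and max_rows parameters, and the toy data are hypothetical, and it uses "size" (which counts all rows) in place of "count" (which skips missing values); the two agree when the Cycle column has no NaN.

```python
import pandas as pd

def label_cycles(df, min_rows=12, max_rows=100):
    """Return 'good'/'bad' per row based on the size of each row's cycle.

    Hypothetical helper: the thresholds mirror the question's rule
    (bad if fewer than 12 or more than 100 rows).
    """
    # Row count of each cycle, repeated on every row of that cycle.
    sizes = df.groupby("Cycle")["Cycle"].transform("size")
    # between() is inclusive on both ends by default: 12 <= size <= 100.
    ok = sizes.between(min_rows, max_rows)
    return ok.map({True: "good", False: "bad"}).rename("cycle_quality")

# Toy data: cycle 0 has 12 rows, cycle 1 has 6 rows, cycle 2 has 13 rows.
demo = pd.DataFrame({"Cycle": [0] * 12 + [1] * 6 + [2] * 13})
demo["cycle_quality"] = label_cycles(demo)

# One label per cycle, for a quick check.
print(demo.groupby("Cycle")["cycle_quality"].first().to_dict())
# {0: 'good', 1: 'bad', 2: 'good'}
```

On larger frames this usually avoids the per-row Python function call that apply with axis=1 incurs, while producing the same labels.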