For example, I have created this data frame:
import pandas as pd
df = pd.DataFrame({'Cycle': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5]})
#Maybe something like this: df['Cycle Type'] = df['Cycle'].rolling(2).apply(lambda x: len(set(x)) != len(x),raw= True).replace({0 : False, 1: True})
I want to count the number of rows in each cycle and then assign a cycle type to it. If a cycle has fewer than 12 rows or more than 100 rows, mark it as bad; otherwise mark it as good. I was thinking of using something like the lambda function above to check whether the previous row had the same value, but I'm not sure how to add the count so that I can apply these thresholds.
Start by counting the number of rows in each group with pandas.DataFrame.groupby, pandas.DataFrame.transform, and pandas.DataFrame.count as follows:
df["cycle_quality"] = df.groupby("Cycle")["Cycle"].transform("count")
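To see what this step produces before any labelling happens, here is a minimal sketch on a small made-up frame. The name `demo` and the column `rows_in_cycle` are hypothetical and used only for illustration; the point is that `transform("count")` returns one value per original row, so every row carries the size of its own cycle.

```python
import pandas as pd

# Hypothetical toy data: cycle 0 has 3 rows, cycle 1 has 2 rows
demo = pd.DataFrame({"Cycle": [0, 0, 0, 1, 1]})

# transform("count") broadcasts each group's row count back onto its rows,
# so the result has the same length and index as the original frame
demo["rows_in_cycle"] = demo.groupby("Cycle")["Cycle"].transform("count")

print(demo["rows_in_cycle"].tolist())  # [3, 3, 3, 2, 2]
```

Note that `count` skips missing values, which makes no difference here because the grouping column itself is being counted.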
Then apply the quality function to it using pandas.DataFrame.apply:
• If the number of rows is less than 12 or more than 100, define cycle_quality as bad
• Else, cycle_quality should be good
df["cycle_quality"] = df.apply(lambda x: "bad" if x["cycle_quality"] < 12 or x["cycle_quality"] > 100 else "good", axis=1)
[Out]:
    Cycle cycle_quality
0       0          good
1       0          good
2       0          good
3       0          good
4       0          good
..    ...           ...
71      5           bad
72      5           bad
73      5           bad
74      5           bad
75      5           bad
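As an alternative to the row-wise apply, the same rule can probably be written in a vectorized way, which tends to be faster on large frames. This is only a sketch: the helper `label_cycles`, its parameters `min_rows` and `max_rows`, and the toy frame `demo` are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd


def label_cycles(frame, min_rows=12, max_rows=100):
    """Return 'good'/'bad' per row based on how many rows its cycle has."""
    # Per-row count of rows in the same cycle
    counts = frame.groupby("Cycle")["Cycle"].transform("count")
    # between() is inclusive on both ends by default, matching
    # "bad if fewer than 12 or more than 100"
    is_good = counts.between(min_rows, max_rows)
    return pd.Series(np.where(is_good, "good", "bad"), index=frame.index)


# Hypothetical toy data: cycle 0 has 12 rows, cycle 1 has 6, cycle 5 has 11
demo = pd.DataFrame({"Cycle": [0] * 12 + [1] * 6 + [5] * 11})
demo["cycle_quality"] = label_cycles(demo)

# One label per cycle, for a quick check
summary = demo.groupby("Cycle")["cycle_quality"].first().to_dict()
print(summary)  # {0: 'good', 1: 'bad', 5: 'bad'}
```

Keeping the thresholds as parameters makes it easy to change the cut-offs later without touching the labelling logic.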