I want to apply a custom function to the result of a pandas groupby.
I was able to do this when my custom function takes a single input, namely the grouped values.
I have a dataframe like this:
a   b   c   value
a1  b1  c1  v1
a2  b2  c2  v2
a3  b3  c3  v3
Working version:
import numpy as np

def cpk(a):
    arr = np.asarray(a)
    arr = arr.ravel()
    sigma = np.std(arr)
    m = np.mean(arr)
    Cpu = float(150 - m) / (3*sigma)
    Cpl = float(m - 50) / (3*sigma)
    Cpk = np.min([Cpu, Cpl])
    return Cpk
df_cpk = df_result.groupby(['a','b','c'])['value'].agg(cpk).reset_index()
As you can see in the code above, the grouped 'value' column is passed automatically as the input to the cpk function.
What I want to know is how to apply the function below:
def cpk2(a, lsl, usl):
    arr = np.asarray(a)
    arr = arr.ravel()
    sigma = np.std(arr)
    m = np.mean(arr)
    Cpu = float(usl - m) / (3*sigma)
    Cpl = float(m - lsl) / (3*sigma)
    Cpk = np.min([Cpu, Cpl])
    return Cpk
# df_cpk = df_result.groupby(['a','b','c'])['value'].agg(cpk2(?,?,?)).reset_index()
Here the function takes multiple inputs, one of which is the grouped values. Is there a simple way to do this?
Since the other two inputs are constants, you can simply wrap the call in a lambda expression that fixes them:
df_cpk = df_result.groupby(['a','b','c'])['value'].agg(lambda x: cpk2(x, 50, 150)).reset_index()
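As a rough sketch of two alternatives to the lambda, the constants can also be bound with functools.partial, or passed as keyword arguments that agg forwards to the function. The sample values in df_result below are made up for illustration (the question only shows placeholders), and the function is a slightly condensed version of cpk2 from the question:

```python
import functools

import numpy as np
import pandas as pd


def cpk2(a, lsl, usl):
    # Same calculation as in the question, condensed.
    arr = np.asarray(a).ravel()
    sigma = np.std(arr)
    m = np.mean(arr)
    cpu = (usl - m) / (3 * sigma)
    cpl = (m - lsl) / (3 * sigma)
    return min(cpu, cpl)


# Hypothetical sample data; the real df_result is not shown in the question.
df_result = pd.DataFrame({
    "a": ["a1", "a1", "a1", "a2", "a2", "a2"],
    "b": ["b1", "b1", "b1", "b2", "b2", "b2"],
    "c": ["c1", "c1", "c1", "c2", "c2", "c2"],
    "value": [90.0, 100.0, 110.0, 70.0, 80.0, 90.0],
})

grouped = df_result.groupby(["a", "b", "c"])["value"]

# 1) Lambda, as in the answer above.
via_lambda = grouped.agg(lambda x: cpk2(x, 50, 150)).reset_index()

# 2) functools.partial binds lsl and usl ahead of time.
via_partial = grouped.agg(functools.partial(cpk2, lsl=50, usl=150)).reset_index()

# 3) Extra keyword arguments given to agg are forwarded to the function.
via_kwargs = grouped.agg(cpk2, lsl=50, usl=150).reset_index()

print(via_lambda)
```

All three should produce the same frame. The lambda is probably the most readable for a one-off call, while partial can be handy if the same limits are reused in several places.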