My data contains 75 columns. I want to calculate the function below for each column separately and write the results to a DataFrame.
My data's columns:
df3.columns
Index(['R_26', 'R_31', 'R_38', 'R_65', 'R_71', 'R_86', 'R_25', 'R_63', 'R_59',
'R_19', 'R_35', 'R_84', 'R_24', 'R_68', 'S_15', 'R_85', 'R_57', 'R_22',
'R_30', 'R_15', 'R_16', 'R_69', 'S_16', 'R_6', 'R_87', 'R_40', 'R_20',
'R_17', 'R_18', 'R_21', 'R_28', 'S_9', 'R_33', 'R_56', 'S_10', 'R_7',
'S_8', 'R_29', 'R_1', 'R_66', 'S_18', 'S_6', 'R_64', 'R_34', 'R_37',
'R_3', 'R_54', 'R_67', 'S_22', 'R_13', 'R_48', 'S_11', 'R_58', 'S_23',
'S_3', 'S_4', 'R_60', 'S_7', 'R_32', 'S_5', 'R_51', 'R_8', 'R_10',
'R_9', 'S_14', 'R_62', 'S_17', 'S_21', 'R_14', 'R_55', 'R_2', 'R_50',
'R_49', 'R_53', 'FRAUD'],
dtype='object')
My function (shown for just one column, 'R_26'):
df4 = df3[df3.R_26 == 1]
Sm = df4.R_26.sum()
Fr = df4.FRAUD.sum()
Rate = Fr / Sm
A sample of the DataFrame I want:
Column  Rate
R_26    0.15
R_31    0.45
...     ...
You can use DataFrame.melt to unpivot, then filter the rows where the value is 1 with DataFrame.query, aggregate with sum, create the Rate column with DataFrame.assign (using Series.div for the division and DataFrame.pop to extract and drop the helper columns), and finally convert the index to a column with DataFrame.reset_index:
df = (df3.melt('FRAUD')
         .query('value == 1')
         .groupby('variable')
         .sum()
         .assign(Rate = lambda x: x.pop('FRAUD').div(x.pop('value')))
         .reset_index())
print(df)
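To check the result on something small, here is a minimal sketch. The toy df3 below is made up for illustration and assumes every non-FRAUD column holds 0/1 indicators. It passes var_name='Column' to melt so the output column is named as in the question (a small variation on the code above), and compares the result with a plain loop that repeats the question's single-column calculation:

```python
import pandas as pd

# Hypothetical toy data: two 0/1 indicator columns plus the FRAUD flag.
df3 = pd.DataFrame({
    "R_26":  [1, 1, 0, 1, 0],
    "R_31":  [0, 1, 1, 0, 0],
    "FRAUD": [1, 0, 1, 0, 0],
})

# Melt-based version; var_name names the unpivoted column 'Column'.
df = (df3.melt("FRAUD", var_name="Column")
         .query("value == 1")
         .groupby("Column")
         .sum()
         .assign(Rate=lambda x: x.pop("FRAUD").div(x.pop("value")))
         .reset_index())

# Cross-check: the question's calculation, repeated for each column.
rates = {}
for col in df3.columns.drop("FRAUD"):
    hits = df3[df3[col] == 1]          # rows where this column is 1
    rates[col] = hits["FRAUD"].sum() / hits[col].sum()
check = pd.Series(rates, name="Rate").rename_axis("Column").reset_index()

print(df)     # R_26 -> 1/3, R_31 -> 1/2
print(check)  # same rates from the loop
```

One difference to keep in mind: a column that never contains a 1 disappears from the melt-based result because query filters all of its rows out, while the loop keeps it with a NaN rate.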