Tags: pandas, loops, for-loop, iteration, calculated-columns

Same function for all columns in DataFrame


My data contains 75 columns. I want to apply the calculation below to each column separately and collect the results in a DataFrame.

The columns of my data:

df3.columns

Index(['R_26', 'R_31', 'R_38', 'R_65', 'R_71', 'R_86', 'R_25', 'R_63', 'R_59',
   'R_19', 'R_35', 'R_84', 'R_24', 'R_68', 'S_15', 'R_85', 'R_57', 'R_22',
   'R_30', 'R_15', 'R_16', 'R_69', 'S_16', 'R_6', 'R_87', 'R_40', 'R_20',
   'R_17', 'R_18', 'R_21', 'R_28', 'S_9', 'R_33', 'R_56', 'S_10', 'R_7',
   'S_8', 'R_29', 'R_1', 'R_66', 'S_18', 'S_6', 'R_64', 'R_34', 'R_37',
   'R_3', 'R_54', 'R_67', 'S_22', 'R_13', 'R_48', 'S_11', 'R_58', 'S_23',
   'S_3', 'S_4', 'R_60', 'S_7', 'R_32', 'S_5', 'R_51', 'R_8', 'R_10',
   'R_9', 'S_14', 'R_62', 'S_17', 'S_21', 'R_14', 'R_55', 'R_2', 'R_50',
   'R_49', 'R_53', 'FRAUD'],
  dtype='object')

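My calculation (shown for just one sample column, 'R_26'):

df4 = df3[df3.R_26 == 1]   # keep only rows where R_26 is 1
Sm = df4.R_26.sum()        # number of rows with R_26 == 1
Fr = df4.FRAUD.sum()       # number of those rows flagged as FRAUD
Rate = Fr / Sm             # fraud rate among rows with R_26 == 1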

A sample of the DataFrame I want as output:


Column  Rate
R_26    0.15
R_31    0.45
...     ...


Solution

  • You can unpivot the data with DataFrame.melt, keep only the rows whose value is 1 with DataFrame.query, and aggregate with sum per original column. Then create the Rate column in DataFrame.assign by dividing with Series.div, using DataFrame.pop to extract (and remove) the FRAUD and value columns, and finally convert the index to a column with DataFrame.reset_index:

    df = (df3.melt('FRAUD')
            .query('value == 1')
            .groupby('variable')
            .sum()
            .assign(Rate = lambda x: x.pop('FRAUD').div(x.pop('value')))
            .reset_index())
    print(df)
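
To check the chain above on something small, here is a minimal sketch that runs it on toy data and compares it with a plain loop over the columns. The toy values and the helper names `fraud_rates_melt` and `fraud_rates_loop` are assumptions made for illustration; they are not part of the question's data. Renaming `variable` to `Column` is also an addition, made only to match the output layout the question asks for.

```python
import pandas as pd

# Toy data (hypothetical values): two 0/1 flag columns plus the FRAUD label.
df3 = pd.DataFrame({
    "R_26":  [1, 1, 0, 1, 0],
    "R_31":  [0, 1, 1, 0, 0],
    "FRAUD": [1, 0, 1, 0, 0],
})


def fraud_rates_melt(frame):
    """Same chain as in the answer, wrapped in a function."""
    return (frame.melt("FRAUD")            # long format: FRAUD, variable, value
                 .query("value == 1")      # keep rows where the flag is 1
                 .groupby("variable")
                 .sum()                    # per column: sum of FRAUD and of value
                 .assign(Rate=lambda x: x.pop("FRAUD").div(x.pop("value")))
                 .reset_index()
                 .rename(columns={"variable": "Column"}))


def fraud_rates_loop(frame):
    """The question's single-column calculation repeated for every column."""
    rows = []
    for col in frame.columns.drop("FRAUD"):
        subset = frame[frame[col] == 1]    # rows where this flag is 1
        rate = subset["FRAUD"].sum() / subset[col].sum()
        rows.append({"Column": col, "Rate": rate})
    return pd.DataFrame(rows)


print(fraud_rates_melt(df3))
print(fraud_rates_loop(df3))
```

On this toy data both versions should agree: R_26 is 1 in three rows, one of which is fraud, and R_31 is 1 in two rows, one of which is fraud. Two differences are worth keeping in mind: the melt version returns the columns sorted by name because of `groupby`, and a column that never equals 1 is dropped by the `query` step, whereas the loop would divide zero by zero for it.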