Tags: python, pandas, lambda, python-applymap

Pandas applymap method with passing column name as parameter


I want to use the applymap method with a somewhat complex function on the dataset below.

value1 value2 value3 value4 value5  people
   147    119     69     92    106   533.0
    31     20     12     14     26   103.0
    37     22     24     18     19   120.0
    10     13      7     13     10    53.0
    38     48     18     30     27   161.0
   401    409    168    354    338  1670.0
   109     92     55     82     69   407.0
     5      9      7     11      9    41.0
    44     36     21     48     28   177.0
    59     40     19     38     27   183.0
     8      9      1      7     10    35.0

The people column is the sum of the value columns. I want to replace each value with its percentage of the row total. For example, in the first row value1 is 147 and the row total is 533, so I want to replace 147 with (147/533)*100.

I think it should look something like this, but I couldn't make it work:

df.loc[:, 'value1':'value5'] = df.loc[:, 'value1':'value5'].applymap(lambda x: (x / df['people'])*100)

Solution

  • applymap processes each value of a DataFrame elementwise, so the function receives one scalar at a time.
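
To see why the attempt in the question cannot work, here is a minimal sketch of what the lambda computes for a single cell. It uses only the first three rows of the question's data, and the variable names are illustrative. It does not call applymap itself, partly because pandas 2.1 and later deprecate DataFrame.applymap in favor of DataFrame.map.

```python
import pandas as pd

# First three rows of the question's data, used only for illustration.
df = pd.DataFrame({
    "value1": [147, 31, 37],
    "value2": [119, 20, 22],
    "value3": [69, 12, 24],
    "value4": [92, 14, 18],
    "value5": [106, 26, 19],
    "people": [533.0, 103.0, 120.0],
})

# An elementwise function is called once per cell, so x is a single number.
x = df.loc[0, "value1"]  # the scalar 147

# Dividing that scalar by the whole 'people' column produces a Series with
# one entry per row, not the single percentage that the cell needs.
per_cell_result = x / df["people"] * 100

print(type(per_cell_result).__name__)
print(len(per_cell_result))
```

Only the first entry of that Series is the percentage wanted for this cell. The function has no way of knowing which row the scalar came from, which is why a row-aligned operation fits better.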

    A better approach is a vectorized solution with DataFrame.div:

    df.loc[:, 'value1':'value5'] = df.loc[:, 'value1':'value5'].div(df['people'], axis=0) * 100
    print (df)
           value1     value2     value3     value4     value5  people
    0   27.579737  22.326454  12.945591  17.260788  19.887430   533.0
    1   30.097087  19.417476  11.650485  13.592233  25.242718   103.0
    2   30.833333  18.333333  20.000000  15.000000  15.833333   120.0
    3   18.867925  24.528302  13.207547  24.528302  18.867925    53.0
    4   23.602484  29.813665  11.180124  18.633540  16.770186   161.0
    5   24.011976  24.491018  10.059880  21.197605  20.239521  1670.0
    6   26.781327  22.604423  13.513514  20.147420  16.953317   407.0
    7   12.195122  21.951220  17.073171  26.829268  21.951220    41.0
    8   24.858757  20.338983  11.864407  27.118644  15.819209   177.0
    9   32.240437  21.857923  10.382514  20.765027  14.754098   183.0
    10  22.857143  25.714286   2.857143  20.000000  28.571429    35.0
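
For anyone who wants to reproduce this, the following is a self-contained sketch that rebuilds the question's table and applies the same DataFrame.div idea. The `cols` list and the assignment through `df[cols]` are my own variation, not part of the answer: replacing whole columns this way avoids writing floats into integer columns through .loc, which newer pandas versions may warn about or reject (an assumption that depends on your pandas version).

```python
import pandas as pd

# The dataset from the question.
df = pd.DataFrame({
    "value1": [147, 31, 37, 10, 38, 401, 109, 5, 44, 59, 8],
    "value2": [119, 20, 22, 13, 48, 409, 92, 9, 36, 40, 9],
    "value3": [69, 12, 24, 7, 18, 168, 55, 7, 21, 19, 1],
    "value4": [92, 14, 18, 13, 30, 354, 82, 11, 48, 38, 7],
    "value5": [106, 26, 19, 10, 27, 338, 69, 9, 28, 27, 10],
    "people": [533.0, 103.0, 120.0, 53.0, 161.0, 1670.0,
               407.0, 41.0, 177.0, 183.0, 35.0],
})

# Hypothetical helper list naming the columns to convert.
cols = ["value1", "value2", "value3", "value4", "value5"]

# axis=0 aligns the 'people' Series with the row index, so every row is
# divided by its own total; the result replaces the original columns.
df[cols] = df[cols].div(df["people"], axis=0) * 100

# Sanity check: the percentages in each row should add up to 100.
row_totals = df[cols].sum(axis=1)
print(df.round(2))
```

Because people is the exact sum of the five value columns in every row, `row_totals` is a quick way to confirm that the division was aligned by row rather than by column.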
    

    Another solution uses NumPy broadcasting:

    df.loc[:, 'value1':'value5'] = (df.loc[:, 'value1':'value5'].values / 
                                         df['people'].values[:, None] * 100)
    print (df)
           value1     value2     value3     value4     value5  people
    0   27.579737  22.326454  12.945591  17.260788  19.887430   533.0
    1   30.097087  19.417476  11.650485  13.592233  25.242718   103.0
    2   30.833333  18.333333  20.000000  15.000000  15.833333   120.0
    3   18.867925  24.528302  13.207547  24.528302  18.867925    53.0
    4   23.602484  29.813665  11.180124  18.633540  16.770186   161.0
    5   24.011976  24.491018  10.059880  21.197605  20.239521  1670.0
    6   26.781327  22.604423  13.513514  20.147420  16.953317   407.0
    7   12.195122  21.951220  17.073171  26.829268  21.951220    41.0
    8   24.858757  20.338983  11.864407  27.118644  15.819209   177.0
    9   32.240437  21.857923  10.382514  20.765027  14.754098   183.0
    10  22.857143  25.714286   2.857143  20.000000  28.571429    35.0
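
The `[:, None]` indexing is what makes the broadcasting work, so a small sketch of the shapes involved may help. It uses the first two rows of the question's data as plain arrays; the variable names are illustrative only.

```python
import numpy as np

# First two rows of the value columns and their row totals.
values = np.array([[147, 119, 69, 92, 106],
                   [31, 20, 12, 14, 26]])
people = np.array([533.0, 103.0])

# people has shape (2,). Indexing with [:, None] adds a trailing axis and
# turns it into a (2, 1) column, which broadcasts across the 5 columns.
column = people[:, None]

percentages = values / column * 100

print(values.shape, people.shape, column.shape)
print(percentages.shape)
```

Without the extra axis, NumPy would try to match the length-2 array against the last dimension of `values` (length 5) and raise a broadcasting error.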
    

    If you want something similar to applymap, you can use apply, but the solutions above are faster:

    df.loc[:, 'value1':'value5'] = (df.loc[:, 'value1':'value5']
                                       .apply(lambda x: (x / df['people'])*100))
    print (df)
           value1     value2     value3     value4     value5  people
    0   27.579737  22.326454  12.945591  17.260788  19.887430   533.0
    1   30.097087  19.417476  11.650485  13.592233  25.242718   103.0
    2   30.833333  18.333333  20.000000  15.000000  15.833333   120.0
    3   18.867925  24.528302  13.207547  24.528302  18.867925    53.0
    4   23.602484  29.813665  11.180124  18.633540  16.770186   161.0
    5   24.011976  24.491018  10.059880  21.197605  20.239521  1670.0
    6   26.781327  22.604423  13.513514  20.147420  16.953317   407.0
    7   12.195122  21.951220  17.073171  26.829268  21.951220    41.0
    8   24.858757  20.338983  11.864407  27.118644  15.819209   177.0
    9   32.240437  21.857923  10.382514  20.765027  14.754098   183.0
    10  22.857143  25.714286   2.857143  20.000000  28.571429    35.0
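
The reason apply works where applymap does not is that apply, with its default axis=0, passes each column to the function as a whole Series. A minimal sketch on the first three rows of the question's data follows, checking that it matches the DataFrame.div result. The names `cols`, `via_apply` and `via_div` are illustrative.

```python
import pandas as pd

# First three rows of the question's data, used only for illustration.
df = pd.DataFrame({
    "value1": [147, 31, 37],
    "value2": [119, 20, 22],
    "value3": [69, 12, 24],
    "value4": [92, 14, 18],
    "value5": [106, 26, 19],
    "people": [533.0, 103.0, 120.0],
})
cols = ["value1", "value2", "value3", "value4", "value5"]

# apply (default axis=0) hands the lambda one whole column at a time, so x
# is a Series that divides against df["people"] row by row via the index.
via_apply = df[cols].apply(lambda x: x / df["people"] * 100)

# The vectorized form from the first solution, for comparison.
via_div = df[cols].div(df["people"], axis=0) * 100

# Raises an AssertionError if the two results differ.
pd.testing.assert_frame_equal(via_apply, via_div)
print(via_apply.round(2))
```

Both forms give the same numbers; the vectorized version simply avoids calling a Python function once per column.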