
How to map a function using multiple columns in pandas?


I've checked out map, apply, applymap, and combine, but can't seem to find a simple way of doing the following:

I have a dataframe with 10 columns. I need to pass three of them into a function that takes scalars and returns a scalar ...

some_func(int a, int b, int c) returns int d

I want to apply this and create a new column in the dataframe with the result.

df['d'] = some_func(a = df['a'], b = df['b'], c = df['c'])

All the solutions that I've found suggest rewriting some_func to work with Series instead of scalars, but that is not possible because it is part of another package. How do I elegantly do the above?


Solution

  • Use pd.DataFrame.apply(), as below:

    df['d'] = df.apply(lambda x: some_func(a=x['a'], b=x['b'], c=x['c']), axis=1)
    

    NOTE: As @ashishsingal asked about columns: because some_func needs values from several columns of the same row, pass axis=1. The default, axis=0, applies the function to each column instead (see the documentation excerpt copied below).

    axis : {0 or ‘index’, 1 or ‘columns’}, default 0

    • 0 or ‘index’: apply function to each column
    • 1 or ‘columns’: apply function to each row
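
To see the row-wise apply end to end, here is a minimal sketch. The body of some_func (a * b + c), the sample data, and the extra column e are made up for illustration; the real some_func lives in the other package and is only assumed to take three scalars and return one. The d_zip column shows a possible alternative that also calls the scalar function once per row without going through apply.

```python
import pandas as pd


def some_func(a, b, c):
    # Hypothetical stand-in for the third-party function: scalars in, scalar out.
    return a * b + c


# Small example frame; column "e" stands in for the columns that are not needed.
df = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [4, 5, 6],
    "c": [7, 8, 9],
    "e": ["x", "y", "z"],
})

# axis=1 passes each row to the lambda as a Series indexed by column name,
# so row["a"], row["b"], row["c"] are scalars.
df["d"] = df.apply(
    lambda row: some_func(a=row["a"], b=row["b"], c=row["c"]), axis=1
)

# Alternative sketch: walk the three columns in parallel with zip.
df["d_zip"] = [
    some_func(a, b, c) for a, b, c in zip(df["a"], df["b"], df["c"])
]

print(df)
```

Both columns should hold the same values. On large frames the zip version may be faster, since it avoids building a Series for every row, but that is worth measuring on your own data.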