Applying a function to DataFrame rows based on column names


Let's say I have a DataFrame df with many columns, and several functions fn(...) that can be applied to those columns to compute different results. The column names match the argument names of the functions:

import pandas as pd

def f1(A, C):
    return A + C

def f2(B, C):
    return C - B

df = pd.DataFrame([[1,2,3],[4,5,6],[7,8,9]], columns = ['A','B','C'])

To apply a function to each row, I can use the following line of code:

df['result'] = df.apply(lambda row: f1(row['A'], row['C']), axis = 'columns')

However, since different functions take different input arguments, it is tedious to spell them out explicitly each time. I am looking for a way to have apply work out which value to pass to the function based on the column name. In R, the above line is much shorter, and I find this extremely convenient:

df$result = pmap(df, f1)

Is there anything similar in Python?

By the way, I know that apply is not recommended, but the functions are so complex that they are not vectorizable anyway (at least for me), and the function definitions above make debugging much easier.


Solution

  • Following chrslg's answer, I think the shortest way to accomplish this is to unpack row into keyword arguments and rewrite f1 so that unused columns are absorbed by **kwargs:

    import pandas as pd
    
    def f1(A, C, **kwargs):
        return A + C
    
    df = pd.DataFrame([[1,2,3],[4,5,6],[7,8,9]], columns = ['A','B','C'])
    
    df['result'] = df.apply(lambda row: f1(**row), axis = 'columns')
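
If you would rather not change the signatures of f1 and f2, one possible alternative is to read the argument names from the function itself with inspect.signature and pass only those columns. The helper name apply_by_name below is made up for this illustration and is not part of pandas. This is a minimal sketch that assumes every parameter of the function corresponds to a column with a string name:

```python
import inspect

import pandas as pd


def f1(A, C):
    return A + C


def f2(B, C):
    return C - B


def apply_by_name(df, func):
    """Call func once per row, passing only the columns named in its signature.

    Hypothetical helper for illustration; not a pandas API.
    """
    # Parameter names of func, e.g. ['A', 'C'] for f1
    params = list(inspect.signature(func).parameters)
    # Keep only those columns; raises KeyError if one of them is missing
    subset = df[params]
    # Each row is a Series keyed by column name, so it can be unpacked with **
    return subset.apply(lambda row: func(**row), axis="columns")


df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=["A", "B", "C"])

df["result_f1"] = apply_by_name(df, f1)  # 4, 10, 16
df["result_f2"] = apply_by_name(df, f2)  # 1, 1, 1
```

With this, the call site stays close to the R version (apply_by_name(df, f1) versus pmap(df, f1)), and the functions keep their original signatures, which preserves the debugging convenience mentioned in the question. Like the **kwargs variant, it still calls the function row by row, so it is no faster than a plain apply.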