Pandas eval - call user defined function on columns


As the title says, I would like to call custom functions on a dataframe at run time through df.eval. The custom functions would, for example, calculate the difference between two dates (i.e. age), convert years to months, or find the max or min of two columns.

So far, I have succeeded in performing arithmetic operations and calling a few functions such as abs() and sqrt(), but I couldn't get min() or max() working. The things that work are:

df.eval('TT = sqrt(Q1)',inplace=True)
df.eval('TT1 = abs(Q1-Q2)',inplace=True)
df.eval('TT2 = (Q1+Q2)*Q3',inplace=True)

The following code works with Python's built-in eval. How can I do the same with DataFrame.eval?

import numpy as np

def find_max(x, y):
    # Element-wise maximum of two values or arrays
    return np.maximum(x, y)

eval('find_max')(4, 7)

def find_age(date_col1,date_col2):
    return 'I know how to calc age but how to call func this with df.eval and assign to new col'

Sample dataframe:

import pandas as pd

op_d = {'ID': [1, 2, 3], 'V': ['F', 'G', 'H'], 'AAA': [0, 1, 1], 'D': ['2019/12/04', '2019/02/01', '2019/01/01'], 'DD': ['2019-12-01', '2016-05-31', '2015-02-15'], 'CurrentRate': [7.5, 2, 2], 'NoteRate': [2, 3, 3], 'BBB': [0, 4, 4], 'Q1': [2, 8, 10], 'Q2': [3, 5, 7], 'Q3': [5, 6, 8]}
df = pd.DataFrame(data=op_d)

Any help or a link to the docs is appreciated.

Helpful links I found that do not address my issue:

Dynamic Expression Evaluation in pandas using pd.eval()

Using local variables with multiple assignments with pandas eval function

Passing arguments to python eval()


Solution

  • Functions can be called as usual; you just need to reference them with the @ symbol:

    df                                                                  
       A  B
    0  1  0
    1  0  0
    2  0  1
    
    def my_func(x, y): return x + y                                     
    
    df.eval('@my_func(A, B)')                                          
    0    1
    1    0
    2    1
    dtype: int64
    

    Of course, this assumes that your functions accept Series as arguments. If a function only handles scalars, wrap it in a call to np.vectorize.
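
    Applied to the question's data, this could look roughly like the sketch below. It uses only a subset of the sample columns. find_age_days, scalar_max and scalar_max_cols are hypothetical helpers made up for illustration (the question does not say how age should be computed, so days between the two date columns is an assumption), and the new column names MX, AGE_DAYS and MX2 are arbitrary.

```python
import numpy as np
import pandas as pd

# Subset of the sample data from the question.
df = pd.DataFrame({
    'D': ['2019/12/04', '2019/02/01', '2019/01/01'],
    'DD': ['2019-12-01', '2016-05-31', '2015-02-15'],
    'Q1': [2, 8, 10],
    'Q2': [3, 5, 7],
})

def find_max(x, y):
    # Element-wise maximum of two Series.
    return np.maximum(x, y)

def find_age_days(later, earlier):
    # Hypothetical helper: whole days between two columns of date strings.
    return (pd.to_datetime(later) - pd.to_datetime(earlier)).dt.days

def scalar_max(a, b):
    # Hypothetical scalar-only function; it cannot compare whole Series.
    return a if a > b else b

def scalar_max_cols(x, y):
    # np.vectorize applies scalar_max element by element; wrapping the
    # resulting array in a Series keeps the original index.
    return pd.Series(np.vectorize(scalar_max)(x, y), index=x.index)

# Without inplace=True, eval returns a new dataframe with the added column.
out = df.eval('MX = @find_max(Q1, Q2)')
out = out.eval('AGE_DAYS = @find_age_days(D, DD)')
out = out.eval('MX2 = @scalar_max_cols(Q1, Q2)')
print(out)
```

    Passing inplace=True, as in the question's working examples, would modify df directly instead of returning a copy.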