Tags: python, pandas, parsing, eval

Using user-defined functions in pandas eval


I'm trying to use a custom function within pandas eval. It works for a simple case:

from pandas import DataFrame

basic_df = DataFrame({"A": [1, 2, 3, 4, 5], "B": [20, 40, 60, 100, 90],
                      "C": ["C1", "C2", "C3", "C4", "C5"],
                      })
def str_parse(element) -> str:
    return str(element)

print(basic_df.eval("@str_parse(A+B+100)"))

But whenever I try to append a static string (string plus string), it returns the following result:

basic_df.eval("@str_parse(A+B+100) + \"additional string\"",)

0    121
1    142
2    163
3    204
4    195
dtype: int64additional string.

How can I append a string to the function's result while creating an additional column?


Solution

  • First, return a new Series of strings from str_parse() (not just the string representation of the whole Series). Then you can call .__add__() to append the additional string (for some reason, a plain + doesn't work inside eval):

    import pandas as pd

    basic_df = pd.DataFrame(
        {
            "A": [1, 2, 3, 4, 5],
            "B": [20, 40, 60, 100, 90],
            "C": ["C1", "C2", "C3", "C4", "C5"],
        }
    )
    
    
    def str_parse(series):
        return series.astype(str)
    
    
    print(basic_df.eval("new_col = @str_parse(A+B+100).__add__('additional string')"))
    

    Prints:

       A    B   C               new_col
    0  1   20  C1  121additional string
    1  2   40  C2  142additional string
    2  3   60  C3  163additional string
    3  4  100  C4  204additional string
    4  5   90  C5  195additional string
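
To make the difference between the two versions of str_parse() concrete, here is a minimal sketch that compares str() on a Series with .astype(str). It also shows one possible alternative that does the concatenation outside eval using DataFrame.assign(); that alternative is not part of the answer above, and the names total, as_repr, as_strings and result are only illustrative.

```python
import pandas as pd

basic_df = pd.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "B": [20, 40, 60, 100, 90],
        "C": ["C1", "C2", "C3", "C4", "C5"],
    }
)

# Same arithmetic as in the eval expression, written with plain pandas.
total = basic_df["A"] + basic_df["B"] + 100

# str() on a Series returns ONE Python string: the printed form of the
# whole Series, including the index and the dtype line.
as_repr = str(total)
print(type(as_repr).__name__)

# astype(str) converts element by element and keeps a Series of the
# same length, so string concatenation applies to each row.
as_strings = total.astype(str)
print(as_strings.tolist())

# Illustrative alternative (an assumption, not the answer's method):
# build the column outside eval. assign() returns a new DataFrame and
# leaves basic_df unchanged.
result = basic_df.assign(new_col=as_strings + "additional string")
print(result)
```

This explains the output in the question: the original str_parse() returned the printed Series as a single string, so the extra text was glued onto the end of that one string instead of onto each row.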