Tags: python, pandas, regex, lambda, replace

Value replacement in Python pandas


I need to replace each cell containing a value like number1(number2) with number2 (the value inside the parentheses). For example: 56(3) -> 3, 33(5) -> 5.

These values can appear in different columns.

The problem is that with the pandas function

df.replace(to_replace=..., value=...)

I cannot use a value that depends on the matched string.

I was trying something like:

df.replace(to_replace='[0-9]+([0-9]+)', value=lambda x: int(x.split("(")[1].strip(")")), regex=True)

But the lambda function doesn't work. Suggestions?


Solution

  • Have you tried df.apply instead?

    A caveat is that calling apply on a DataFrame passes an entire column (or an entire row, with axis=1) to the lambda function rather than a single cell, so you will have to apply it column by column, like this:

    for col in df.columns:
        df[col] = df[col].apply(<insert lambda function here>)
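A minimal sketch of the per-column apply idea above, under some assumptions: the helper name extract_inner, the pattern and the sample DataFrame are illustrative and not from the original post. Cells that do not look like number1(number2) are left untouched. The last lines show a possible alternative that stays with df.replace: with regex=True the value can, as far as I know, contain a backreference such as \1, though the results remain strings rather than ints.

```python
import re

import pandas as pd

# Hypothetical pattern: digits, then digits inside parentheses (captured).
PATTERN = re.compile(r"\d+\((\d+)\)")


def extract_inner(cell):
    """Return number2 as an int for 'number1(number2)' strings, else the cell unchanged."""
    if isinstance(cell, str):
        match = PATTERN.fullmatch(cell.strip())
        if match:
            return int(match.group(1))
    return cell


# Illustrative data; object dtype so a column can mix strings and numbers.
data = {"a": ["56(3)", "12", "33(5)"], "b": ["x", "7(9)", 4]}
df = pd.DataFrame(data, dtype=object)

# Series.apply calls the function once per cell, so loop over the columns.
for col in df.columns:
    df[col] = df[col].apply(extract_inner)

# Alternative sketch: keep df.replace and use a backreference in the value.
# The matched cells stay strings here, so convert them afterwards if needed.
df_alt = pd.DataFrame(data, dtype=object).replace(
    to_replace=r"^\d+\((\d+)\)$", value=r"\1", regex=True
)
```

The apply version is more verbose, but it lets the replacement be any Python expression computed from the matched text, which is what the lambda in the question was trying to do.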