python pandas python-re

Applying function to pandas dataframe column after str.findall


I have a dataframe df with two columns:

Col1 Col2
'abc-def-ghi' 1
'abc-opq-rst' 2

I created a new column Col3 like this:

import re
df['Col3'] = df['Col1'].str.findall('abc', flags=re.IGNORECASE)

This gave me the following dataframe:

Col1 Col2 Col3
'abc-def-ghi' 1 [abc]
'abc-opq-rst' 2 [abc]

What I want to do now is create a new column Col4 that is 1 if Col3 contains 'abc' and 0 otherwise.

I tried to do this with a function:

def f(row):
    if row['Col3'] == '[abc]':
        val = 1
    else:
        val = 0
    return val

And applied this to my pandas dataframe:

df['Col4'] = df.apply(f, axis=1)

But I only get 0, even in rows where Col3 contains 'abc'. I think there is something wrong with my if-statement. How can I solve this?


Solution

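  • str.findall returns a Python list for each row (e.g. ['abc']), not the string '[abc]', so the comparison in f is always False. An empty list is falsy and a non-empty list is truthy, so just do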

    df['Col4'] = df.Col3.astype(bool).astype(int)
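
For reference, here is a minimal end-to-end sketch of that approach. The third row ('xyz-opq-rst') is a hypothetical addition, not part of the question's data, included only so that a non-matching case shows up. The variable col4_alt is likewise just an illustrative name for an alternative that skips Col3 entirely by using str.contains.

```python
import re

import pandas as pd

# Sample data from the question, plus one hypothetical non-matching row.
df = pd.DataFrame({
    "Col1": ["abc-def-ghi", "abc-opq-rst", "xyz-opq-rst"],
    "Col2": [1, 2, 3],
})

# Each cell of Col3 is a list of matches (possibly empty), not a string.
df["Col3"] = df["Col1"].str.findall("abc", flags=re.IGNORECASE)

# This is why the original check never fires: a list never equals a string.
list_equals_string = df.loc[0, "Col3"] == "[abc]"

# Empty list -> False -> 0, non-empty list -> True -> 1.
df["Col4"] = df["Col3"].astype(bool).astype(int)

# Alternative (assumes Col3 is not needed for anything else):
# test Col1 directly for a match.
col4_alt = df["Col1"].str.contains("abc", flags=re.IGNORECASE).astype(int)

print(df)
```

If the intermediate list column is needed later anyway, the astype(bool) route is the shorter one; if only the 0/1 flag matters, str.contains avoids building the lists at all.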