python, pandas, apply

How to create a pandas function that tests if two strings exist in multiple pandas columns


I'm trying to write a function that scans a set of columns and tests whether two conditions hold across two different columns. I'll then apply this function to my dataframe.

import pandas as pd

pattern_one = 'Test1|Exam12'

df = pd.DataFrame({'Project Code': ['32132', '212132'], 'Task Name': ['Test1', 'Test13']})

def data_filter(project_code, task_name):
    # Both arguments are whole columns (Series); .any() reduces each
    # test to a single boolean for the entire column, not one per row.
    if (project_code.str.contains('32132').any() and
            task_name.str.contains(pattern_one, regex=True).any()):
        return '45%'
    else:
        return '0%'

Dataframe looks like this:

Project Code  Task Name
32132         Test1
212132        Test13

Solution

  • You don't need a function: use np.where with the combined row-wise condition, the value to use where it is true, and the value to use where it is false.

    import pandas as pd
    import numpy as np
    
    df = pd.DataFrame({"Project Code": ["32132", "212132"],
                       "Task Name" : ["Test1", "Test13"]
                       })
    
    df['out'] = np.where((df["Project Code"].str.contains('32132')) &
                         (df["Task Name"].str.contains('Test1|Exam12')), "45%", "0%")
    
    print(df)
    

    gives:

      Project Code Task Name  out
    0        32132     Test1  45%
    1       212132    Test13   0%
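
If a reusable function is still wanted, the same np.where logic can be wrapped in one. The following is only a minimal sketch: the function name `flag_rows`, its parameters, and the default values are hypothetical and not part of the answer above. It keeps the substring semantics of str.contains, so "Test13" also matches the pattern "Test1"; the second row ends up as "0%" only because its project code does not contain "32132".

```python
import numpy as np
import pandas as pd


def flag_rows(df, code, task_pattern, hit="45%", miss="0%"):
    """Return a Series holding `hit` where both columns match, else `miss`.

    Hypothetical helper for illustration; column names are taken from the question.
    """
    # Row-wise boolean masks; regex=False treats `code` as a literal substring.
    code_match = df["Project Code"].str.contains(code, regex=False)
    task_match = df["Task Name"].str.contains(task_pattern, regex=True)
    # np.where picks `hit` or `miss` element by element.
    return pd.Series(np.where(code_match & task_match, hit, miss), index=df.index)


df = pd.DataFrame({"Project Code": ["32132", "212132"],
                   "Task Name": ["Test1", "Test13"]})

df["out"] = flag_rows(df, "32132", "Test1|Exam12")
print(df)
```

If exact matches are required rather than substring matches, the masks could be built with `Series.eq` or `Series.isin` instead; whether that is what the question intends is an assumption.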