Tags: python-3.x, string, pandas, try-catch, empty-list

Skipping an empty list and continuing with a function


Background

import pandas as pd

Names = [['Jon', 'Smith', 'jon', 'John'],
         [],
         ['Bob', 'bobby', 'Bobs']]
df = pd.DataFrame({'Text': ['Jon J Smith is Here and jon John from ',
                            '',
                            'I like Bob and bobby and also Bobs diner '],
                   'P_ID': [1, 2, 3],
                   'P_Name': Names})

#rearrange columns
df = df[['Text', 'P_ID', 'P_Name']]
df

    Text                                      P_ID  P_Name
0   Jon J Smith is Here and jon John from       1   [Jon, Smith, jon, John]
1                                               2   []
2   I like Bob and bobby and also Bobs diner    3   [Bob, bobby, Bobs]

Goal

I would like to use the following function

df['new']=df.Text.replace(df.P_Name,'**BLOCK**',regex=True) 

but skip the second row (P_ID 2), since its P_Name is an empty list []

Tried

I have tried the following

try:
    df['new']=df.Text.replace(df.P_Name,'**BLOCK**',regex=True) 
except ValueError:
    pass

But the new column is never created, and I get the following output:

    Text                                      P_ID  P_Name
0   Jon J Smith is Here and jon John from       1   [Jon, Smith, jon, John]
1                                               2   []
2   I like Bob and bobby and also Bobs diner    3   [Bob, bobby, Bobs]

Desired Output

    Text  P_ID  P_Name  new
0                       **BLOCK** J **BLOCK** is Here and **BLOCK** **BLOCK** from
1                   []
2                       I like **BLOCK** and **BLOCK** and also **BLOCK** diner

Question

How do I get my desired output by skipping row 2 and continuing with my function?


Solution

  • Locate the rows which do not have an empty list and use your replace method only on those rows:

    # Boolean mask: True for rows whose P_Name list is not empty
    m = df['P_Name'].str.len().ne(0)

    # Apply the replacement only to the masked rows; skipped rows get NaN
    df.loc[m, 'New'] = df.loc[m, 'Text'].replace(df.loc[m, 'P_Name'], '**BLOCK**', regex=True)
    

    Output

                                            Text  P_ID                   P_Name                                                  New
    0     Jon J Smith is Here and jon John from      1  [Jon, Smith, jon, John]  **BLOCK** J **BLOCK** is Here and **BLOCK** **BLOCK** from 
    1                                                2                       []                                                  NaN
    2  I like Bob and bobby and also Bobs diner      3       [Bob, bobby, Bobs]  I like **BLOCK** and **BLOCK** and also **BLOCK**s diner
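
A row-wise alternative, offered only as a sketch and not as the method used above: the output shows `**BLOCK**s diner` because `Bob` is matched before `Bobs`, whereas the desired output has `**BLOCK** diner`. One way to get closer to that is to build a single pattern per row with the longest names first and call `re.sub` directly. The helper name `block_names` and its `token` parameter are hypothetical, introduced here for illustration. The explicit empty-list check matters, because an empty pattern would match at every position in the text.

```python
import re

def block_names(text, names, token="**BLOCK**"):
    """Replace every name from `names` that occurs in `text` with `token`.

    Hypothetical helper, not part of pandas. An empty `names` list
    returns `text` unchanged instead of raising.
    """
    if not names:
        return text
    # Longest names first, so "Bobs" is tried before its prefix "Bob".
    ordered = sorted(names, key=len, reverse=True)
    # Escape each name so it is matched literally, then join as alternatives.
    pattern = "|".join(re.escape(name) for name in ordered)
    return re.sub(pattern, token, text)

# Same data as the question, as plain lists.
texts = [
    "Jon J Smith is Here and jon John from ",
    "",
    "I like Bob and bobby and also Bobs diner ",
]
names = [["Jon", "Smith", "jon", "John"], [], ["Bob", "bobby", "Bobs"]]

new = [block_names(t, n) for t, n in zip(texts, names)]
```

With the DataFrame from the question, this could be applied per row, for example `df['new'] = df.apply(lambda r: block_names(r['Text'], r['P_Name']), axis=1)`. Unlike the masked assignment, the row with the empty list keeps its original text rather than becoming NaN.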