Select rows from dataframe with a certain character after a '-'


I have the following DataFrame:

import pandas as pd
test = pd.DataFrame({'A': 'A1-C-D-1 A22-C-D-22 A4-S-E-3'.split(),
                     'B': [1, 2, 3]})

I want to select the rows that have a certain character (for example 'E') right after the second '-'.

Any ideas would be very welcome!


Solution

  • Option 1
    Filter with str.split + str.startswith:

    test[test.A.str.split('-').str[2].str.startswith('E')]
    
              A  B
    2  A4-S-E-3  3
    

  • Option 2
    You can get inventive and use str.extract + pd.Series.notna/notnull here:

    test[test.A.str.extract('.*?-.*?-(E).*', expand=False).notna()]
    
              A  B
    2  A4-S-E-3  3
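
One caveat on the regex approach: because `.*?` can also consume '-' characters, the pattern in Option 2 will accept an 'E' that follows a later '-' as well (it happens not to matter for the sample data). If that matters for your data, a stricter variant is sketched below. It assumes every value in column A is a string, and the names `pattern` and `tricky` are made up for illustration only:

```python
import pandas as pd

test = pd.DataFrame({'A': 'A1-C-D-1 A22-C-D-22 A4-S-E-3'.split(),
                     'B': [1, 2, 3]})

# Two fields that contain no '-', each followed by '-', then 'E'.
# str.match anchors at the start of each string (like re.match),
# so the 'E' must come directly after the second '-'.
pattern = r'[^-]*-[^-]*-E'

result = test[test.A.str.match(pattern)]
print(result)

# Hypothetical value where 'E' only appears after the third '-'.
tricky = pd.Series(['A-B-C-E'])

# The lazy pattern from Option 2 still finds a match here ...
loose_hit = tricky.str.extract('.*?-.*?-(E).*', expand=False).notna()
# ... while the anchored pattern does not.
strict_hit = tricky.str.match(pattern)
```

If column A can contain missing values, passing `na=False` to `str.match` (or to `str.startswith` in Option 1) keeps the mask purely boolean so it can still be used for indexing.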