I have the following DataFrame:
import pandas as pd
test = pd.DataFrame({'A': 'A1-C-D-1 A22-C-D-22 A4-S-E-3'.split(),
                     'B': [1, 2, 3]})
I want to select the rows where column A has a certain character (for example 'E') immediately after the second '-'.
Any ideas would be very welcome!
Option 1
Filter with str.split + str.startswith:
test[test.A.str.split('-').str[2].str.startswith('E')]
          A  B
2  A4-S-E-3  3
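One caveat with Option 1: if a value in A has fewer than two '-' characters, .str[2] gives NaN for that row, and a mask containing NaN cannot be used for boolean indexing. A minimal sketch of one way around it, assuming such rows should simply be excluded (the extra row 'A5-E' is made up for illustration and is not in the original data):

```python
import pandas as pd

# Same data as in the question, plus a hypothetical row with only one '-'
test = pd.DataFrame({'A': ['A1-C-D-1', 'A22-C-D-22', 'A4-S-E-3', 'A5-E'],
                     'B': [1, 2, 3, 4]})

# Third '-'-separated field; NaN where the value has fewer than three fields
third = test.A.str.split('-').str[2]

# na=False turns the NaN entries into False so the mask is purely boolean
mask = third.str.startswith('E', na=False)
result = test[mask]
print(result)
```

With na=False the short row is dropped instead of raising an error.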
Option 2
You can get inventive and use str.extract + pd.Series.notna/notnull here:
test[test.A.str.extract('.*?-.*?-(E).*', expand=False).notna()]
          A  B
2  A4-S-E-3  3
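Note that the lazy .*? in the pattern above can still stretch across a '-', so it also accepts an 'E' that follows a later hyphen. If the 'E' must come right after the second '-' and nowhere else, a stricter variant might look like the sketch below. It swaps in str.match with [^-]* (which cannot cross a hyphen); the row 'A7-C-D-E9' is a made-up example added only to show the difference:

```python
import pandas as pd

# Same data as in the question, plus a hypothetical row with 'E' after the THIRD '-'
test = pd.DataFrame({'A': ['A1-C-D-1', 'A22-C-D-22', 'A4-S-E-3', 'A7-C-D-E9'],
                     'B': [1, 2, 3, 4]})

# Original pattern: the second lazy group can expand over 'C-D', so the last row matches too
loose = test.A.str.extract('.*?-.*?-(E).*', expand=False).notna()

# Stricter pattern: str.match anchors at the start, and [^-]* stops at each hyphen
strict = test.A.str.match(r'[^-]*-[^-]*-E', na=False)

print(test[loose])
print(test[strict])
```

On the original three rows both patterns select the same single row, so the difference only matters if the data can contain more than three fields.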