Tags: python, python-3.x, regex, pandas, extract

Python Regex for a Specific Word


I have the texts below in a column called action.

"Why don't you clean this table : J$CLAB"
"http("J$MANG.create"): 21/01/06 23:24:05 INFO"

I would like to extract the word that starts with J$ and runs to the end of that word (e.g. J$MANG), and add it to a new column.

Here is what I have tried so far; it does not work as needed:

file['fileName'] = [re.split(r'[^J$A-Z\.$]|[^J$A-Z\s$]', val.strip()) for val in file['action']]
file['fileName']  = [' '.join(val) for val in file['fileName']]

Any suggestions? Thanks.


Solution

  • You can use Series.str.extract:

    file['fileName']  = file['action'].str.extract(r'\b(J\$\w*)', expand=False) 
    

    Details:

    • \b - a word boundary
    • (J\$\w*) - Group 1: J, then a literal $ (escaped as \$), then zero or more word chars (letters, digits or underscores).
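
To see how the pieces of the pattern behave, it can be tried outside pandas with the standard re module. This is only a minimal sketch: the helper name extract_name and the last two sample strings are made up for illustration, and re.search is used here because str.extract likewise keeps only the first match per row.

```python
import re

# Same pattern as in the answer: word boundary, J, a literal $, then word chars.
PATTERN = re.compile(r'\b(J\$\w*)')

def extract_name(text):
    """Return the first J$... token in text, or None if there is none.

    Hypothetical helper, not part of the original answer.
    """
    match = PATTERN.search(text)
    return match.group(1) if match else None

samples = [
    "Why don't you clean this table : J$CLAB",
    'http("J$MANG.create"): 21/01/06 23:24:05 INFO',
    "no token in this row",  # made-up row: nothing to extract
    "XJ$SKIP",               # made-up row: no word boundary before J
]

results = [extract_name(s) for s in samples]
print(results)  # ['J$CLAB', 'J$MANG', None, None]
```

The match stops at the first non-word character, which is why J$MANG.create yields J$MANG. In pandas, rows without a match come back as NaN rather than None.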