python, string, dataframe, substring, extract

How do you extract a substring from a column containing people's names and titles in the format "Myles, Mr. Thomas Francis" when you only want "Mr."?


I want to add the matched results as a new column of the dataframe, inside a Python function.

I tried using re.search():

for i in input_df["Name"]:
   Title[i] = re.search(".$",i)

I get a TypeError and am not sure how to write the pattern to get the desired result.


Solution

  • You could use str.extract here with the regex pattern \b[A-Z][a-z]+\.:

    input_df["Title"] = input_df["Name"].str.extract(r'\b([A-Z][a-z]+\.)')
    

    For a more sophisticated option, you could also use str.replace:

    input_df["Title"] = input_df["Name"].str.replace(r'^.*,\s+|\s+.*$', '', regex=True)
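To check the pattern outside pandas, here is a minimal sketch using plain `re` with the same regex as the str.extract call. The helper name `extract_title` and the sample names are assumptions for illustration, not part of the original question. Note that `re.search` returns a Match object (or None) rather than the matched text, and that it raises a TypeError when given a non-string value such as NaN.

```python
import re

# Same pattern as the str.extract call: a capitalized word followed by a period.
TITLE_PATTERN = re.compile(r"\b([A-Z][a-z]+\.)")


def extract_title(name):
    """Return the title (e.g. "Mr.") from a "Surname, Title. Given names" string, or None."""
    # Non-string values (e.g. NaN from missing data) would make re.search raise a TypeError.
    if not isinstance(name, str):
        return None
    match = TITLE_PATTERN.search(name)
    # re.search yields a Match object or None, so pull out the captured group explicitly.
    return match.group(1) if match else None


# Hypothetical sample names, for illustration only.
names = ["Myles, Mr. Thomas Francis", "Smith, Mrs. Jane", "No title here"]
titles = [extract_title(n) for n in names]
print(titles)
```

If a per-row function is preferred over the vectorized string methods, the same helper could be applied with input_df["Name"].map(extract_title), though str.extract is usually the simpler choice.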