python regex

Regex to find words starting with capital letters not at beginning of sentence


I've managed to find the words beginning with capital letters, but I can't figure out a regex that filters out the ones at the beginning of a sentence.

Each sentence ends with a full stop and a space.

  • Test_string = This is a Test sentence. The sentence is Supposed to Ignore the Words at the beginning of the Sentence.

  • Desired output = ['Test', 'Supposed', 'Ignore', 'Words', 'Sentence']

I'm coding in Python. Will be glad if someone can help me out with the regex :)


Solution

  • You may use the following expression:

    (?<!^)(?<!\. )[A-Z][a-z]+
    



    import re

    mystr = "This is a Test sentence. The sentence is Supposed to Ignore the Words at the beginning of the Sentence."

    # (?<!^) skips the first word of the string; (?<!\. ) skips any word right after ". "
    print(re.findall(r'(?<!^)(?<!\. )[A-Z][a-z]+', mystr))
    

    Prints:

    ['Test', 'Supposed', 'Ignore', 'Words', 'Sentence']
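
To make the pieces of the expression easier to follow, here is a minimal sketch that annotates each part and adds two optional tweaks that are not in the answer above: a `\b` word boundary, so a capital in the middle of a word (as in "iPhone") is not picked up, and `[.!?]` instead of `\.`, on the assumption that sentences may also end with `!` or `?` followed by a single space. The function name `mid_sentence_capitalized` is made up for illustration.

```python
import re


def mid_sentence_capitalized(text):
    """Return capitalized words that do not start the string or a sentence.

    Hypothetical helper; assumes each sentence ends with '.', '!' or '?'
    followed by exactly one space.
    """
    # (?<!^)       not at the very start of the string
    # (?<![.!?] )  not right after a sentence-ending mark plus one space
    # \b           start at a word boundary, so "iPhone" does not yield "Phone"
    # [A-Z][a-z]+  one capital letter followed by one or more lowercase letters
    pattern = r'(?<!^)(?<![.!?] )\b[A-Z][a-z]+'
    return re.findall(pattern, text)


sample = ("This is a Test sentence. The sentence is Supposed to Ignore "
          "the Words at the beginning of the Sentence.")
print(mid_sentence_capitalized(sample))
# ['Test', 'Supposed', 'Ignore', 'Words', 'Sentence']
```

Both lookbehinds have a fixed width, which Python's `re` module requires. Because `[a-z]+` needs at least one lowercase letter, single-letter words such as "I" are never matched, and a sentence separated by more than one space or by a newline would need a different lookbehind.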