How can I identify sentences within a text?

I have text that looks like this:

"I am an engineer. I am skilled in ASP.NET. I also know Node.js.But I don't have much experience. "

Here, "ASP.NET" and "Node.js" are to be treated as words. Also, there is no space before "But I...", but it should be treated as a separate sentence.

The expected output is:

["I am an engineer","I am skilled in ASP.NET","I also know Node.js","But I don't have much experience"]

Is there a way of doing this?

Solution

For your current input, you can use re.split() with a regex pattern that relies on a lookahead:

import re

s = "I am an engineer. I am skilled in ASP.NET. I also know Node.js.But I don't have much experience. "
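# Split on a period only when it is followed by an optional whitespace
# character, an uppercase letter, and a space before any further period.
result = re.split(r'\.(?=\s?[A-Z][^.]*? )', s)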

print(result)

The output:

['I am an engineer', ' I am skilled in ASP.NET', ' I also know Node.js', "But I don't have much experience. "]

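(?=\s?[A-Z][^.]*? ) - positive lookahead assertion; it ensures that the sentence delimiter . is followed by an optional whitespace character, an uppercase letter, and then a space with no other period in between, i.e. by the first word of the next sentence. That is why the periods inside ASP.NET and Node.js are not treated as delimiters. Note that the pieces keep their leading spaces, and the last piece keeps its trailing period and space.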
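
If you need the output exactly as in the question (no leading spaces, no trailing period), one possible way is to post-process the pieces. This is only a minimal sketch built on the same pattern; the helper name split_sentences is made up for illustration and is not part of the re module:

```python
import re

def split_sentences(text):
    # Hypothetical helper: same split pattern as above, followed by cleanup.
    parts = re.split(r'\.(?=\s?[A-Z][^.]*? )', text)
    # Trim surrounding whitespace, then drop the trailing period, if any.
    cleaned = [part.strip().rstrip('.') for part in parts]
    # Discard pieces that end up empty (e.g. for empty input).
    return [part for part in cleaned if part]

s = "I am an engineer. I am skilled in ASP.NET. I also know Node.js.But I don't have much experience. "
print(split_sentences(s))
```

Keep in mind that the pattern is tuned to this kind of input. An abbreviation followed by a capitalized word and a space (for example "Dr. Smith ...") would also be split, so more varied text may need a different approach.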