The current REGEX I'm using is the following one:
var sentences = fulltext.match(/[^\.!\?]+[\.!\?]+/g);
That returns an array with the sentences split INCLUDING the spaces (I need all the characters). The problem is that it does not work with ellipsis "...", and I guess it does not work with other unconventional forms of punctuation either.
How can I fix my REGEX to match this and other forms of punctuation?
Is there a noob-friendly, example-driven guide to REGEX out there?
The Unicode escape for the ellipsis character is \u2026, so you can use \u2026 in the character class to match an ellipsis.
Code:
var fulltext = "First sentence… Second sentence. ";
fulltext.match(/([^.?!;\u2026]+[.?!;\u2026]+)/g);
Output:
["First sentence…", " Second sentence."]
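Note that the trailing space after the last period is dropped in the output above, because the pattern requires every match to end in a terminator. If you really need all the characters, one possible variation is to also accept a final run of text that ends at the end of the input. The sketch below is only an illustration: splitSentences is a made-up helper name, and the set of terminators (. ! ? and \u2026) is an assumption you would adjust for your own text. It still skips any terminators that appear at the very start of the input.

```javascript
// Hypothetical helper (not part of any library): splits text into
// sentence-like chunks while keeping spaces and any unterminated tail.
function splitSentences(text) {
  // A chunk is a run of non-terminators followed by either
  // one or more terminators, or the end of the input ($).
  // Assumed terminators: . ! ? and the ellipsis character \u2026
  var pattern = /[^.!?\u2026]+(?:[.!?\u2026]+|$)/g;
  // match() returns null when nothing matches, so fall back to [].
  return text.match(pattern) || [];
}

var parts = splitSentences("First sentence\u2026 Second... Third! Tail");
console.log(parts);
// ["First sentence…", " Second...", " Third!", " Tail"]
```

Because each chunk keeps its leading space and its terminators, parts.join("") gives back the original string for this input.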