I cannot find a solution to this. Please help. I need to split this "paragraph" into an array of sentences:
$paragraph = "a. b. c. hello o.c.. hello world -in.. hello. world. 8.5 hello world. ";
The resulting array should look like:
0=>a.
1=>b.
2=>c.
3=>hello o.c.
4=>hello world -in.
5=>hello.
6=>world.
7=>8.5 hello world.
I got this far:
preg_split('/(?<=[.?!;:])\s+/', $paragraph, -1, PREG_SPLIT_NO_EMPTY);
But this does not allow a decimal number.
You can use (*SKIP)(*FAIL) to tell the regex not to match when the preceding part of the alternative matches. So
(in|o\.c)\.\h+(*SKIP)(*FAIL)|(?<=[.?!])\s+
should tell the regex not to split when "in." or "o.c." followed by horizontal whitespace is matched. Otherwise it splits on whitespace that follows ".", "!", or "?".
PHP Demo: https://eval.in/542856
Regex101 Demo: https://regex101.com/r/eS0tR7/1
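For reference, here is a minimal sketch of how the pattern above could be plugged into preg_split. The first input is the string from the question. The second input and the variable names $pattern, $sentences and $kept are my own, added only to show a case where the skip branch actually fires. Note that "in" is not anchored to a word boundary here, so a word that merely ends in "in" would probably be skipped too.

```php
<?php
// Input string from the question.
$paragraph = "a. b. c. hello o.c.. hello world -in.. hello. world. 8.5 hello world. ";

// Branch 1: match an abbreviation ("in." or "o.c.") plus trailing horizontal
// whitespace, then throw that match away with (*SKIP)(*FAIL).
// Branch 2: the real split point, whitespace preceded by ".", "?" or "!".
$pattern = '/(in|o\.c)\.\h+(*SKIP)(*FAIL)|(?<=[.?!])\s+/';

// PREG_SPLIT_NO_EMPTY drops the empty piece left by the trailing space.
$sentences = preg_split($pattern, $paragraph, -1, PREG_SPLIT_NO_EMPTY);
print_r($sentences);

// Hypothetical second input (not from the question): "in." is followed
// directly by a space, so branch 1 matches and no split happens there.
$kept = preg_split($pattern, "met him in. the park. done.", -1, PREG_SPLIT_NO_EMPTY);
print_r($kept);
```

With the question's input, the doubled dots after "o.c" and "-in" stay attached to their sentences, because the split only consumes the whitespace. "8.5" is left alone since no whitespace follows its dot.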