I would like to use a regex in PHP to separate words and phrases out of a string. The phrases would be delimited by quotes, both double and single. The regular expression would also have to take into consideration single quotes within words (e.g. nation's).
Example string:
The nation's economy 'is really' poor, but "might be getting" better.
I would like PHP to separate this type of string into an array, using a regex, as follows:
Array
(
[0] => "The"
[1] => "nation's"
[2] => "economy"
[3] => "is really"
[4] => "poor"
[5] => "but"
[6] => "might be getting"
[7] => "better"
)
What would the PHP code be to accomplish this?
Use preg_match_all with this regex:
(?<![\w'"])(?:['"][^'"]+['"]|[\w']+)(?![\w'"])
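To make the pattern easier to read, here is a sketch of the same regex written with the x (extended) modifier so that each part can carry a comment. The variable name $pattern, the ~ delimiter, and the shortened test sentence are my own choices for illustration and are not part of the pattern itself.

```php
<?php
// The same pattern, spread over several lines with the x modifier.
// $pattern is an illustrative name; ~ is used as the delimiter so the
// pattern needs no extra escaping inside a nowdoc string.
$pattern = <<<'REGEX'
~
(?<![\w'"])          # not preceded by a word character or a quote
(?:
    ['"][^'"]+['"]   # a quoted phrase: a quote, one or more non-quote characters, a quote
  |
    [\w']+           # or a run of word characters and apostrophes (nation's)
)
(?![\w'"])           # not followed by a word character or a quote
~x
REGEX;

preg_match_all($pattern, "The nation's economy 'is really' poor.", $matches);
print_r($matches[0]);
```

Note that the quoted alternative accepts either quote character at both ends, so it does not check that the opening and closing quotes are the same kind.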
Example: https://3v4l.org/vBGY7
preg_match_all(
'/(?<![\w\'"])(?:[\'"][^\'"]+[\'"]|[\w\']+)(?![\w\'"])/',
"The nation's economy 'is really' poor, but \"might be getting\" better.",
$matches
);
print_r($matches[0]);
(Note that this doesn't recognize hy-phe-nat-ed words as it is not specified in the question.)
Output (the quotes around the phrases are kept):
Array
(
[0] => The
[1] => nation's
[2] => economy
[3] => 'is really'
[4] => poor
[5] => but
[6] => "might be getting"
[7] => better
)
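If the surrounding quotes should be dropped, as in the array the question asks for, one possible approach is to capture only the inner text. The sketch below assumes a PCRE branch-reset group, (?|...), so that both alternatives write to capture group 1; the function name split_words_and_phrases is made up for this example.

```php
<?php
// Hypothetical helper: returns words and quoted phrases without the quotes.
function split_words_and_phrases(string $text): array
{
    // (?|...) is a branch-reset group: each alternative numbers its
    // capture group as 1, so $matches[1] holds either the phrase
    // (without its quotes) or the plain word.
    $pattern = '/(?<![\w\'"])(?|[\'"]([^\'"]+)[\'"]|([\w\']+))(?![\w\'"])/';

    preg_match_all($pattern, $text, $matches);

    return $matches[1];
}

print_r(split_words_and_phrases(
    "The nation's economy 'is really' poor, but \"might be getting\" better."
));
```

For the example string this should give the eight entries listed in the question, with is really and might be getting unquoted. Apostrophes inside words such as nation's are left alone, because only the quotes that wrap a phrase sit outside the capture group.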