I need to be able to select variable elements from the following ingredient list example.
I wish to collect the 'full word, space & •' in all instances.
INGREDIENTS: ALCOHOL DENAT. • FRAGRANCE (PARFUM) • WATER\AQUA\EAU • HYDROXYCITRONELLAL • LIMONENE • BENZYL BENZOATE • CITRONELLOL • GERANIOL • COUMARIN • FARNESOL • CITRAL • BENZYL ALCOHOL • CINNAMYL ALCOHOL • LINALOOL • ALCOHOL • DIPROPYLENE GLYCOL • ETHYLHEXYL METHOXYCINNAMATE • BUTYL METHOXYDIBENZOYLMETHANE • ETHYLHEXYL SALICYLATE • TRIS(TETRAMETHYLHYDROXYPIPERIDINOL) CITRATE • DILAURYL THIODIPROPIONATE • TOCOPHEROL • BHT • BENZOIC ACID • RED 4 (CI 14700) • EXT. VIOLET 2 (CI 60730) • YELLOW 6 (CI 15985) <ILN46472>
I have \b\w+\s•
but this only selects 'EAU •' within the copy, whereas I need all instances within the list:
DENAT. •
(PARFUM) •
EAU •
HYDROXYCITRONELLAL •
LIMONENE •
BENZOATE •
CITRONELLOL •
GERANIOL •
COUMARIN •
FARNESOL •
CITRAL •
ALCOHOL •
ALCOHOL •
LINALOOL •
ALCOHOL •
GLYCOL •
METHOXYCINNAMATE •
METHOXYDIBENZOYLMETHANE •
SALICYLATE •
CITRATE •
THIODIPROPIONATE •
TOCOPHEROL •
BHT •
ACID •
(CI 14700) •
(CI 60730) •
To get those matches, you might use:
(?:\([^()]*\)|\w+\.?)\s•
The pattern matches:
(?:
Non-capturing group
\([^()]*\)
Match from ( to ), with no other parentheses in between
|
Or
\w+\.?
Match 1+ word chars followed by an optional .
)
Close the non-capturing group
\s•
Match a whitespace char and •
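As a quick check, the pattern can be run against a shortened copy of the ingredient list. This is only a sketch: the question does not name a regex engine, so Python's re module is an assumption here, and the names PATTERN and sample are illustrative.

```python
import re

# Pattern from the answer: a parenthesized group OR a word with an optional
# trailing dot, followed by one whitespace character and a bullet.
PATTERN = re.compile(r"(?:\([^()]*\)|\w+\.?)\s•")

# Shortened copy of the ingredient list (raw strings keep the backslashes).
sample = (
    r"INGREDIENTS: ALCOHOL DENAT. • FRAGRANCE (PARFUM) • WATER\AQUA\EAU • "
    r"LIMONENE • RED 4 (CI 14700) • YELLOW 6 (CI 15985) <ILN46472>"
)

matches = PATTERN.findall(sample)
for m in matches:
    print(m)
```

The last item, (CI 15985), is not returned because no • follows it.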
If there has to be at least one word character between the parentheses:
(?:\([^\w()]*\w[^()]*\)|\w+\.?)\s•
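The difference between the two patterns only shows up when a pair of parentheses contains no word character. A small comparison, assuming Python's re module and a made-up input string (the names loose, strict and text are illustrative):

```python
import re

# First pattern: anything (or nothing) may sit between the parentheses.
loose = re.compile(r"(?:\([^()]*\)|\w+\.?)\s•")
# Second pattern: at least one word character is required between them.
strict = re.compile(r"(?:\([^\w()]*\w[^()]*\)|\w+\.?)\s•")

text = "FOO () • BAR (X) •"  # made-up input with an empty pair of parentheses

loose_matches = loose.findall(text)
strict_matches = strict.findall(text)
print(loose_matches)   # ['() •', '(X) •']
print(strict_matches)  # ['(X) •']
```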
Note that \w can also match _
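If underscores should not count as part of a word, one option is to swap \w for [^\W_] (a word character that is not an underscore). This is a suggested variation rather than part of the original pattern, shown here with Python's re module and a made-up input:

```python
import re

text = "SOME_NAME •"  # made-up input containing an underscore

# Original pattern: \w includes the underscore, so the whole token matches.
with_underscore = re.findall(r"(?:\([^()]*\)|\w+\.?)\s•", text)
# Variation: [^\W_] excludes the underscore, so the match starts after it.
without_underscore = re.findall(r"(?:\([^()]*\)|[^\W_]+\.?)\s•", text)

print(with_underscore)     # ['SOME_NAME •']
print(without_underscore)  # ['NAME •']
```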