The following expression returns what I need, BUT it also gives an extra empty match for each line, as well as for any blank lines. As a result, 5 valid text lines return 10 matches. I suspect the cause is the way I'm specifying the last capture group, or that I'm not making Capture Group #2 required.
How can I "ignore" the newline character (or whatever is triggering the extra match)?
/(\d+[a-z]?\.)?[ ]?(.*)/g
11a. A numbered agenda item
Unnumbered agenda item
12. Another numbered agenda item
Another UNnumbered agenda item
13. A numbered agenda item
I need to extract the Agenda Item text, AND the preceding number (if present).
Demo at https://regex101.com/r/vB0H5s/1
In your pattern you are using the quantifiers ? and *, which make every part of the pattern optional, so the pattern can also match an empty string.
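The reason you get 10 matches instead of 5 is that the pattern is unanchored. As all parts are optional, once a line has been consumed, the last .* can also "match" the empty position at the end of that line (just before the newline, or at the end of the string).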
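The extra matches described above can be reproduced with a short script. This is only a sketch: the question does not name a language, so Python's re module is an assumption here, and the variable names are illustrative. It lists every match together with its span, so the empty matches at the end of each line are visible.

```python
import re

# Sample agenda text copied from the question.
text = """11a. A numbered agenda item
Unnumbered agenda item
12. Another numbered agenda item
Another UNnumbered agenda item
13. A numbered agenda item"""

# Original pattern: every part is optional, so it can match an empty string.
pattern = re.compile(r"(\d+[a-z]?\.)?[ ]?(.*)")

matches = list(pattern.finditer(text))
# Matches whose full text is empty; these are the unwanted extras.
empty = [m for m in matches if m.group(0) == ""]

for m in matches:
    # Print the (start, end) span and the matched text of each match.
    print(m.span(), repr(m.group(0)))
```

Each non-empty match is followed by an empty one whose start and end positions are equal, which is where the doubled count comes from.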
You can use (.+) to capture 1 or more characters in the second capture group.
If the match should be at the start of each line, you can use an anchor ^ (together with the multiline flag m when the input contains several lines):
^(\d+[a-z]?\.)?[ ]?(.+)
See a regex demo
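As a rough check of the suggested pattern, the sketch below runs it over the sample lines. Python and re.MULTILINE are assumptions (the question does not state a language or flags), the blank line in the sample was added for illustration, and the names items, number and title are hypothetical.

```python
import re

# Sample text from the question, plus one blank line added for illustration.
text = """11a. A numbered agenda item
Unnumbered agenda item

12. Another numbered agenda item
Another UNnumbered agenda item
13. A numbered agenda item"""

# ^ anchors each match to a line start (re.MULTILINE); (.+) requires at least
# one character, so blank lines and empty end-of-line positions do not match.
pattern = re.compile(r"^(\d+[a-z]?\.)?[ ]?(.+)", re.MULTILINE)

# Group 1 is the optional number (None when absent), group 2 the item text.
items = [(m.group(1), m.group(2)) for m in pattern.finditer(text)]

for number, title in items:
    print(number, "|", title)
```

Unnumbered items come back with None as the number, so the caller can tell the two cases apart without a second pattern.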