I am trying to figure out how to capture the positive lookahead group in the following regex:
(((Initial commit)|(Merge [^\r\n]+)|(((build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test|BREAKING CHANGE)(\(\w+\))?!?: ([\w ]+))(\r|\n|\r\n){0,2}((?:\w|\s|\r|\n|\r\n)+)(?=(((\r|\n|\r\n){2}([\w-]+): (\w+))|$)))))
The sample data I am trying to match is as follows:
#1
build(Breaking): la asdf asdf asdf
asdfasdf asdf asdf
asdf
asdf
asdf
asdf
asdf
asdf
aef asdf asdf
#2
build(Breaking): la asdf asdf asdf
asdfasdf asdf asdf
asdf
asdf
asdf
asdf
asdf
asdf
aef asdf asdf
asdf-asdf: asdf
I successfully capture all fields preceding the positive lookahead for asdf-asdf: asdf, whether or not it is there. But for some reason, even when the positive lookahead finds the asdf-asdf: asdf match, the capturing group doesn't seem to capture it.
What should I be doing in order to accomplish this goal, or what am I doing wrong?
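Your regex is very long, but the problem is essentially that a positive lookahead is zero-width: it checks that its pattern follows the current position but does not consume that text, so the text is not included in the match. A simpler example is
bad (?=tea)
which, applied to bad tea, matches only bad (plus the trailing space), not the whole string. However, if you use
bad (?=(tea))\1
the match covers the entire string, because the backreference \1 consumes the text that the group inside the lookahead captured.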
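As a quick check of the simplified example, here is a minimal sketch using Python's re module (the choice of Python is an assumption; the question does not say which regex engine is in use, and other engines may differ in detail). It compares the plain lookahead, a lookahead with a group inside it, and the version with the backreference.

```python
import re

text = "bad tea"

# Plain lookahead: "tea" is checked but not consumed.
plain = re.search(r"bad (?=tea)", text)

# Group inside the lookahead: the overall match is unchanged,
# but group 1 holds the text the lookahead saw.
grouped = re.search(r"bad (?=(tea))", text)

# Backreference after the lookahead: \1 consumes the captured text.
consumed = re.search(r"bad (?=(tea))\1", text)

print(repr(plain.group(0)))     # 'bad '
print(repr(grouped.group(0)))   # 'bad '
print(repr(grouped.group(1)))   # 'tea'
print(repr(consumed.group(0)))  # 'bad tea'
```

In this engine the group inside the lookahead is populated even without the backreference; the backreference only changes what the overall match covers.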
The corrected regex is
(((Initial commit)|(Merge [^\r\n]+)|(((build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test|BREAKING CHANGE)(\(\w+\))?!?: ([\w ]+))(\r|\n|\r\n){0,2}((?:\w|\s|\r|\n|\r\n)+)(?=(((\r|\n|\r\n){2}([\w-]+): (\w+))|$))\12)))
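Simply add \12 after the lookahead itself: it is a backreference to the twelfth capturing group, which is the outermost group inside the lookahead, so it consumes exactly the text the lookahead just verified. Alternatively, repeat the pattern from inside the lookahead in its place.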
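The same comparison can be sketched for the full pattern, again assuming Python's re module. The commit message below is a shortened, hypothetical version of sample #2, and it assumes a blank line separates the body from the asdf-asdf: asdf footer, since the lookahead requires two line breaks there. The names HEAD, ORIGINAL, FIXED, and message are illustrative only.

```python
import re

# Shared part of the pattern, up to and including the lookahead.
HEAD = (
    r"(((Initial commit)|(Merge [^\r\n]+)|"
    r"(((build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test|BREAKING CHANGE)"
    r"(\(\w+\))?!?: ([\w ]+))(\r|\n|\r\n){0,2}((?:\w|\s|\r|\n|\r\n)+)"
    r"(?=(((\r|\n|\r\n){2}([\w-]+): (\w+))|$))"
)
ORIGINAL = re.compile(HEAD + r")))")   # pattern from the question
FIXED = re.compile(HEAD + r"\12)))")   # pattern with the \12 backreference

# Assumed input: body and footer separated by a blank line.
message = "build(Breaking): la asdf\nasdf asdf\n\nasdf-asdf: asdf"

before = ORIGINAL.match(message)
after = FIXED.match(message)

# Without \12 the overall match stops before the footer...
print(repr(before.group(0)))
# ...although the groups inside the lookahead are still filled in.
print(before.group(15), before.group(16))
# With \12 the overall match extends over the footer as well.
print(after.group(0) == message)
```

If only the footer's key and value are needed, reading groups 15 and 16 may be enough in this engine, without changing the pattern at all.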