Suppose I have the following text:
Yes: [x]
Yes: [ x]
Yes: [x ]
Yes: [ x ]
No: [
No: ]
I am interested in capturing the square brackets [ and ] containing an x with a variable amount of horizontal space on either side of the x. The bit I am struggling with is that both brackets must be captured into a group with the same ID (i.e., $1).
I started with a combination of positive lookahead and lookbehind assertions, using the following regex:
\[(?=\h*x)|(?<=x)\h*\K\]
This matches the brackets on the Yes lines (see the demo, with the extended flag enabled for clarity).
Then I tried placing a capturing group around the whole expression, but the match then extends to the horizontal space matched by \h* after the positive lookbehind (?<=x) (see also the demo).
I am using Oniguruma regular expressions and the PCRE flavor. Do you have any ideas if and how this can be done?
You could make use of a branch reset group:
(?|(\[)(?=\h*x\h*])|(?<=\[)\h*x\h*(]))
(?|    Branch reset group
(\[)(?=\h*x\h*])    Capture [ in group 1, asserting x between optional horizontal whitespace chars to the right, followed by ]
|    Or
(?<=\[)\h*x\h*(])    Assert [ to the left, then match x between optional horizontal whitespace and capture ] in group 1 (inside a branch reset group, each alternative restarts its group numbering, so this is group 1 again, not group 2)
)    Close branch reset group
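
If you want to see what the branch reset buys you, here is a rough Python sketch of the same idea in an engine that lacks it. The assumptions are mine, not part of the pattern above: Python's standard re module supports neither (?|...) nor \h, so the two brackets end up in groups 1 and 2 and are merged by hand, and [ \t] is used as an approximation of \h (PCRE's \h covers more characters than space and tab). The name captured_brackets is just a hypothetical helper for illustration.

```python
import re

# Same two alternatives as the branch reset pattern, but without (?|...):
# the opening bracket lands in group 1 and the closing bracket in group 2.
# [ \t] is an approximation of PCRE's \h (assumption, not an exact equivalent).
PATTERN = re.compile(r"(\[)(?=[ \t]*x[ \t]*\])|(?<=\[)[ \t]*x[ \t]*(\])")


def captured_brackets(text):
    """Return the captured bracket of every match, whichever group holds it."""
    # Exactly one of the two groups participates in each match, so "or"
    # picks the one that is not None. A branch reset group makes this
    # step unnecessary because both captures would already be group 1.
    return [m.group(1) or m.group(2) for m in PATTERN.finditer(text)]


sample = "Yes: [x]\nYes: [ x]\nYes: [x ]\nYes: [ x ]\nNo: [\nNo: ]\n"
print(captured_brackets(sample))
# → ['[', ']', '[', ']', '[', ']', '[', ']']
```

With PCRE itself none of this merging is needed: both brackets are available as $1 straight from the branch reset pattern.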