I'm having trouble writing a Perl regex that changes the \ character according to these rules:
match a sequence that starts with \( and ends with \)
every \ character in that matched sequence should be replaced with a double backslash \\
Example text reference:
Se la \probabilità dell'evento\ A è \(\frac{3}{4} \) e la
probabilità dell'evento B è \(\frac{1}{4}\)
\(\frac{3}{4} +\frac{3}{4}\) .
\(\frac{1}{4} - \frac{3}{4}\) .
\(\frac{3}{16}\) .
\(\frac{1}{2}\) .
Should become:
Se la \probabilità dell'evento\ A è \\(\\frac{3}{4} \\) e la
probabilità dell'evento B è \\(\\frac{1}{4}\\)
\\(\\frac{3}{4} +\\frac{3}{4}\\) .
\\(\\frac{1}{4} - \\frac{3}{4}\\) .
\\(\\frac{3}{16}\\) .
\\(\\frac{1}{2}\\) .
So far this is my best bet:
s/(\\\()(.*)(\\)(.*)(\\\))/\\\\\($2\\\\$4\\\\\)/mg
which produces:
Se la \probabilità dell'evento\ A è \\(\\frac{3}{4} \\) e la
probabilità dell'evento B è \\(\\frac{1}{4}\\)
\\(\frac{3}{4} +\\frac{3}{4}\\) .
\\(\frac{1}{4} - \\frac{3}{4}\\) .
\\(\\frac{3}{16}\\) .
\\(\\frac{1}{2}\\) .
As you can see, the lines
\\(\frac{3}{4} +\\frac{3}{4}\\) .
\\(\frac{1}{4} - \\frac{3}{4}\\) .
are wrong: the first \frac in each still has a single backslash.
How can I modify my regex to accommodate my needs?
Here is an updated version of my original regex.
The original ran a validation lookahead at the end for every escape it matched.
It can be sped up by doing that validation only once, when the opening \( of a block is found.
A benchmark comparing the two versions is at the bottom.
Updated regex:
$str =~ s/(?s)(?:(?!\A)\G(?!\))[^\\]*\K\\|\\(?=\(.*?\\\)))/\\\\/g;
(?s)                  # Dot-all modifier
(?:                   # Cluster start
     (?! \A )         # Not at the beginning of the string
     \G               # \G anchor: continue where the last match ended
     (?! \) )         # Last match was an escape, so a ')' here ends the block
     [^\\]*           # Zero or more non-escape characters
     \K               # Drop everything matched so far from the match
     \\               # A lone escape
  |                   # or,
                      # New block check:
     \\               # A lone escape, then
     (?=              # One-time validation:
          \(          # an opening '('
          .*?         # anything
          \\ \)       # then a closing '\)'
     )                # -------------
)                     # Cluster end
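To try the updated regex on the text from the question, here is a minimal sketch. The wrapper name double_math_escapes is only an illustrative assumption, not something from the question, and the \K escape assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Sample lines from the question. A single-quoted heredoc keeps every
# backslash literal, so no extra escaping is needed here.
my $str = <<'END';
Se la \probabilità dell'evento\ A è \(\frac{3}{4} \) e la
probabilità dell'evento B è \(\frac{1}{4}\)
\(\frac{3}{4} +\frac{3}{4}\) .
\(\frac{1}{4} - \frac{3}{4}\) .
END

# Hypothetical helper (name chosen for this sketch): doubles every backslash
# from an opening \( through its closing \), and leaves other text alone.
sub double_math_escapes {
    my ($text) = @_;
    $text =~ s/(?s)(?:(?!\A)\G(?!\))[^\\]*\K\\|\\(?=\(.*?\\\)))/\\\\/g;
    return $text;
}

print double_math_escapes($str);
```

The backslashes around \probabilità dell'evento\ should be left alone, because neither of them is followed by an opening parenthesis and so a block is never started there.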
Benchmark:
Sample \( \\\\\\\\\\\\\\\\\\\\\\\\\\\\\ \)
Results
New Regex: (?s)(?:(?!\A)\G(?!\))[^\\]*\K\\|\\(?=\(.*?\\\)))
Options: < none >
Completed iterations: 50 / 50 ( x 1000 )
Matches found per iteration: 31
Elapsed Time: 1.25 s, 1253.92 ms, 1253924 µs
Old Regex: (?s)(?:(?!\A)\G[^\\]*\K\\|\\(?=\())(?=.*?(?<=\\)\))
Options: < none >
Completed iterations: 50 / 50 ( x 1000 )
Matches found per iteration: 31
Elapsed Time: 3.95 s, 3952.31 ms, 3952307 µs
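If you want to compare the two patterns on your own machine, something like the following should do, using the core Benchmark module. It is only a sketch and not a reproduction of the figures above: the count_matches helper is a name made up for this example, the sample is assumed to contain 29 backslashes between \( and \) (which gives the 31 matches reported), and the timings will differ from machine to machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed sample: "\( ", then 29 backslashes, then " \)".
my $sample = '\( ' . ('\\' x 29) . ' \)';

# The two patterns exactly as listed in the benchmark.
my $new = qr/(?s)(?:(?!\A)\G(?!\))[^\\]*\K\\|\\(?=\(.*?\\\)))/;
my $old = qr/(?s)(?:(?!\A)\G[^\\]*\K\\|\\(?=\())(?=.*?(?<=\\)\))/;

# Hypothetical helper: number of matches of $re in $s.
sub count_matches {
    my ($re, $s) = @_;
    my $n = () = $s =~ /$re/g;   # list-context match, then count the list
    return $n;
}

print "new: ", count_matches($new, $sample), " matches\n";
print "old: ", count_matches($old, $sample), " matches\n";

# Run each pattern for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    new => sub { count_matches($new, $sample) },
    old => sub { count_matches($old, $sample) },
});
```

Both patterns should report the same match count; only the rate column of the table is expected to differ.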