Negative lookahead preceded by .*


I want to select all text within {}, but only if there is no \status…{} in there.

Examples that should match:

\subsection{Hello}                -> "\subsection", "Hello"
\section{Foobar}                  -> "\section", "Foobar"
\subsubsection{This is a Triumph} -> "\subsubsection", "This is a Triumph"

Examples that should not match:

\subsection{Hello\statusdone{}}
\section{Hello World\statuswip{}}
\section{Everything\statusproofreading{}}

I thought negative lookaheads would be perfect for this:

(\\.*section)\{(.*)(?!\\status.*)\}

but the pattern still matches the negative examples:

\subsection{Hello\statusdone{}}           -> "\subsection", "Hello\statusdone{}"
\section{Hello World\statuswip{}}         -> "\section", "Hello World\statuswip{}"
\section{Everything\statusproofreading{}} -> "\section", "Everything\statusproofreading{}"

I suspect it is because of the .* preceding the negative lookahead. If I replace it with a literal, e.g. Hello, as in the following regex:

(\\.*section)\{(Hello)(?!\\status.*)\}

It correctly does not match the first negative example \subsection{Hello\statusdone{}}.
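
A minimal reproduction of the two behaviours described above, assuming Python's re module (the question does not name a regex flavour, so this is only one possible setup):

```python
import re

sample = r"\subsection{Hello\statusdone{}}"

# Pattern from the question: the lookahead sits after a greedy (.*).
late = re.compile(r"(\\.*section)\{(.*)(?!\\status.*)\}")
m = late.search(sample)
# (.*) first swallows the rest of the line, then backs off one character.
# At that position the next character is "}", not "\status", so the
# negative lookahead succeeds and the unwanted match goes through.
print(m.groups())  # ('\\subsection', 'Hello\\statusdone{}')

# Same pattern with the literal "Hello" in place of (.*): the lookahead is
# now tested right in front of "\statusdone", so it rejects the line.
literal = re.compile(r"(\\.*section)\{(Hello)(?!\\status.*)\}")
print(literal.search(sample))  # None
```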

How do I work around that?


Solution

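  • In your pattern the lookahead is only tested at the single position where the greedy (.*) ends up after backtracking, which is right before the final }. Nothing starting with \status follows at that point, so the lookahead always succeeds. Move the negative lookahead to just after the opening brace and put the .* inside it, so that it scans the rest of the string for \status…{} before any of the content is consumed.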

    You can use:

    \\.*section\{((?!.*\\status.*\{\})[^}]+)}
    

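
As a quick check, here is a sketch of that pattern run against the examples from the question, assuming Python's re module. Two things are assumptions added for illustration: the helper name parse_heading is hypothetical, and the command part is wrapped in its own capture group so that both values requested in the question come back (the pattern above captures only the text inside the braces).

```python
import re

# Lookahead placed directly after "{": it scans the rest of the line for
# "\status...{}" before [^}]+ consumes anything.
# Assumption: the extra group around \\.*section is added here to return
# the command name as well.
HEADING = re.compile(r"(\\.*section)\{((?!.*\\status.*\{\})[^}]+)\}")

def parse_heading(line):
    """Return (command, title), or None if the line carries a status marker."""
    m = HEADING.search(line)
    return m.groups() if m else None

should_match = [
    r"\subsection{Hello}",
    r"\section{Foobar}",
    r"\subsubsection{This is a Triumph}",
]
should_not_match = [
    r"\subsection{Hello\statusdone{}}",
    r"\section{Hello World\statuswip{}}",
    r"\section{Everything\statusproofreading{}}",
]

for line in should_match + should_not_match:
    print(line, "->", parse_heading(line))
```

Because the content is matched with [^}]+, this sketch only handles headings whose text contains no nested braces, which holds for the examples in the question.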