I am new to regex and have been struggling for a while on this one: I want to transform LaTeX files to HTML.
I use MathJax to render equations and some JavaScript replace functions to convert the tags. I have nearly finished, but I still have an issue with the line breaks: I need to transform \\ to <br>, but only outside the tags \begin{array} and \end{array}.
Example: in this excerpt, only the \\ before "Montrer l'equivalence" should be replaced.
$M=\left(
\begin{array}{c|c}
A &B \\ \hline
C &D \\
\end{array}
\right)$
$in$ $\mathcal{M}_{n}(\mathbb{K})$ avec $A$ $\in$ $\mathcal{M}_{r}(\mathbb{K})$ inversible.\\ Montrer l'equivalence:
\[
\Bigl( rg(A) = rg(M) \Bigr) \Leftrightarrow \Bigl( D = CA^{-1}B \Bigr)
\]
\begin{enumerate}
\item Calculer $detB$ en fontion de $A$.
\item En déduire que $detB \geqslant 0$.
\end{enumerate}
$M=
\left(
\begin{array}{c|c}
A &B \\ \hline
C &D \\
\end{array}
\right)$
How can I do this with a regex?
EDIT: I have found here a handy regex tester...
You can use this pattern in a replace with a callback function that returns the first capture group when it matched, or <br> when it is empty:
/(\\begin{array}(?:[^\\]+|\\(?!end{array}))*\\end{array})|\\\\/
The idea is to match \begin{array}...\end{array} before \\, so that any \\ inside \begin{array}...\end{array} is consumed as part of the array block and is never matched on its own.
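As a minimal sketch of the replace-with-callback idea, the snippet below wraps the pattern in a small function. The name convertLineBreaks and the sample string are assumptions made for illustration, and the g flag is added so that every occurrence is handled, not just the first one.

```javascript
// Match a whole array environment first (captured in group 1), or else a lone \\.
// The g flag is an addition here so that all occurrences are replaced.
const pattern = /(\\begin{array}(?:[^\\]+|\\(?!end{array}))*\\end{array})|\\\\/g;

// Hypothetical helper name, not part of the original post.
function convertLineBreaks(latex) {
  // If group 1 matched, the match is an array block: return it unchanged.
  // Otherwise the match is a \\ outside any array: return <br>.
  return latex.replace(pattern, (match, arrayBlock) => arrayBlock || "<br>");
}

// String.raw keeps the backslashes exactly as typed.
const sample = String.raw`\begin{array}{c|c}
A &B \\ \hline
C &D \\
\end{array}
inversible.\\ Montrer l'equivalence:`;

console.log(convertLineBreaks(sample));
```

With this sample, the two \\ inside the array block should be left alone and only the one before "Montrer l'equivalence" should become <br>.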
detail:
(?: # open a non-capturing group
[^\\]+ # all characters but \ 1 or more times
| # OR
\\(?!end{array}) # \ not followed by "end{array}"
)* # close non-capturing group, zero or more times
This structure is usually more efficient than a simple .*?, which has to test the rest of the pattern at every character before it can advance. It's a bit longer, but it tends to perform better since it avoids the lazy quantifier.
(PS: remove the / delimiters when testing in RegexPal.)
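For comparison, here is a small sketch that runs the unrolled pattern next to a lazy-quantifier variant on the same input. The lazy pattern is an assumption about what the simpler version would look like; it uses [\s\S]*? rather than .*? because . does not match line breaks in JavaScript unless the s flag is set. The variable names are illustrative only, and the snippet checks that both give the same result, not how fast they are.

```javascript
// Unrolled pattern from the answer (g flag added to replace every match).
const unrolled = /(\\begin{array}(?:[^\\]+|\\(?!end{array}))*\\end{array})|\\\\/g;

// Lazy alternative, shown for comparison only (an assumed equivalent).
const lazy = /(\\begin{array}[\s\S]*?\\end{array})|\\\\/g;

// Keep the array block when group 1 matched, otherwise emit <br>.
const keepArrayOrBreak = (match, arrayBlock) => arrayBlock || "<br>";

const text = String.raw`$M=\left(
\begin{array}{c|c}
A &B \\ \hline
C &D \\
\end{array}
\right)$ inversible.\\ Montrer l'equivalence:`;

const a = text.replace(unrolled, keepArrayOrBreak);
const b = text.replace(lazy, keepArrayOrBreak);

// Both patterns should produce the same output on this input.
console.log(a === b);
```

Any speed difference depends on the engine and on the input, so it is worth measuring on real documents before relying on it.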