I want to detect whether a long text string (input from "somewhere") contains mathematical expressions encoded in LaTeX. This means searching for substrings (denoted ... in what follows) enclosed in any of the following:
$...$
\[...\]
\(...\)
\begin{displaymath} ... \end{displaymath}
There are some variations of the last item with keywords other than displaymath, and there may be whitespace inside the braces, etc., but I suppose I can figure out the rest once I get (1), (2), (3) working.
For (1), I suppose I can do the following:
import re
if re.search(r"$(\w+)$", str):
    (do something)
But I am having problems with the others, especially those that contain a backslash (\). Help would be appreciated.
The Python version is 2.7.12, but code that works on both 2.x and 3.x would be preferred.
You need to escape \, [, ], {, }, ( and ), as they have special meaning in regular expressions. So you need to add an extra \ before each of them when you want to match it literally.
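As a minimal sketch of the escaping rule (the sample string `text` is made up for illustration): `$` is also a metacharacter, since it anchors the end of the string, so the `$...$` case from the question needs the same treatment. Using raw strings (`r"..."`) keeps the backslashes readable.

```python
import re

# Hypothetical sample input, not taken from the question.
text = r"Euler's identity is $e^{i\pi} + 1 = 0$ in inline form."

# Unescaped: "$" is an end-of-string anchor, so nothing can follow it.
print(re.search(r"$(.+?)$", text))   # None

# Escaped: "\$" matches a literal dollar sign.
match = re.search(r"\$(.+?)\$", text)
print(match.group(1))                # e^{i\pi} + 1 = 0
```

The capture group uses `.+?` rather than `\w+` because math expressions usually contain spaces, braces and backslashes, which `\w` does not match.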
For your second pattern, use:
\\\[(.+?)\\\]
For the third pattern, use:
\\\((.+?)\\\)
For the fourth pattern, use:
\\begin\{displaymath\}(.+?)\\end\{displaymath\}
You can see the demo for the fourth pattern here.
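Putting the patterns together in Python could look roughly like the sketch below. The names `MATH_RE` and `contains_latex_math` are hypothetical, and the `re.DOTALL` flag is an added assumption so that `.` also matches newlines, since display math often spans several lines. It should run unchanged on both Python 2.7 and 3.x.

```python
import re

# One alternation covering the four delimiter styles from the question.
# re.DOTALL (an assumption, not part of the patterns above) lets "." span newlines.
MATH_RE = re.compile(
    r"\$(.+?)\$"                                          # $...$
    r"|\\\[(.+?)\\\]"                                     # \[...\]
    r"|\\\((.+?)\\\)"                                     # \(...\)
    r"|\\begin\{displaymath\}(.+?)\\end\{displaymath\}",  # displaymath environment
    re.DOTALL,
)

def contains_latex_math(text):
    """Return True if text contains at least one delimited math expression."""
    return MATH_RE.search(text) is not None

print(contains_latex_math(r"The area is \(\pi r^2\)."))  # True
print(contains_latex_math("No math here."))               # False
```

One caveat with the `$...$` branch: plain text such as "costs $5 or $10" would also match, so depending on the input you may want a stricter rule for that case.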