I have some text:
\frac{A}{B}
I need to transform this text into the form:
<mfrac>
<mrow>
A
</mrow>
<mrow>
B
</mrow>
</mfrac>
I have to use Python and regex. A and B can themselves contain further fractions, so the function must be recursive. For example, the text:
\frac{1+x}{1+\frac{1}{x}}
must be changed into:
<mfrac>
<mrow>
1+x
</mrow>
<mrow>
1+
<mfrac>
<mrow>
1
</mrow>
<mrow>
x
</mrow>
</mfrac>
</mrow>
</mfrac>
Please help with the regex :)
If you need to match a recursive pattern with Python's default re module, you can do what I did for the nested comments I recently implemented for a CSS preprocessor.
In general, use re only to split the text into tokens, then loop over the tokens with a nesting-level variable to recognize the nested syntax. Here is my code:
import re

# Tokens that can open or close a comment; re.X ignores the layout whitespace.
COMMENTsRe = re.compile( r"""
    //   |
    \n   |
    /\*  |
    \*/
    """, re.X )

def rm_comments( cut ):
    nocomment = 0   # not inside a comment
    c = 1           # C-like comments, but nested
    cpp = 2         # C++-like comments
    mode = nocomment
    clevel = 0      # nesting level of C-like comments
    matchesidx = []
    # pure RE cannot find nested structures,
    # so we just find all the boundaries and parse them here
    matches = COMMENTsRe.finditer( str(cut) )
    start = 0
    for i in matches:
        m = i.group()
        if mode == cpp:
            if m == "\n":
                matchesidx.append( ( start, i.end()-1 ) ) # -1 to leave out the \n
                mode = nocomment
        elif mode == c:
            if m == "/*":
                clevel += 1
            if m == "*/":
                clevel -= 1
                if clevel == 0:
                    matchesidx.append( ( start, i.end() ) )
                    mode = nocomment
        else:
            if m == "//":
                start = i.start()
                mode = cpp
            elif m == "/*":
                start = i.start()
                mode = c
                clevel += 1
    # cut is my own object: it removes the collected (start, end) spans
    cut.rm_and_save( matchesidx )
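For the \frac question itself, the same idea (use re only to locate the token, then count the nesting level by hand) might look roughly like this. It is only a sketch: the names FRAC_RE, read_group and frac_to_mathml are made up for illustration, it assumes the braces are balanced and never escaped (no \{ or \}), and it emits the tags without the line breaks shown in the question.

```python
import re

# Hypothetical names, for illustration only.
# Matches the literal \frac, but not longer commands that start with it.
FRAC_RE = re.compile(r"\\frac(?![A-Za-z])")

def read_group(text, pos):
    """Return (content, end) for the balanced {...} group starting at pos.

    Assumes braces are balanced and not escaped.
    """
    # Allow optional whitespace before the opening brace.
    while pos < len(text) and text[pos].isspace():
        pos += 1
    if pos >= len(text) or text[pos] != "{":
        raise ValueError("expected '{' at position %d" % pos)
    level = 0  # current brace nesting level
    for i in range(pos, len(text)):
        if text[i] == "{":
            level += 1
        elif text[i] == "}":
            level -= 1
            if level == 0:
                return text[pos + 1:i], i + 1
    raise ValueError("unbalanced braces")

def frac_to_mathml(text):
    """Replace every \\frac{A}{B} in text, recursing into A and B."""
    out = []
    pos = 0
    while True:
        m = FRAC_RE.search(text, pos)
        if m is None:
            out.append(text[pos:])          # no more fractions
            break
        out.append(text[pos:m.start()])     # plain text before \frac
        num, end = read_group(text, m.end())
        den, end = read_group(text, end)
        out.append("<mfrac><mrow>%s</mrow><mrow>%s</mrow></mfrac>"
                   % (frac_to_mathml(num), frac_to_mathml(den)))
        pos = end
    return "".join(out)

print(frac_to_mathml(r"\frac{1+x}{1+\frac{1}{x}}"))
```

The regex only finds where a fraction starts; the brace counting in read_group does the part that re cannot do on its own, and the recursion handles fractions inside A or B.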