I am trying to extract LaTeX code from HTML code, which looks like this:
string = " your answer is wrong! Solution: based on \((\vec{n_E},\vec{g})= 0 \) and \(d(g,E)=0\) beeing ... "
I want to replace all LaTeX code with the output of a function that takes the LaTeX code as an argument. (Since there is a problem with finding the correct pattern, the function extract returns an empty string for the moment.)
I tried:
latex_end = "\)"
latex_start = "\("
string = re.sub(r'{}.*?{}'.format(latex_start, latex_end), extract, string)
Result:
your answer is wrong! Solution: based on \= 0 \) and \=0\) beeing ...
Expected:
your answer is wrong! Solution: based on and beeing ...
Any idea why it does not find the pattern? Is there a way to implement it?
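You should use a raw string for your definition of string, since \v is otherwise interpreted as an escape sequence (a vertical tab). The pattern also has to match the backslash itself: in a regex, \( means a literal parenthesis, so the delimiter \( has to be written as \\\(.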
import re
string = r" your answer is wrong! Solution: based on \((\vec{n_E},\vec{g})= 0 \) and \(d(g,E)=0\) beeing ... "
string = re.sub(r'\\\(.*?\\\)', '', string)
print(string)
Prints:
your answer is wrong! Solution: based on and beeing ...
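To see why the original attempt misbehaved, here is a small sketch comparing a normal and a raw string literal, and the unescaped and escaped patterns. The names plain, raw, text, broken and fixed are only illustrative, and text is a shortened stand-in for the string in the question.

```python
import re

plain = "\vec{g}"   # "\v" collapses into a single vertical-tab character
raw = r"\vec{g}"    # the raw literal keeps the backslash and the "v"

print(len(plain), len(raw))   # 6 7
print(repr(plain))            # '\x0bec{g}'

# Shortened stand-in for the string from the question
text = r"on \((a)= 0 \) and"

# Pattern built in the question: "\(" means a literal "(", so the
# backslash in the text is never consumed and the match stops at the
# first ")" it finds.
broken = re.sub(r'\(.*?\)', '', text)

# Escaping the backslash as well matches the full \( ... \) span.
fixed = re.sub(r'\\\(.*?\\\)', '', text)

print(broken)   # on \= 0 \) and
print(fixed)
```

The broken result has the same shape as the output shown in the question: a stray backslash is left behind and the closing \) survives.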
If you need to have variables for the start and end:
latex_end = r"\\\)"
latex_start = r"\\\("
string = re.sub(r'{}.*?{}'.format(latex_start, latex_end), '', string)
print(string)
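If you would rather keep the delimiters as they appear in the text, re.escape can build the escaped pattern for you, and re.sub accepts a function as the replacement. The following is a minimal sketch under that assumption; the body of extract is a made-up placeholder (it just wraps the LaTeX in square brackets), and the capture group is an addition so the function receives the code without the delimiters.

```python
import re

def extract(match):
    # Placeholder for illustration only: re.sub passes a match object,
    # and whatever string is returned replaces the whole match.
    latex = match.group(1)  # text between \( and \)
    return "[{}]".format(latex.strip())

latex_start = r"\("   # delimiters written exactly as they occur in the text
latex_end = r"\)"

# re.escape adds the backslashes needed to match the delimiters literally
pattern = re.compile(re.escape(latex_start) + r"(.*?)" + re.escape(latex_end))

string = r" your answer is wrong! Solution: based on \((\vec{n_E},\vec{g})= 0 \) and \(d(g,E)=0\) beeing ... "
result = pattern.sub(extract, string)
print(result)
```

Note that . does not match newlines by default, so LaTeX spanning several lines would need the re.DOTALL flag.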