I'm trying to test a search feature within an XHTML document. The search should support Arabic and English text. I'm new to Python and libxml2, so I'm having trouble figuring out how to do it.
I always get an empty result with Arabic text (English text works perfectly), even though online tools such as http://www.freeformatter.com/xpath-tester.html#ad-output return the exact result I need.
import libxml2
doc = libxml2.parseFile("content.xhtml")
ctxt = doc.xpathNewContext()
xPathQuery = "//*[contains(text(), 'تجربة')]"
res = ctxt.xpathEval(xPathQuery)
doc.freeDoc()
ctxt.xpathFreeContext()
Using a Unicode string didn't work either:
xPathQuery = u"//*[contains(text(), 'تجربة')]"
Nor did explicitly encoding the query as UTF-8:
xPathQuery = u"//*[contains(text(), 'تجربة')]"
res = ctxt.xpathEval(xPathQuery.encode('utf-8'))
It turned out to be an issue with the encoding of the source code file itself. I saved it as Unicode and it worked.
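For anyone hitting the same thing, here is a minimal sketch of how the source file encoding can be checked. It assumes the file is saved as UTF-8 and declares that with a PEP 263 coding line; the helper names `build_query` and `literal_is_intact` are made up for illustration and are not part of libxml2. The idea is to compare the Arabic word typed directly in the file against the same word written with `\u` escapes, which do not depend on how the file was saved. libxml2 works with UTF-8 internally, so a query whose bytes come from a different code page will not match the document text.

```python
# -*- coding: utf-8 -*-
# PEP 263 declaration: tells Python 2 how to decode non-ASCII literals in this
# file. It must match the encoding the editor actually used when saving.

# The search word written with escapes, so it is independent of file encoding.
WORD_ESCAPED = u"\u062a\u062c\u0631\u0628\u0629"

# The same word typed directly; only correct if the file is saved as declared.
WORD_LITERAL = u"تجربة"


def build_query(word):
    """Return the XPath from the question as UTF-8 bytes (hypothetical helper)."""
    query = u"//*[contains(text(), '%s')]" % word
    return query.encode("utf-8")


def literal_is_intact():
    """True when the typed literal decoded to the intended code points."""
    return WORD_LITERAL == WORD_ESCAPED


if __name__ == "__main__":
    print("literal intact: %s" % literal_is_intact())
    # A legacy Arabic code page produces different bytes than UTF-8, which is
    # one way a mis-saved source file can yield a query that matches nothing.
    print(repr(WORD_ESCAPED.encode("utf-8")))
    print(repr(WORD_ESCAPED.encode("cp1256")))
```

If `literal_is_intact()` returns False, the editor's save encoding and the coding declaration probably disagree. The bytes returned by `build_query` are what would be passed to `ctxt.xpathEval` in the Python 2 code above.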