Tags: selenium-webdriver, xpath, unicode

XPath: how do I find an element whose text contains Unicode characters?


I am trying to find, for example, something like this on an international website:

unicode: "\u202aDansk\u202c\u200f"
html: &#x202a;Dansk&#x202c;&#x200f;

This doesn't work: //*[contains(text(),'‪Dansk‬‏')]


Solution

  • XPath itself does not provide any way of escaping non-ASCII characters. But the host language in which XPath strings are written often does.

    When XPath expressions are written as string literals in a programming language such as Java, C#, Python, or JavaScript, you can generally use backslash escaping:

    xpath.evaluate("contains(., '\u202aDansk\u202c\u200f')")
    

    When XPath expressions are written as attributes in an XML-based language such as XSLT or XSD, you can use ampersand escaping, i.e. numeric character references:

    select="contains(., '&#x202a;Dansk&#x202c;&#x200f;')"
    

    In any other context, you'll need to check the specs for your host language environment.
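
As a rough illustration of both escaping styles, here is a minimal Python sketch that uses only the standard library. The sample markup and the names `DOC`, `TARGET`, and `XPATH` are made up for this example. Note that `xml.etree.ElementTree` supports only a subset of XPath and has no `contains()`, so the sketch uses an exact-match predicate instead (available since Python 3.7); the escaping works the same way either way.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup: numeric character references stand in for
# the invisible directional marks around "Dansk" (U+202A, U+202C, U+200F).
DOC = "<ul><li>English</li><li>&#x202a;Dansk&#x202c;&#x200f;</li></ul>"

# Python resolves the \u escapes while reading the string literal, so the
# XPath engine receives the real characters, not the backslash sequences.
TARGET = "\u202aDansk\u202c\u200f"

# ElementTree has no contains(); [.='...'] compares the full text content.
XPATH = f".//li[.='{TARGET}']"

root = ET.fromstring(DOC)
matches = root.findall(XPATH)

# Searching for the visible letters alone finds nothing, because the
# element's text also contains the three invisible characters.
naive_matches = root.findall(".//li[.='Dansk']")
```

With Selenium, the same kind of string literal should be usable in a full XPath 1.0 expression, for example `driver.find_element(By.XPATH, "//*[contains(text(), '\u202aDansk\u202c\u200f')]")`, assuming `driver` is an existing WebDriver instance and the page text really contains those three characters.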