
DOM4J Read Regex BUG


I am currently using dom4j to read a regex pattern from an XML file, but Pattern.matches always returns false, even though the string returned is exactly the same as what I typed in the XML.

The Sample XML:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <username-pattern><![CDATA[^\\w+$]]></username-pattern>
</root>

Example code (not working):

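import java.io.File;
import java.util.regex.Pattern;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.io.SAXReader;

public class Test
{
    public static void main( String[] args ) throws DocumentException
    {
        File inputFile = new File("C://test/test.xml");
        SAXReader saxBuilder = new SAXReader();
        Document document = saxBuilder.read(inputFile);
        Element rootElement = document.getRootElement();

        // Text read from the XML: the six characters ^\\w+$ (two backslashes)
        String username_pattern_notwork = rootElement.elementTextTrim("username-pattern");
        // Java string literal: the compiler reduces \\ to one backslash, giving ^\w+$
        String username_pattern_work = "^\\w+$";
        if(!(Pattern.matches(username_pattern_notwork, "ooo"))){
            System.out.println("===============Err=========");
        }else{
            System.out.println("OK");
        }

    }
}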

If you replace the variable username_pattern_notwork with username_pattern_work in Pattern.matches(username_pattern_notwork, "ooo"), it works as expected.


Solution

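  • After reading through the source code, I found that dom4j returns the element text exactly as it is written in the XML; it does not process backslashes at all. The doubled backslash is only needed inside a Java string literal, where the compiler turns "\\w" into \w. Read from the XML file, \\w stays as two backslashes, so the regex means "a literal backslash followed by one or more w characters" and "ooo" can never match.

    That means I must not double the backslash in the XML file, i.e.: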

    <?xml version="1.0" encoding="UTF-8"?>
    <root>
        <username-pattern><![CDATA[^\w+$]]></username-pattern>
    </root>
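
To see the difference without dom4j, here is a minimal sketch that simulates the two strings as they would arrive from the XML file. The class name BackslashDemo and the constants DOUBLED and SINGLE are made up for this illustration; only java.util.regex.Pattern is used.

```java
import java.util.regex.Pattern;

class BackslashDemo {
    // Hypothetical stand-in for the text read from the first XML: ^\\w+$
    // (six characters, two real backslashes; each one is escaped again here
    // because this is a Java string literal)
    static final String DOUBLED = "^\\\\w+$";

    // Hypothetical stand-in for the text read from the corrected XML: ^\w+$
    // (five characters, one real backslash)
    static final String SINGLE = "^\\w+$";

    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Regex ^\\w+$ wants a literal backslash followed by w's, so "ooo" fails
        System.out.println(matches(DOUBLED, "ooo"));    // false
        // Regex ^\w+$ wants one or more word characters, so "ooo" passes
        System.out.println(matches(SINGLE, "ooo"));     // true
        // The doubled pattern does match the four characters \www
        System.out.println(matches(DOUBLED, "\\www"));  // true
    }
}
```

The same rule applies to any pattern loaded from outside Java source (XML, properties read as raw text, a database, user input): write the regex as the regex engine should see it, and add the extra backslash only when typing it as a Java string literal.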