xml oracle-database jdbc resultset ojdbc

OracleWebRowSet writeXml method fails to escape special characters such as the ampersand (&)


OracleWebRowSet has a writeXml method that accepts a FileWriter and writes a result set out as an XML file.

When used, it fails to escape special characters such as the ampersand (&), so the generated XML file is not well-formed XML 1.0.
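
To make the failure concrete, here is a minimal sketch that uses only the JDK's built-in XML parser to check whether a string is well-formed. The class name WellFormedCheck and the sample columnValue element are made up for illustration; they are not taken from the Oracle driver's output.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of any library.
class WellFormedCheck {

    // Returns true if the JDK parser accepts the string as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Raised for well-formedness violations such as a bare '&'.
            return false;
        } catch (Exception e) {
            // Parser configuration or I/O problems are not expected here.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<columnValue>Tom &amp; Jerry</columnValue>"));
        System.out.println(isWellFormed("<columnValue>Tom & Jerry</columnValue>"));
    }
}
```

The first call should be accepted and the second rejected, because a bare & must be followed by an entity name or a character reference.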

The default WebRowSet implementation from rt.jar works fine, but I have specific reasons for using OracleWebRowSet.

I tried StringEscapeUtils.ESCAPE_XML10.translate(), but it cannot be installed as a rule for the writer to apply; it only translates a string that you pass to it directly.

For example:

OracleWebRowSet owrs = new OracleWebRowSet();
FileWriter fWriter = new FileWriter("file1.xml");
owrs.setEscapeProcessing(true);
//this is where resultset is converted to XML but not escaped properly
owrs.writeXml(fWriter);
fWriter.flush();

I'm in a bind. I could read the generated XML back as a text file, escape the contents, and write it to the file again, but that doesn't sound efficient when processing 700 XML files in one run.

Does anyone have a solution?
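
For reference, the read-escape-write-back idea could be sketched roughly as below. This is only an illustration under some assumptions: the class BareAmpersandFixer and its methods are hypothetical names, the file is assumed to be UTF-8 (FileWriter actually uses the platform default charset), and only stray ampersands are repaired, not stray < characters.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Pattern;

// Hypothetical post-processing helper, not part of any library.
class BareAmpersandFixer {

    // An '&' that does not start a predefined entity or a numeric character reference.
    private static final Pattern BARE_AMP =
            Pattern.compile("&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)");

    // Replaces every bare '&' with "&amp;" and leaves existing references alone.
    static String fix(String xml) {
        return BARE_AMP.matcher(xml).replaceAll("&amp;");
    }

    // Rewrites the file in place; UTF-8 is an assumption about how the file was written.
    static void fixFile(Path file) throws IOException {
        String xml = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        Files.write(file, fix(xml).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(fix("<columnValue>Tom & Jerry &amp; Co</columnValue>"));
    }
}
```

One limitation of this sketch: if the database text itself contains the literal characters &amp; as data, the pass cannot tell that apart from an already escaped ampersand and leaves it unchanged.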


Solution

  • I found a workaround for this, but I'm not sure it's the right way.

    UPDATED:

    Extend java.io.FileWriter and override the write(String) method:

    package customizations.java.io;
    import java.io.IOException;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    import org.apache.commons.lang3.StringEscapeUtils;
    public class XMLFileWriter extends java.io.FileWriter { 
        private Pattern html_prefix_pattern;
        private Pattern html_suffix_pattern;
        private Pattern common_tags_pattern1;
        private Pattern common_tags_pattern2;
        private Pattern common_tags_pattern3;
    
        public XMLFileWriter(String fileName) throws IOException {
            super(fileName);
            html_prefix_pattern = Pattern.compile("(?i)(.*)<[\\s]*html(.*)>(.*)", Pattern.DOTALL);
            html_suffix_pattern = Pattern.compile("(?i)(.*)<[\\s]*/html[\\s]*>(.*)", Pattern.DOTALL);
            common_tags_pattern1 = Pattern.compile("(.+)<[^/?](\"[^\"]*\"|'[^']*'|[^'\">])*[^?]>(.+)", Pattern.DOTALL);
            common_tags_pattern2 = Pattern.compile("^<[^/?](\"[^\"]*\"|'[^']*'|[^'\">])*[^?]>(.+)", Pattern.DOTALL);
            common_tags_pattern3 = Pattern.compile("(.+)<[^/?](\"[^\"]*\"|'[^']*'|[^'\">])*[^?]>$", Pattern.DOTALL);
        }
    
        @Override
        public void write(String str) throws IOException {
            Matcher html_prefixMatcher = html_prefix_pattern.matcher(str);
            Matcher html_suffixMatcher = html_suffix_pattern.matcher(str);
    
            boolean cdata_proc = false;
            // For CLOB data in an Oracle table, HTML tags in the content would violate the WebRowSet XML schema, so enclose that content in CDATA.
    
            if(html_prefixMatcher.find()) {
                str = "<![CDATA["+str;
                cdata_proc = true;
            }
    
            if(html_suffixMatcher.find()) {
                str = str+"]]>";
                cdata_proc = true;
            }
    
            if(!cdata_proc) {
                Matcher common_tagsMatcher1 = common_tags_pattern1.matcher(str);
                Matcher common_tagsMatcher2 = common_tags_pattern2.matcher(str);
                Matcher common_tagsMatcher3 = common_tags_pattern3.matcher(str);
                if(str.matches("(.*)&(.*)") || common_tagsMatcher1.find() || common_tagsMatcher2.find() || common_tagsMatcher3.find()) {
                    str = StringEscapeUtils.ESCAPE_XML10.translate(str);
                }
            }
            super.write(str);
        }
    }
    

    So whenever OracleWebRowSet calls the write() method, our code kicks in and checks whether the text needs to be escaped. The use of StringEscapeUtils has to be limited to the data; otherwise the XML tags themselves would also be escaped, resulting in a broken XML file.

    The modified calling code would look like this:

    OracleWebRowSet owrs = new OracleWebRowSet();
    XMLFileWriter fWriter = new XMLFileWriter("file1.xml");
    owrs.setEscapeProcessing(true);
    //the result set is converted to XML here; XMLFileWriter escapes the text as it is written
    owrs.writeXml(fWriter);
    fWriter.flush();
    

    Hope this helps anyone who stumbles across this issue. If this code can be improved, please post your suggestions.
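
To illustrate the point above about limiting the escaping, here is a small dependency-free sketch. The hand-written escapeXml stands in for StringEscapeUtils.ESCAPE_XML10 only for the five basic characters, and the names EscapeScopeDemo, escapeXml and element are hypothetical. It contrasts escaping a whole line of markup with escaping only the text content.

```java
// Hypothetical demo class, not part of any library.
class EscapeScopeDemo {

    // Escapes the five characters that have predefined XML entities.
    static String escapeXml(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    // Escapes only the text content and leaves the surrounding tags intact.
    static String element(String name, String text) {
        return "<" + name + ">" + escapeXml(text) + "</" + name + ">";
    }

    public static void main(String[] args) {
        // Escaping the whole line also destroys the tags:
        // prints &lt;columnValue&gt;A &amp; B&lt;/columnValue&gt;
        System.out.println(escapeXml("<columnValue>A & B</columnValue>"));
        // Escaping only the data keeps the markup:
        // prints <columnValue>A &amp; B</columnValue>
        System.out.println(element("columnValue", "A & B"));
    }
}
```

The XMLFileWriter workaround has to guess from each string whether it is markup or data, because the writer only sees strings and not the row set's structure; that is why it relies on pattern checks before escaping.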