java, xml, sax

Java SAX parser: element content is cut off when the text contains &amp;


I'm having trouble parsing XML files with SAX.

My ContentHandler code looks like this:

boolean rcontent = false;

@Override
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
    if (qName.equalsIgnoreCase("content")) {
        rcontent = true;
    }
}

@Override
public void characters(char ch[], int start, int length) throws SAXException {
    if (rcontent){
        System.out.println("content: " + new String(ch, start, length));
        rcontent = false;
    }
}

The XML file content is like this: [image not included]

But the output is:

I want to say

which is not complete.


Solution

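  • A SAX parser is allowed to deliver the text of a single element in several chunks, so it's likely that characters(...) is being called multiple times for the single <content> block, often with a split at an entity reference such as &amp;. Your handler prints only the first chunk and then resets the flag. Accumulate the chunks and print them in endElement(...) instead. Try something like: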

    StringBuilder builder;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        if (qName.equalsIgnoreCase("content")) {
            // Start collecting text for this <content> element
            builder = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (builder != null) {
            // May be called several times per element, so append each chunk
            builder.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        // endElement takes no Attributes parameter; print only when <content> closes
        if (builder != null && qName.equalsIgnoreCase("content")) {
            System.out.println("Content = " + builder);
            builder = null;
        }
    }
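
To try the accumulate-then-print approach end to end, here is a minimal self-contained sketch. The class name SaxContentDemo, the helper extractContent, and the sample XML string are made up for illustration (the original XML was only posted as an image), and it assumes the JDK's default, non-namespace-aware SAX parser.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxContentDemo {

    // Hypothetical helper: returns the complete text of the first <content> element.
    static String extractContent(String xml) throws Exception {
        StringBuilder result = new StringBuilder();

        DefaultHandler handler = new DefaultHandler() {
            private boolean inContent = false;
            private boolean done = false;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                if (!done && qName.equalsIgnoreCase("content")) {
                    inContent = true;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // The parser may call this several times for one element,
                // so every chunk is appended rather than printed directly.
                if (inContent) {
                    result.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                // The text is only known to be complete once the element closes.
                if (inContent && qName.equalsIgnoreCase("content")) {
                    inContent = false;
                    done = true;
                }
            }
        };

        SAXParserFactory.newInstance()
                .newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return result.toString();
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample input; the real file from the question is not available.
        String xml = "<root><content>I want to say &amp; more</content></root>";
        System.out.println("Content = " + extractContent(xml));
        // prints "Content = I want to say & more"
    }
}
```

However the parser splits the text around the entity, the accumulated result is the same, which is the reason for deferring the output to endElement(...).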