Java: parse XML file using SAX


I have this XML data to parse using SAX, but I cannot figure out how to get the data out of it. The most important part is the encoded data (fileContent), which I believe is Base64; I need to turn it into an Excel .xls file. So far I can only get the names of some fields/nodes (e.g. refNumber, userEmail), not their actual values. I have placed some code snippets below. Could anyone please help me?

Thanks!

class SomeClass {
...
private String currentElement;
...
public Result parseSerializedData(String serializedData) throws SAXException, TransformerConfigurationException, TransformerException
    {
        System.out.println("-------------------");
        System.out.println("Serialized: " + serializedData);
        Source src = new SAXSource(xr, new InputSource(new StringReader(serializedData)));
        Result res = new StreamResult(System.out);
        System.out.println("Res 1:" + res);

        TransformerFactory.newInstance().newTransformer().transform(src, res);
        System.out.println("transform 1:" + res);

        try {
         SAXParserFactory factory = SAXParserFactory.newInstance();
         SAXParser saxParser = factory.newSAXParser();
         saxParser.parse(serializedData, new MyHandler());
          } catch (Exception e) {
             e.printStackTrace();
          }

        System.out.println("The current element is: " + currentElement);
        System.out.println("-------------------");
        return res;
    }

    /*
    * Inner class for the Callback Handlers.
    */
   class MyHandler extends DefaultHandler {
      // Callback to handle element start tag
      @Override
      public void startElement(String uri, String localName, String qName,
            Attributes attributes) throws SAXException {
          System.out.println("qName: " + qName);
         currentElement = qName;
      }

      // Callback to handle element end tag
      @Override
      public void endElement(String uri, String localName, String qName)
            throws SAXException {
         currentElement = "";
      }

      // Callback to handle the character text data inside an element
      @Override
      public void characters(char[] chars, int start, int length) throws SAXException {
        BASE64Decoder decoder = new BASE64Decoder();
          try {
            byte[] decodedBytes = decoder.decodeBuffer(String.valueOf(chars));
              System.out.println("The current element2 is: " + currentElement);
              if (currentElement.equals("fileContent")) {
                System.out.println("\tfileContent:\t" + new String(decodedBytes, start, length));
             }
          } catch (IOException e) {
              e.printStackTrace();  //To change body of catch statement use File | Settings | File Templates.
          }

      }
   }
}

serializedData is the contents of that XML file.


Solution

  • Basically, the characters() method is where the values are read. In your case you were printing only for one tag: if (currentElement.equals("fileContent")). Follow the program below, which displays the values of all tags. Another thing to notice is that the parser may deliver the text of a single element in several chunks, i.e. several characters() calls (the chunk size depends on the parser; 2048 if I remember correctly), so the best approach is to append each chunk to a buffer and process the complete value later in the endElement() method, as shown in the example. Please note that I'm using DatatypeConverter for Base64 decoding; you could use your own decoder.

    import java.io.File;
    
    import javax.xml.bind.DatatypeConverter;
    import javax.xml.parsers.SAXParser;
    import javax.xml.parsers.SAXParserFactory;
    
    import org.xml.sax.Attributes;
    import org.xml.sax.SAXException;
    import org.xml.sax.helpers.DefaultHandler;
    
    public class SaxSample {
    
        public static void main(String[] argv) {
    
            try {
                SAXParserFactory factory = SAXParserFactory.newInstance();
                SAXParser saxParser = factory.newSAXParser();
    
                DefaultHandler handler = new DefaultHandler() {
    
                    // Accumulates the text of the element currently being read
                    StringBuilder value = new StringBuilder();
    
                    @Override
                    public void startElement(String uri, String localName,
                            String qName, Attributes attributes)
                            throws SAXException {
                        value = new StringBuilder();
                    }
    
                    @Override
                    public void endElement(String uri, String localName,
                            String qName) throws SAXException {
                        if ("fileContent".equalsIgnoreCase(qName)) {
                            // Decoded to a String only for display; write the byte[] itself if you need a binary file
                            String decodedValue = new String(DatatypeConverter.parseBase64Binary(value.toString()));
                            System.out.println(qName + " = " + decodedValue);
                        } else {
                            System.out.println(qName + " = " + value);
                        }
                        value = new StringBuilder();
                    }
    
                    @Override
                    public void characters(char ch[], int start, int length)
                            throws SAXException {
                        // May be called more than once per element, so append each chunk
                        value.append(ch, start, length);
                    }
    
                };
    
                saxParser.parse(new File("data.xml"), handler);
            } catch (Exception e) {
                e.printStackTrace();
            }
    
        }
    
    }
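
Since the goal in the question is an .xls file rather than console output, here is a minimal sketch of the same append-then-decode idea that writes the decoded bytes to disk. It rests on a few assumptions: the XML is held in a String (as serializedData is in the question), so it is wrapped in an InputSource because SAXParser.parse(String, ...) treats its String argument as a URI; java.util.Base64 (Java 8+) stands in for DatatypeConverter, which is no longer bundled with the JDK from Java 11 on; and the class name FileContentExtractor, the sample document, and the output.xls path are all made up for illustration. Only the fileContent and refNumber element names come from the question.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Base64;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class FileContentExtractor {

    /** Buffers element text and decodes it when the fileContent element ends. */
    static class FileContentHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        private byte[] fileBytes; // stays null if no fileContent element is seen

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attributes) {
            text.setLength(0); // start a fresh buffer for each element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times per element; only ch[start .. start+length) is valid.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            // Assumes an unprefixed element name; a prefixed one (e.g. in SOAP) would need a different check.
            if ("fileContent".equals(qName)) {
                // The MIME decoder skips line breaks and other non-Base64 characters.
                fileBytes = Base64.getMimeDecoder().decode(text.toString());
            }
            text.setLength(0);
        }

        byte[] getFileBytes() {
            return fileBytes;
        }
    }

    /** Parses XML held in a String (not a file name or URI) and returns the decoded bytes. */
    static byte[] extractFileContent(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        FileContentHandler handler = new FileContentHandler();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.getFileBytes();
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document; a real payload would be the Base64 of an .xls file.
        String payload = Base64.getEncoder().encodeToString("hello".getBytes(StandardCharsets.UTF_8));
        String xml = "<response><refNumber>123</refNumber><fileContent>"
                + payload + "</fileContent></response>";

        byte[] bytes = extractFileContent(xml);
        Path out = Paths.get("output.xls"); // hypothetical output path
        Files.write(out, bytes);            // write the raw bytes; do not go through a String
        System.out.println("Wrote " + bytes.length + " bytes to " + out);
    }
}
```

The decoded bytes are written as-is because an .xls file is binary; converting them to a String first, as the printing examples do, would corrupt the file.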