Partial Unmarshalling of an XML using JAXB to skip some xmlElement


I want to unmarshal an XML file into a Java object using JAXB. The file is very large and contains some nodes that I want to skip in certain cases to improve performance, because the client Java program cannot edit those elements.

A sample XML is as follows:

<Example id="10" date="1970-01-01" version="1.0"> 
    <Properties>...</Properties>
    <Summary>...</Summary>
    <RawData>
        <Document id="1">...</Document>
        <Document id="2">...</Document>
        <Document id="3">...</Document>
        ------
        ------
    </RawData>
    <Location></Location>
    <Title></Title>
    ----- // more elements
</Example>

I have two use cases:

  • Unmarshal into an Example object that contains Properties, Summary, RawData, etc., without skipping RawData. (I have already done this part.)
  • Unmarshal into an Example object that excludes RawData. The elements nested in RawData are very large, so I do not want to read them in this use case.

Now I want to unmarshal the XML so that RawData is skipped. I have tried the technique provided at this link.

However, that technique also skips all elements that come after RawData.


Solution

  • I fixed the issue by wrapping the XMLEventReader in the following class:

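    import javax.xml.namespace.QName;
    import javax.xml.stream.XMLEventReader;
    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamException;
    import javax.xml.stream.events.EndElement;
    import javax.xml.stream.events.StartElement;
    import javax.xml.stream.events.XMLEvent;
    
    public class PartialXmlEventReader implements XMLEventReader {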
    
    private final XMLEventReader reader;
    private final QName qName;
    private boolean skip = false;
    
    public PartialXmlEventReader(final XMLEventReader reader, final QName element) {
        this.reader = reader;
        this.qName = element;
    }
    
    @Override
    public String getElementText() throws XMLStreamException {
        return reader.getElementText();
    }
    
    @Override
    public Object getProperty(final String name) throws IllegalArgumentException {
        return reader.getProperty(name);
    }
    
    @Override
    public boolean hasNext() {
        return reader.hasNext();
    }
    
    @Override
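    public XMLEvent nextEvent() throws XMLStreamException {
        // Discard every event that belongs to the skipped element.
        // peek() returns null at the end of the input, so check for that first.
        XMLEvent next;
        while ((next = reader.peek()) != null && isEof(next)) {
            reader.nextEvent();
        }
    
        return reader.nextEvent();
    }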
    
    @Override
    public XMLEvent nextTag() throws XMLStreamException {
        return reader.nextTag();
    }
    
    @Override
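    public XMLEvent peek() throws XMLStreamException {
        // Discard skipped events first so that peek() never exposes an event
        // that nextEvent() would not return.
        XMLEvent next;
        while ((next = reader.peek()) != null && isEof(next)) {
            reader.nextEvent();
        }
        return next;
    }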
    
    @Override
    public Object next() {
        return reader.next();
    }
    
    @Override
    public void remove() {
        reader.remove();
    }
    
    @Override
    public void close() throws XMLStreamException {
        reader.close();
    }
    
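    // Returns true if the event belongs to the skipped element: its start tag,
    // its content, or its end tag. Despite the name, this is not an end-of-file check.
    // Note: an element of the same name nested inside the skipped element
    // would end the skip at the inner end tag.
    private boolean isEof(final XMLEvent e) {
        // Capture the flag first so that the closing tag itself is still skipped.
        boolean returnValue = skip;
        switch (e.getEventType()) {
        case XMLStreamConstants.START_ELEMENT:
            final StartElement se = (StartElement) e;
            if (se.getName().equals(qName)) {
                // Entering the skipped element: skip this tag and everything after it.
                skip = true;
                returnValue = true;
            }
            break;
        case XMLStreamConstants.END_ELEMENT:
            final EndElement ee = (EndElement) e;
            if (ee.getName().equals(qName)) {
                // Leaving the skipped element: later events are delivered again.
                skip = false;
            }
            break;
        }
        return returnValue;
    }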
    }

    When unmarshalling, pass this event reader to the unmarshal method:

    final JAXBContext context = JAXBContext.newInstance(classes);
    final Unmarshaller um = context.createUnmarshaller();
    Reader reader = null;
    try {
        reader = new BufferedReader(new FileReader(xmlFile));
        // Name of the element to skip; the sample XML uses no namespace.
        final QName qName = new QName("RawData");
        final XMLInputFactory xif = XMLInputFactory.newInstance();
        final XMLEventReader xmlEventReader = xif.createXMLEventReader(reader);
        final Example example =
                (Example) um.unmarshal(new PartialXmlEventReader(xmlEventReader, qName));
    } finally {
        // IOUtils is from Apache Commons IO.
        IOUtils.closeQuietly(reader);
    }
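
    As a rough alternative sketch (not the code from the answer above), the same skipping idea can probably be expressed with the filter hook that StAX already provides, XMLInputFactory.createFilteredReader together with an EventFilter. The class name SkipElementDemo, the nested SkipElementFilter, and the helper visibleElements are hypothetical names made up for this illustration. The filter counts depth, so a nested element with the same name should not end the skip early. The demo only lists the start tags that survive the filter; it does not involve JAXB, and passing the filtered reader to Unmarshaller.unmarshal is an untested assumption here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;
import javax.xml.stream.EventFilter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

public class SkipElementDemo {

    /** Rejects every event from the start tag of the target element through its matching end tag. */
    static final class SkipElementFilter implements EventFilter {
        private final QName target;
        private int depth = 0; // greater than 0 while inside the skipped element

        SkipElementFilter(QName target) {
            this.target = target;
        }

        @Override
        public boolean accept(XMLEvent event) {
            if (event.isStartElement() && event.asStartElement().getName().equals(target)) {
                depth++;
                return false; // drop the opening tag
            }
            if (event.isEndElement() && event.asEndElement().getName().equals(target)) {
                depth--;
                return false; // drop the closing tag
            }
            return depth == 0; // drop everything in between
        }
    }

    /** Returns the local names of the start tags that survive the filter, in document order. */
    public static List<String> visibleElements(String xml, String skippedName)
            throws XMLStreamException {
        XMLInputFactory xif = XMLInputFactory.newInstance();
        XMLEventReader raw = xif.createXMLEventReader(new StringReader(xml));
        XMLEventReader filtered =
                xif.createFilteredReader(raw, new SkipElementFilter(new QName(skippedName)));
        List<String> names = new ArrayList<>();
        while (filtered.hasNext()) {
            XMLEvent e = filtered.nextEvent();
            if (e.isStartElement()) {
                names.add(e.asStartElement().getName().getLocalPart());
            }
        }
        filtered.close();
        return names;
    }

    public static void main(String[] args) throws XMLStreamException {
        // A shortened version of the sample XML from the question.
        String xml = "<Example id=\"10\"><Properties>p</Properties>"
                + "<RawData><Document id=\"1\">d</Document></RawData>"
                + "<Location/><Title/></Example>";
        System.out.println(visibleElements(xml, "RawData"));
    }
}
```

    The filter only changes state on events that it rejects, so it should stay consistent even if the reader asks about the same accepted event more than once (for example, once from hasNext and once from nextEvent).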