
Woodstox: skip part of an XML file while parsing


Java: 1.6
Woodstox: 4.1.4

I want to skip part of an XML file while parsing it. Consider this simple XML:

<family>
    <mom>
        <data height="160"/>
    </mom>
    <dad>
        <data height="175"/>
    </dad>
</family>

I just want to skip the dad element, so calling the skipElement() method as shown below looks like a good idea:

FileInputStream fis = ...;
XMLStreamReader2 xmlsr = (XMLStreamReader2) xmlif.createXMLStreamReader(fis);

String currentElementName = null;
while(xmlsr.hasNext()){
            
    int eventType = xmlsr.next();
                        
    switch(eventType){
            
        case (XMLEvent2.START_ELEMENT):
            currentElementName = xmlsr.getName().toString();
                    
            if("dad".equals(currentElementName)){
                logger.info("isStartElement: " + xmlsr.isStartElement());
                logger.info("Element BEGIN: " + currentElementName);
                xmlsr.skipElement();
            }

                    ...
    }
}

The code finds the start of the dad element and skips it. But not so fast: an exception is thrown. This is the output:

isStartElement: true
Element BEGIN: dad
Exception in thread "main" java.lang.IllegalStateException: Current state not START_ELEMENT

That is not what I expected. It is surprising because skipElement() is called while the reader is in the START_ELEMENT state. What is going on?


Solution

  • I've found the reason why I was getting the IllegalStateException. flup's answer was very useful. Thanks a lot.
    Blaise's answer is worth reading too.

    Now to the heart of the matter. The problem was not the skipElement() method itself. It was caused by the methods used to read attributes. My question hid some code behind three dots (...), so let's look at what was there:

    switch(eventType){

        case (XMLEvent2.START_ELEMENT):
            currentElementName = xmlsr.getName().toString();
            logger.info("currentElementName: " + currentElementName);

            if("dad".equals(currentElementName)){
                logger.info("isStartElement: " + xmlsr.isStartElement());
                logger.info("Element BEGIN: " + currentElementName);
                xmlsr.skipElement();
            }
            // no break here: execution falls through to the ATTRIBUTE case

        case (XMLEvent2.ATTRIBUTE):
            int attributeCount = xmlsr.getAttributeCount();
            ...
            break;

    }
    

    The important thing: there is no break statement for START_ELEMENT, so every time a START_ELEMENT event occurs, the code for the ATTRIBUTE event is executed as well. That looks fine according to the Javadoc, because getAttributeCount(), getAttributeValue() and related methods may be called in both the START_ELEMENT and ATTRIBUTE states.

    But after skipElement() returns, the reader's current event is no longer START_ELEMENT. It is the END_ELEMENT of the skipped element. Calling getAttributeCount() is not allowed in that state, and this call is what throws the IllegalStateException.
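
The state change described above can be reproduced with a small sketch. It uses only the JDK's javax.xml.stream API, so skipCurrentElement() is a hypothetical stand-in that mimics what Woodstox's skipElement() does and is not the library's own code; the class name, helper names, and the inlined XML string are likewise assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class StateAfterSkipDemo {

    // The sample document from the question, without whitespace.
    static final String FAMILY_XML =
        "<family><mom><data height=\"160\"/></mom>"
        + "<dad><data height=\"175\"/></dad></family>";

    // Hypothetical stand-in for skipElement(), written against plain StAX:
    // consume events until the END_ELEMENT that closes the current START_ELEMENT.
    static void skipCurrentElement(XMLStreamReader reader) throws XMLStreamException {
        if (reader.getEventType() != XMLStreamConstants.START_ELEMENT) {
            throw new IllegalStateException("Current state not START_ELEMENT");
        }
        int depth = 1;
        while (depth > 0) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
    }

    // Advance to <dad>, skip it, and return the reader exactly where the skip left it.
    static XMLStreamReader readerAfterSkippingDad(String xml) {
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "dad".equals(reader.getLocalName())) {
                    skipCurrentElement(reader);
                    return reader;
                }
            }
            throw new IllegalArgumentException("no <dad> element in input");
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("malformed XML", e);
        }
    }

    // True if the reader rejects attribute access in its current state.
    static boolean attributeAccessRejected(XMLStreamReader reader) {
        try {
            reader.getAttributeCount();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        XMLStreamReader reader = readerAfterSkippingDad(FAMILY_XML);
        System.out.println("on END_ELEMENT: "
            + (reader.getEventType() == XMLStreamConstants.END_ELEMENT));
        System.out.println("attribute access rejected: " + attributeAccessRejected(reader));
    }
}
```

The reader is left on the closing tag of the skipped element, which is why any attribute lookup placed after the skip fails.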

    The simplest way to avoid the exception is to add a break statement right after the skipElement() call. The attribute-reading code is then not executed for the skipped element, so no exception is thrown.

            if("dad".equals(currentElementName)){
                logger.info("isStartElement: " + xmlsr.isStartElement());
                logger.info("Element BEGIN: " + currentElementName);
                xmlsr.skipElement();
                break;                  //the cure for IllegalStateException
            }
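
For a version that can be run without Woodstox on the classpath, here is a minimal sketch of the same loop with the break in place. It relies only on the JDK's javax.xml.stream API; skipCurrentElement() is again a hypothetical stand-in for skipElement(), and the class name, the heightsOutsideDad() method, and the choice to collect height attributes are assumptions added for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class SkipDadDemo {

    // The sample document from the question, without whitespace.
    static final String FAMILY_XML =
        "<family><mom><data height=\"160\"/></mom>"
        + "<dad><data height=\"175\"/></dad></family>";

    // Hypothetical stand-in for skipElement(): read on until the matching END_ELEMENT.
    static void skipCurrentElement(XMLStreamReader reader) throws XMLStreamException {
        if (reader.getEventType() != XMLStreamConstants.START_ELEMENT) {
            throw new IllegalStateException("Current state not START_ELEMENT");
        }
        int depth = 1;
        while (depth > 0) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
    }

    // Collect every height attribute that is not inside a <dad> element.
    static List<String> heightsOutsideDad(String xml) {
        List<String> heights = new ArrayList<String>();
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int eventType = reader.next();
                switch (eventType) {
                    case XMLStreamConstants.START_ELEMENT:
                        if ("dad".equals(reader.getLocalName())) {
                            skipCurrentElement(reader);
                            break; // reader is on END_ELEMENT now: do not read attributes
                        }
                        // intentional fall-through: still on START_ELEMENT here
                    case XMLStreamConstants.ATTRIBUTE:
                        for (int i = 0; i < reader.getAttributeCount(); i++) {
                            if ("height".equals(reader.getAttributeLocalName(i))) {
                                heights.add(reader.getAttributeValue(i));
                            }
                        }
                        break;
                    default:
                        break;
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("malformed XML", e);
        }
        return heights;
    }

    public static void main(String[] args) {
        System.out.println(heightsOutsideDad(FAMILY_XML));
    }
}
```

Removing the break inside the if block brings back the IllegalStateException, because the fall-through then reaches getAttributeCount() while the reader sits on the closing dad tag.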
    

    I'm sorry that I gave you no chance to answer my original question, because too much of the code was hidden.