
StAX parser: duplicate node names and reading specific comments


I'm trying to parse an XML file with the StAX parser, but I face two problems. First, two nodes have the same name. Second, I need to read the exact comment that precedes the values.

<database>
<!-- 2015-03-10 01:29:00 EET / 130 --> <row><v> 2.74 </v><v> 1.63 </v></row>
<!-- 2015-03-10 01:30:00 EET / 170 --> <row><v> 5.33 </v><v> 1.68 </v></row>
<!-- 2015-03-10 01:31:00 EET / 180 --> <row><v> 7.62 </v><v> 1.83 </v></row>
</database>

I want to collect the data like this:

Date: 2015-03-10 01:29:00

V1: 2.74

V2: 1.63

I was using a DOM parser before, and it was easy to deal with duplicate node names and comments. Unfortunately I have to use StAX now, and I don't know how to solve these problems :(


Solution

    1. The first issue: two nodes have the same name
    <v> 2.74 </v><v> 1.63 </v>
    

    This is not a problem with StAX. If you follow the events, you receive them in document order (the names below are the XMLStreamConstants values):

    • START_ELEMENT ( v )
    • CHARACTERS ( 2.74 )
    • END_ELEMENT ( v )
    • START_ELEMENT ( v )
    • CHARACTERS ( 1.63 )
    • END_ELEMENT ( v )

    So it is up to you to keep a minimal amount of context in your code, for example a counter that is reset at each <row>, to know whether you are starting the first or the second <v> element.
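
    One way to keep that context, as a minimal sketch rather than the only approach: collect the <v> values of each <row> into a list, so the position in the list tells the first <v> from the second. The class name RowValueReader and the method readRows are hypothetical names chosen for this illustration, and the XML is passed as a String only to keep the example self-contained.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; not part of the StAX API.
class RowValueReader {

    // Returns one list of values per <row>. Index 0 is the first <v>,
    // index 1 is the second <v>, and so on.
    static List<List<String>> readRows(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<List<String>> rows = new ArrayList<>();
        List<String> currentRow = null; // context: the row being read, if any
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                String name = reader.getLocalName();
                if ("row".equals(name)) {
                    currentRow = new ArrayList<>(); // reset context at each <row>
                } else if ("v".equals(name) && currentRow != null) {
                    // getElementText() reads the text up to the matching </v>
                    currentRow.add(reader.getElementText().trim());
                }
            } else if (event == XMLStreamConstants.END_ELEMENT
                    && "row".equals(reader.getLocalName())) {
                rows.add(currentRow);
                currentRow = null;
            }
        }
        reader.close();
        return rows;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<database>"
                + "<!-- 2015-03-10 01:29:00 EET / 130 --> <row><v> 2.74 </v><v> 1.63 </v></row>"
                + "</database>";
        for (List<String> row : readRows(xml)) {
            for (int i = 0; i < row.size(); i++) {
                System.out.println("V" + (i + 1) + ": " + row.get(i));
            }
        }
    }
}
```

    The comment in the sample input is simply skipped here, because the loop only reacts to START_ELEMENT and END_ELEMENT events.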

    2. The second issue: reading the comments

    This is not a problem either. StAX parsing triggers events for comments as well, so you can get the comment as a String through the API and extract the expected value yourself, for instance:

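    import javax.xml.stream.XMLInputFactory;
    import javax.xml.stream.XMLStreamConstants;
    import javax.xml.stream.XMLStreamReader;

    // inputStream is a java.io.InputStream opened on your XML file
    XMLInputFactory inputFactory = XMLInputFactory.newInstance();
    XMLStreamReader streamReader = inputFactory.createXMLStreamReader(inputStream);
    while (streamReader.hasNext()) {
        int event = streamReader.next();
        if (event == XMLStreamConstants.COMMENT) {
            // getText() returns the comment body without the <!-- and --> delimiters
            String aDateStringVal = streamReader.getText();
            // + extract your date value from the comment string
        }
    }
    streamReader.close();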
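
    The extraction step left open in the loop above could look like the following sketch. It assumes every comment starts with a timestamp in the form yyyy-MM-dd HH:mm:ss, as in the sample, and it ignores the trailing "EET / 130" part. CommentDateReader, extractTimestamp and readRowTimestamps are hypothetical names used only for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration; not part of the StAX API.
class CommentDateReader {

    // Assumption: the timestamp format is "yyyy-MM-dd HH:mm:ss", as in the sample.
    private static final Pattern TIMESTAMP =
            Pattern.compile("\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}");

    // Returns the first timestamp found in the comment text, or null if none.
    static String extractTimestamp(String comment) {
        Matcher matcher = TIMESTAMP.matcher(comment);
        return matcher.find() ? matcher.group() : null;
    }

    // Returns, for each <row>, the timestamp of the comment that precedes it
    // (null when a row has no preceding comment with a timestamp).
    static List<String> readRowTimestamps(String xml) throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<String> timestamps = new ArrayList<>();
        String pending = null; // timestamp waiting for its <row>
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.COMMENT) {
                pending = extractTimestamp(reader.getText());
            } else if (event == XMLStreamConstants.START_ELEMENT
                    && "row".equals(reader.getLocalName())) {
                timestamps.add(pending);
                pending = null;
            }
        }
        reader.close();
        return timestamps;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<database>"
                + "<!-- 2015-03-10 01:29:00 EET / 130 --> <row><v> 2.74 </v><v> 1.63 </v></row>"
                + "</database>";
        for (String timestamp : readRowTimestamps(xml)) {
            System.out.println("Date: " + timestamp);
        }
    }
}
```

    Remembering the last comment until the next <row> starts is what ties each date to its values. If you need a real date object rather than a String, the extracted text can be handed to a date parser of your choice.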