Tags: java, xml, xml-parsing, stax, xmlstreamreader

StAX XML all content between two required tags


I have started learning StAX with XMLStreamReader and ran into a problem. How can I get ALL the content between two tags as text? I know the name of the tag I need; when I find it, I must advance to its closing tag and append everything found in between to a string. For example, given something like

<rootTag>
...    
    <someTag>
        Some text content and other tags here…
    </someTag >
    <tagINeed>
        <someinternalTag1>
            <someinternalTag11>
                Some text content..
            </someinternalTag11>
            ...
        </someinternalTag1>
        <someinternalTag2>
            Something here
        </someinternalTag2>
    </tagINeed>
...
    <somethingAnother>
...
    </somethingAnother >
...
</rootTag>    

So, I need to get my string as

        <someinternalTag1>
            <someinternalTag11>
                Some text content..
            </someinternalTag11>
            ...
        </someinternalTag1>
        <someinternalTag2>
            Something here
        </someinternalTag2>

How can I get it? Maybe I should find the start and end offsets of the needed block in the source XML and take a substring after parsing?


Solution

  • Try

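        import java.io.FileInputStream;
        import java.io.StringWriter;
        import javax.xml.stream.*;
        import javax.xml.stream.events.*;

        StringWriter sw = new StringWriter();
        XMLOutputFactory of = XMLOutputFactory.newInstance();
        XMLEventWriter xw = null;
        XMLInputFactory f = XMLInputFactory.newInstance();
        XMLEventReader xr = f.createXMLEventReader(new FileInputStream("test.xml"));
        while (xr.hasNext()) {
            XMLEvent e = xr.nextEvent();
            if (e.isStartElement()
                    && ((StartElement) e).getName().getLocalPart().equals("tagINeed")) {
                // Opening tag found: start copying from the next event on
                xw = of.createXMLEventWriter(sw);
            } else if (e.isEndElement()
                    && ((EndElement) e).getName().getLocalPart().equals("tagINeed")) {
                // Closing tag found: stop reading
                break;
            } else if (xw != null) {
                // Inside tagINeed: copy the event to the output
                xw.add(e);
            }
        }
        // xw stays null if the document contains no tagINeed element
        if (xw != null) {
            xw.close();
        }
        System.out.println(sw);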
    

    prints

        <someinternalTag1>
            <someinternalTag11>
                Some text content..
            </someinternalTag11>
        </someinternalTag1>
        <someinternalTag2>
            Something here
        </someinternalTag2>
    

    Update:

    If you need the XML string with the enclosing <tagINeed> element too, the branches inside the loop can be written like this:

            if (e.isStartElement() &&
                    ((StartElement) e).getName().getLocalPart().equals("tagINeed")){
                xw = of.createXMLEventWriter(sw);
                xw.add(e);
            } else if (e.isEndElement() &&
                    ((EndElement) e).getName().getLocalPart().equals("tagINeed")){
                xw.add(e);
                break;
            } else if (xw != null) {
                xw.add(e);
            }
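
    The loop above stops at the first closing tagINeed it sees, so it would cut the output short if a tagINeed element were nested inside another one. A minimal sketch of one way to handle that, assuming the XML is already available as a String: keep a depth counter and stop only when the matching end tag is reached. The class name InnerXmlExtractor and the method innerXml are hypothetical names chosen for illustration, and returning null for a missing tag is just one possible convention. It also relies on the event writer accepting a fragment with more than one top-level element, as the code above does; stricter StAX implementations may reject that.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper, not part of StAX itself.
public class InnerXmlExtractor {

    /**
     * Returns the markup between the first start tag named tagName and its
     * matching end tag, or null if no such element exists (assumed convention).
     */
    public static String innerXml(String xml, String tagName) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(xml));
        StringWriter out = new StringWriter();
        XMLEventWriter writer = null;
        int depth = 0; // how many tagName elements are currently open
        try {
            while (reader.hasNext()) {
                XMLEvent e = reader.nextEvent();
                if (e.isStartElement()
                        && e.asStartElement().getName().getLocalPart().equals(tagName)) {
                    if (depth++ == 0) {
                        // Outermost start tag: begin copying, but skip the tag itself
                        writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
                        continue;
                    }
                } else if (e.isEndElement()
                        && e.asEndElement().getName().getLocalPart().equals(tagName)) {
                    if (--depth == 0) {
                        break; // end tag that matches the outermost start tag
                    }
                }
                if (writer != null) {
                    writer.add(e); // inside the target element: copy the event
                }
            }
            if (writer == null) {
                return null; // the tag never appeared
            }
            writer.close();
            return out.toString();
        } finally {
            reader.close();
        }
    }
}
```

    Calling InnerXmlExtractor.innerXml(xml, "tagINeed") on the document from the question should then return the content of tagINeed without the enclosing tags; nested elements with the same name are copied through as ordinary content.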